Phylogenetic Approach Reveals That Virus Genotype Largely Determines HIV Set-Point Viral Load

HIV virulence, i.e. the time of progression to AIDS, varies greatly among patients. As for other rapidly evolving pathogens of humans, it is difficult to know if this variance is controlled by the genotype of the host or that of the virus because the transmission chain is usually unknown. We apply the phylogenetic comparative approach (PCA) to estimate the heritability of a trait from one infection to the next, which indicates the control of the virus genotype over this trait. The idea is to use viral RNA sequences obtained from patients infected by HIV-1 subtype B to build a phylogeny, which approximately reflects the transmission chain. Heritability is measured statistically as the propensity for patients close in the phylogeny to exhibit similar infection trait values. The approach reveals that up to half of the variance in set-point viral load, a trait associated with virulence, can be heritable. Our estimate is significant and robust to noise in the phylogeny. We also check for the consistency of our approach by showing that a trait related to drug resistance is almost entirely heritable. Finally, we show the importance of taking into account the transmission chain when estimating correlations between infection traits. The fact that HIV virulence is, at least partially, heritable from one infection to the next has clinical and epidemiological implications. The difference between earlier studies and ours comes from the quality of our dataset and from the power of the PCA, which can be applied to large datasets and accounts for within-host evolution. The PCA opens new perspectives for approaches linking clinical data and evolutionary biology because it can be extended to study other traits or other infectious diseases.


A Significance of K values
The significance of the values of the estimator for phylogenetic signal K is assessed through a randomisation procedure [1] (described in Text S2 'Supplementary Methods'). The p-values of this test are shown in Table S1. A non-significant p-value means that the signal found on the tree does not differ from what one would expect on any tree with a similar shape. Note that there is no p-value for the other estimator used in the main text (λ) because it is determined through a maximum likelihood approach. This is why we always calculate λ over many trees to increase the reliability of the measure (see Supplementary Methods). Table S1. Phylogenetic signal (K) in the 4 datasets for three different traits. Here, as in all the tables, the significance code for the p-values is ' * * * ' ≤ 0.001, ' * * ' ≤ 0.01, ' * ' ≤ 0.05, a is n.s. but ≤ 0.1 and aa is n.s. and > 0. 1

B Distribution of λ values
In order to compensate for the variability in λ, we estimate it on 160 of the posterior trees (i.e. every 5 th tree in the posterior) obtained with the Bayesian estimation of the phylogeny (see Supplementary  Methods). Here, we show the distribution of these values of λ.
The tests for several of the trees failed to converge (i.e. λ = 0). However, we have reasons to think that this might not only be due to the shape of the phylogeny. First, because K estimated on the same trees revealed a significant signal. Second, because removing randomly 10% of the tips of a tree for which λ = 0 could lead to a tree with a λ > 0. This suggests that the problem could originate from the algorithm of the software we used. This is also why we only present the median value of λ estimated on many trees. C Robustness of tree to noise in the phylogeny C.1 Summary figure A potential concern is that the heritability values we find might be limited because, as for most HIV studies [2,3], our phylogenies have a high level of uncertainty. We have no straightforward way to improve our phylogenies but it is quite simple to make them worse. We do this for the phylogeny obtained with the MSM strict dataset in two ways. First, we analyse the bootstrap trees of the maximum likelihood approach. Second, we introduce errors in the phylogeny by swapping a given number of tips randomly. The results are summarised in Figure S2 and further described below.
The rationale for these tests is that one might be concerned that we would get an even higher signal (especially for spVL) if our phylogeny was better. We show that making it worse does not affect the signal suggesting that the method is robust to noise in the phylogeny.

C.2 Heritability on bootstrapped trees
We analysed the bootstrapped trees obtained with PhyML for the MSM strict dataset. All these trees are less accurate than the consensus tree because the bootstrap ignores the information of approximately 1/e of the nucleotides. K was calculated for 20 of these trees while λ was calculated for 100 trees. Table  S2 shows the mean, median and standard deviation for both estimators. Note that for two of the bootstrapped trees, we found unrealistically high signal for spVL (K of 1.42 and 2.52). The value of λ estimated on these same trees was also high but more realistic (0.88 and 0.76). Finally, the value of K * , which is another estimator introduced by Blomberg et al. [1] was of 0.66 and 0.61. Our interpretation is that bootstrapping generated these two trees with artificially high signal. We actually observed a similar effect when reducing the dataset to very low numbers (n = 45): most of the trees then yield non-significant signal but some trees can by chance lead to significant and very high signal. This underlines the importance of assessing phylogenetic signal with different approaches and testing for the robustness of the results.
We used the non-parametric Wilcoxon signed-rank test to check if the values we observe significantly differ from the values estimated on the best maximum likelihood tree (presented in Table 1 in the main text). We find that the bootstrapping has no effect on the estimates for spVL but the p-values of the randomisation tests tend to increase (Table S3). For prAZT, however, the value of K is tends to be significantly higher in the bootstrapped trees.

C.3 Adding errors to the tree
We swapped 2, 7 or 14 tips at random out of the 134 tips of the best MSM strict phylogeny obtained with PhyML. This corresponds to swapping approximately 3%, 10% and 20% of the tips. There were 20 repetitions for each procedure.
First, we found that adding errors has no effect on K for dsCD4. In other words, errors in the tree do not lead to false positive signal. Second, we found that trees that tend to lead to non-significant signal are associated with a value of K that deviates from the mean (either when K is unusually high or low). This could explain the robustness of the estimator: making the phylogeny worse thus affects the variance in the signal we observe but the mean value itself remains unaffected.
We apply the Wilcoxon test on the pooled error procedures (2, 7 or 14 branches swapped). Results are shown in table S4. We find that for both traits the value of K is not affected by the errors but that the p-value increases. Table S3. Wilcoxon test for an effect of bootstrap trees of the 'MSM strict' dataset. V is the value of the test statistic. In all four cases, the alternative hypothesis is that the mean of the BS trees differs from the original mean. There are 20 replicates.
Test true mean V p-value K for spVL 1 0.59 34 0.083 p-value for K spVL 0.0 78 0.0024 * * K for prAZT 2 0.91 51 0.014 * p-value for K prAZT 0.043 139 0.21 1 3 trees were excluded from this analysis: one because its p-value was 0.142 and two because they had unrealistically high values for K (1.42 and 2.52). 2 9 of the trees were excluded because they had a pvalue greater than 0.05. Table S4. Wilcoxon test for an effect of errors in the 'MSM strict' dataset. There are 60 repetitions for this procedure. V is the value of the test statistic. In all four cases, the alternative hypothesis is that the mean after sampling differs from the original mean.
Test true mean V p-value K for spVL 1 0.59 504.5 0.094 p-value for K spVL 0.0 1081 <0.001 * * * K for prAZT 2 0.91 4 0.854 p-value for K prAZT 0.043 1267 0.0015 * * 1 6 trees with p-values greater than 0.05 were excluded from the analysis. 2 18 trees with p-values greater than 0.05 were excluded from the analysis.
In addition to the Wilcoxon test, we also applied a generalised linear model (GLM) to see what the effect of increasing the amount of error is on our estimates. The GLM shows that increasing the number of errors does not affect the intensity of the average signal but that it significantly increases the p-values of the randomisation tests (Table S5).

D Signal in trees built with third codon positions only
We derived and analysed trees using third codon positions only to minimise potential convergent evolution due to drug resistance (Tables S6 and S7). This led to results qualitatively similar to that described in Table 1 in the main text.

E Effect of the substitution model and branch swapping algorithm
Different substitution models can be used to build phylogenetic trees. We built a phylogeny from the 'MSM strict' dataset using a maximum likelihood approach with different models and estimated K for spVL. We found that the substitution model has little effect on the heritability value (Table S8). We also built trees using the best of the NNI or SPR algorithm for branch swapping in PhyML (with a GTR substitution model). In this case, we found that K = 0.585 with a p-value of 4e-3 for spVL.

F Measuring heritability assuming stabilising selection
As explained in the Methods below, we used another estimator, d, which assumes a stabilising selection model of evolution instead of a Brownian model of evolution. As expected [1], we find similar qualitative values for d and K (Table S9). Quantitatively, the values are different but this is to be expected since d does not correct for the shape of the tree, which makes comparisons among trees more difficult.

G Effect of tree size on the estimator
In the main text, we mention the possibility that the value of our estimators for phylogenetic signal could be affected by tree size. To further test this assumption, we calculated λ for 80 trees of various sizes (average size was 274 tips with a standard deviation of 107) generated by the algorithm described in the Material and Methods of the main text. This allows us to simulate a process with known heritability. Here, we look at the percentage of change between the expected heritability measure and the measure observed. Using a generalised linear model, we look for the effect of tree size and of the known heritability value. In table S10, we show that tree size does have a slight effect on the accuracy of λ (the larger the tree, the more accurate the estimate is). This increase might be due to the fact that larger trees also tend to exhibit an artificially high signal [1]. Here, this effect, however, is much smaller than the effect of the heritability value. If this value is too low, the accuracy of the estimation of λ decreases. This suggest that low heritability values might be the main constraint that prevent us from detecting signal in the 'MSM lib' and the 'all strict' datasets.
H Link between spVL and drug resistance Some drug resistance mutations are known to have fitness costs and, since drug resistance is likely to be a highly heritable trait, one could legitimately be worried that the heritability we detect for spVL is a side effect of the heritability of prAZT. To rule out this effect, we looked for a correlation between drug resistance and spVL in our MSM strict dataset (and also in the liberal dataset). In this case, prAZT was determined on the whole pol sequence and not only on the few positions known to be associated with drug resistance because we were not using the sequences to build a phylogeny. We found no significant correlation between the two using a glm ( Pr(> |t| = 0.28, Figure S3). Therefore, it is very unlikely that heritability of spVL is due to variations in resistance to AZT (but other drug resistance mutations could have an effect).
Note that we did not use the phylogenetic comparative approach for this regression because prAZT is not a normally distributed trait.

I Confounding factors
As we show below in further details, the traits we measure, such as set-point viral load (spVL), can vary with patient age, sex or transmission group. These confounding factors are not specific to our dataset [4] and they are also observed in other cohorts than the SHCS (for a review, see [5]). Confounding factors can affect our estimate of heritability because, for instance, patients from the same transmission group have been shown to cluster on the phylogeny [6]. Figure S4A shows that, in the SHCS, men having sex with men (MSM), are significantly older than heterosexuals (HET), who are older than intravenous drug users (IDU). Also, as shown on Figure S4B, set-point viral load is significantly lower for heterosexuals. Men in the SHCS tend to be older than women ( Figure S4C) and to have a higher spVL ( Figure S4D). Overall, spVL tends to increase with age in the liberal dataset ( Figure S4E in red) but not when the dataset is narrowed to MSM or to patients fitting the strict spVL criterion. Finally, the largest proportion of the patients included in this study are MSM ( Figure S4F), which reflects the structure of the SHCS [4].

I.1 Factors affecting spVL
In our largest dataset, spVL is affected by patient sex (males tend to have a higher spVL), age (older patients have higher spVL) and transmission group (MSM and IDU have higher spVL than HET). Finally, the selection criterion that we use to minimise within-host patient variance in viral load also affects spVL (this criterion tends to remove patients with a spVL lower than the average). This is summarised in Table S11.
When we only consider the MSM transmission group (Table S12), we find that spVL is mainly affected by the spVL selection criterion and the patient age. Again, MSM patients with less variance in viral load tend to have slightly higher spVL.
If we consider the 'strict' dataset (Table S13), we find no significant effect on spVL. This suggest that the improvement in detection of spVL heritability in the MSM strict dataset is due to the phylogeny.
Finally, in the 'MSM strict' dataset (Table S14), none of the factors influences spVL.

MSM strict dataset
Probability of resistance to AZT (using the whole pol sequence) Log(spVL) Figure S3. spVL as a function of prAZT in the MSM strict dataset.

I.2 Factors affecting dsCD4
If we apply a similar approach to dsCD4 (Table S15), we find that this trait is strongly affected by the transmission group (IDU have a lower dsCD4). Note that the confounding factor comes from men in the IDU transmission group. Also, co-infection by hepatitis C can affect dsCD4 in IDU.
In the MSM strict dataset (Table S16), we only get biases linked with STDs (syphilis or HBV). This means that phylogenetic signal for dsCD4 could be overestimated. However, we do not detect any signal (see the main text).

I.3 Factors affecting prAZT
If we apply a similar approach to prAZT (Table S17), we find that, as dsCD4, this trait is strongly affected by the transmission group (IDU tend to be more resistant to AZT).

J Correlations between traits
To study correlations between traits with the comparative approach we make some assumptions about the trait distributions [7]. Here, spVL and dsCD4 are normally distributed in the population, as shown on Figure S5. The prAZT distribution was strongly biased towards 0, which could alter the correlations involving this trait. This is why we ignored this trait in the analyses. Table S18 summarises all the correlations we tested in our datasets using different methods to correct for phylogenetic dependence. All the correlations involving the CD4 were estimated using only patients with at least 5 CD4 counts during the asymptomatic phase. The sample sizes were 65, 99, 304 and 478 instead of 134, 230, 404 and 661.    Table S18. Regression between traits with or without correction for phylogenetic signal. OLS is the ordinary generalised least square (using the gls tool in R), RegBM indicates a correction based on the tree assuming brownian motion [8,9], RegP indicates a correction based on the tree using Pagel's λ [10] and RegBM indicates a correction following [11]. For RegP and RegBM the parameter describing the amount of signal was assumed to be 1 but the value has little effect on the results. OLS spVL * * * and dsCD4 * * * risk groups RegBM spVL * * * and dsCD4 aa RegP spVL * * * and dsCD4 * * * all liberal OLS spVL * * * and dsCD4 * * * RegBM spVL * * * and dsCD4 aa RegP spVL * * * and dsCD4 * * *