The authors have declared that no competing interests exist.
Conceived and designed the experiments: AC PM NB AT. Performed the experiments: SD BJ MM HFG AC. Analyzed the data: PM NZ IB AC. Contributed reagents/materials/analysis tools: PM SD BJ NZ IB MM HFG NB AT AC. Wrote the paper: PM NB AT AC. Developed the web service: IB.
HIV-1 infects CD4+ T cells and completes its replication cycle in approximately 24 hours. We employed repeated measurements in a standardized cell system and rigorous mathematical modeling to characterize the emergence of the viral replication intermediates and their impact on the cellular transcriptional response with high temporal resolution. We observed 7,991 (73%) of the 10,958 expressed genes to be modulated in concordance with key steps of viral replication. Fifty-two percent of the overall variability in the host transcriptome was explained by linear regression on the viral life cycle. This profound perturbation of cellular physiology was investigated in the light of several regulatory mechanisms, including transcription factors, miRNAs, host-pathogen interaction, and proviral integration. Key features were validated in primary CD4+ T cells, and with viral constructs using alternative entry strategies. We propose a model of early massive cellular shutdown and progressive upregulation of the cellular machinery to complete the viral life cycle.
Viral pathogens, such as HIV-1, are fully dependent on the cellular machinery to complete the replication cycle. The cell offers a permissive environment, and deploys a number of antiviral defense strategies. The present work follows the process of infection of the cell with simultaneous measurements of viral replication intermediates together with the concurrent assessment of the host transcriptional changes. The main observation is that the cell undergoes a profound modification of its physiology, with a marked early decrease in expression of several thousand genes, followed by a more discrete increase in the expression of sets of genes that may contribute to the success of the viral replication program. The cell system used in this study shows a limited response of paradigmatic cellular defense genes. Key features of the experimental model were validated in primary cells and with different viral vectors. The data and model generated here constitute a resource that can be used for the assessment of single gene responses to HIV-1 infection, and as a comparative reference for the understanding of other viral and cellular programs, such as those implicated in successful defense against viral infection or in latency.
The life cycle of HIV-1 and its interaction with the host cell has been extensively studied
Here, we jointly investigated, through repeated measurements in time, the dynamics of viral products and cellular responses in a model of universal cell infection (
The aim of this project was to create a first model of the productively infected cell by capturing the dynamics of all expressed host genes, concomitantly with viral replication steps. Integration of the cellular and viral data was achieved through rigorous mathematical approaches. The analyses underscored the features of the successful viral replication occurring despite a profound perturbation of the cell at the transcriptional level. Data are provided as a fully interactive web resource to allow reader-specific queries.
Progression of the viral life cycle was characterized through quantitative measurement of nine species of viral intermediates (
Temporal expression patterns of 7,991 genes modulated in concordance with key steps of viral replication (panel
High-throughput RNA sequence analysis identified 10,559 genes and 399 miRNAs expressed in the experimental system. Fifty-two percent of the overall variability in the transcriptome was explained by linear regression on the three main phases of the viral life cycle as identified by the viral progression modeling, namely reverse transcription, integration, and late phase (
Downregulation of cellular genes was generally early (4 hours post-infection), profound, and persistent throughout the experiment (
In contrast to the downregulated genes, upregulation occurred progressively and at later time points (
We further examined the pattern of expression of host genes reported to interact with HIV-1 proteins. We first analyzed the expression profile of 443 genes previously identified in a screen of physical interactions of all 18 HIV-1 proteins with human factors
Transcription factors and miRNAs are two key components of transcriptional regulation. Over two thirds of the 18 co-regulated gene clusters exhibited significant overrepresentation of the putative targets of one or more transcription factors or miRNAs. Several major transcription factor genes were downregulated along with their corresponding targets. For example, 1,080 (23%) of the downregulated genes were targets of the large-scale transcriptional regulators SP1, MAZ, and ELK1, which were themselves found to be downregulated (
Many chromosomal regions were enriched in gene clusters, suggesting location-specific co-regulation. We investigated the possibility that such regional gene expression profiles are influenced by the spatial pattern of HIV-1 integration into the host genome. We identified 40,430 unique viral integration sites. Consistent with previous work
One of the difficulties in trying to study HIV infection in cultured cells, as compared with what may happen
Principal component analysis is used to explore the overall variance structure of the transcriptome datasets. With each point representing a whole transcriptome sample, the figure presents the transcriptome of cells that were universally infected (HIV), cells exposed to heat-inactivated virus (Heat-inactivated), cells exposed to a mixture of 1∶10 infectious/heat-inactivated virus (HIV[1/10]), and non-infected cells (Mock). One mock sample failed and is not plotted. The transcriptomes of mock cells and of cells exposed to heat-inactivated virus clustered together across the top principal components. Infected cells, on the other hand, spread away from the mock space as infection progressed, with the most distant point corresponding to the latest time point (24 h). The 1/10 mixture of infectious and heat-inactivated material occupies the intermediate space. Clustering of the two-hour samples corresponds to the end of cell exposure to the virus or control materials.
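The projection underlying such a plot can be illustrated with a minimal sketch, assuming samples are rows of a (samples × genes) matrix of log-scaled expression values. The function name `pca_scores` and the toy data are hypothetical and are not taken from the analysis pipeline used in this study; the sketch only shows how samples that drift progressively along one direction separate from a tight control cluster on the first principal component.

```python
import numpy as np

def pca_scores(expression, n_components=2):
    """Project samples (rows) onto the top principal components.

    expression: array-like of shape (n_samples, n_genes).
    Returns (scores, explained_variance_ratio) for the requested components.
    """
    X = np.asarray(expression, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each gene across samples
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    variance = s ** 2                            # proportional to variance per component
    return scores, variance[:n_components] / variance.sum()

# Hypothetical toy data: 4 control-like samples, then 4 samples that move
# further along a single random direction (standing in for infection time).
rng = np.random.default_rng(0)
n_genes = 50
direction = rng.normal(size=n_genes)
mock = rng.normal(scale=0.1, size=(4, n_genes))
infected = np.array([t * direction for t in (1, 2, 3, 4)])
infected = infected + rng.normal(scale=0.1, size=(4, n_genes))

scores, ratio = pca_scores(np.vstack([mock, infected]))
mock_center = scores[:4, 0].mean()
distance_from_mock = np.abs(scores[4:, 0] - mock_center)
```

In this toy setting `distance_from_mock` grows with the simulated time index, mirroring the qualitative behavior described for the infected samples. The sign of each component is arbitrary, which is why distances rather than raw scores are compared.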
The experimental system consisted of a highly permissive T cell line (SupT1) and a VSV.G pseudotyped HIV vector to achieve universal infection. To validate our results, we used primary cells and natural viral entry. Activated CD4+ T cells from two donors were transduced with HIV vectors pseudotyped with both VSV.G and CXCR4-tropic envelopes. As expected, the rate of infection of primary cells was lower than that of the T cell line (
RT-qPCR was used to validate key patterns of expression using heat-inactivated virus, primary cells, and natural viral envelope. (
Research on the infected cell generally follows the paradigm of “single gene, single interaction”. However, this approach fails to fully capture and quantify the complexity of the system. In contrast, the non-reductionist study presented here reflects the intricate cellular response to infection where, at the transcription level, a large proportion of genes are modulated in concordance with key steps of viral replication. As such, this work provides a reference resource on the viral life cycle that can be contrasted across cellular systems and viral strains, and also across diverse pathogens. The approach should be extended to study the establishment of and reactivation from viral latency
HEK293T cells were co-transfected with 15 µg pNL4-3ΔEnv-GFP (NIH AIDS Research and Reference Reagent program, Cat. #11100) and 5 µg pMD.G, using the calcium phosphate method (Invitrogen). pNL4-3ΔEnv/GFP encodes the HIV vector segment with a 903 bp deletion in the
SupT1 cells (5×10⁶ cells) were either mock treated or infected with 15 µg p24 equivalent of HIV-based vector by spinoculation at 1500 g for 30 min at room temperature, in the presence of 5 µg/ml polybrene (Sigma), in 400 µl final volume – for a total of 72 tubes for the mock condition and 72 tubes for the infected condition (
DNA was extracted using the DNeasy Blood and Tissue kit (Qiagen), and quantified using a Nanodrop-1000 spectrophotometer (Nanodrop). Viral DNA forms (early RT, late RT, 2-LTR) were assessed by qPCR as described in
Cell samples were stored in RNALater at 4°C until RNA extraction with Trizol Reagent (Life Technologies). Total RNA was quantified using Nanodrop-1000 spectrophotometer (Nanodrop) and Total RNA Nanochip (Agilent). Viral splice variants were assessed by one-step RT-qPCR (Qiagen) in duplicate using different pairs of primers and probe (
The temporal dynamics of the measured viral life cycle markers were modeled explicitly using an ordinary differential equation. We defined the true (noise-free) abundance of the marker,
Total RNA was extracted using Trizol (Invitrogen). Quality was assessed by capillary electrophoresis using a total RNA NanoChip in the 2100 Bioanalyzer (Agilent). RNA was quantified using Qubit fluorometer (Invitrogen). Gene expression profiles were obtained by generating a SAGE library followed by high-throughput sequencing using SOLiD 3 (Sequencing by Oligonucleotide Ligation and Detection) system technology (Applied Biosystems)
SAGE reads were aligned to the reference genome using Bowtie version 0.12.7
Total RNA was extracted using Trizol (Life Technologies). Quality was assessed by capillary electrophoresis using a small RNA chip in the 2100 Bioanalyzer (Agilent). Library preparation was performed using the Total RNA-Seq kit (Life Technologies) starting with 2 µg of total RNA and according to manufacturer's instructions. Briefly, total RNA was ligated to adapters, reverse transcribed, purified, size selected on gel, amplified by PCR with barcoded primers, purified and size selected on gel (110–130 bp) again. Emulsion PCR (ePCR) and SOLiD sequencing were performed as described for SAGE samples.
Low-quality reads (as identified by the ABI standard protocol) and reads with ambiguous bases were removed. Using the FASTX-Toolkit (
In order to characterize the association between the sequence of viral events and cellular gene expression profiles, we examined the linear correspondence of host gene (including mRNA and miRNAs) expression patterns to the three main phases of the viral life cycle, namely reverse transcription, integration, and late phase. Each of the three columns in the feature matrix,
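The regression of host expression profiles on the three viral phases can be sketched as an ordinary least-squares fit. This is a minimal illustration under stated assumptions, not the model specification used in this study: the intercept term, the sigmoid phase profiles, the time grid, and the names `fit_phase_regression` and `sigmoid_phase` are all hypothetical. The sketch shows how a three-column feature matrix yields per-gene coefficients and a fraction of variance explained, both per gene and pooled over all genes.

```python
import numpy as np

def fit_phase_regression(features, expression):
    """Least-squares fit of each gene's time course on the phase features.

    features:   (n_timepoints, 3) columns for RT, integration, late phase.
    expression: (n_timepoints, n_genes); every gene is assumed to vary over time.
    Returns (coef, r2_per_gene, r2_overall); coef[0] is the intercept (an assumption).
    """
    F = np.asarray(features, dtype=float)
    Y = np.asarray(expression, dtype=float)
    A = np.column_stack([np.ones(len(F)), F])          # intercept + three phases
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    ss_res = ((Y - A @ coef) ** 2).sum(axis=0)         # residual sum of squares per gene
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)   # total sum of squares per gene
    return coef, 1.0 - ss_res / ss_tot, 1.0 - ss_res.sum() / ss_tot.sum()

# Hypothetical toy inputs: 12 time points and smooth, staggered phase profiles.
hours = np.arange(2, 26, 2, dtype=float)               # 2, 4, ..., 24 h

def sigmoid_phase(center, width=1.5):
    return 1.0 / (1.0 + np.exp(-(hours - center) / width))

features = np.column_stack([sigmoid_phase(4), sigmoid_phase(10), sigmoid_phase(18)])

rng = np.random.default_rng(1)
gene_a = -2.0 * features[:, 0] + rng.normal(scale=0.01, size=hours.size)  # tracks phase 1
gene_b = rng.normal(scale=1.0, size=hours.size)                            # unrelated noise
coef, r2, overall = fit_phase_regression(features, np.column_stack([gene_a, gene_b]))
```

Here `gene_a` is almost fully explained by the first phase with a negative coefficient (an early, persistent decrease), whereas `gene_b` is not systematically explained. The pooled value `overall` is the analogue of the overall fraction of transcriptome variability explained by the viral life cycle.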
Gene expression profiles over the 24 h observation time period were clustered to identify co-regulated gene sets. For this purpose, we analyzed all 7,991 genes that were significantly described by the regression model, i.e., for which at least one regression coefficient in
Enrichment analysis was performed using Fisher's exact test based on the hypergeometric distribution to test for over-representation of specific gene sets in the clusters. Enrichment tests were performed in two ways, first for the major regulation groups, namely, upregulated, downregulated, and mixed, and second for each of the 18 gene clusters individually (
“Location-specific”: Genes were labeled according to two separate types of co-localization, one classifying genes by their physical position on the chromosomal bands, and the other according to the Gene Ontology (GO) cellular component classification of the genes. Both annotations are available from the Molecular Signatures Database (MSigDB ver. 3,
“Sequence-based”: We checked sequence-based regulations by analyzing sets of genes that share the same transcription factor binding motif as defined in the TRANSFAC database (version 7.4,
“Functional”: We used canonical pathway classification of genes according to the Reactome database (ver. 40,
“HIV-1-related”: We compiled a list of previously reported HIV-1 related genes. This list included HIV-1 host factors reported in
The FDR was controlled according to the procedure in
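For illustration, the over-representation test can be written as the upper tail of a hypergeometric distribution, which is equivalent to a one-sided Fisher's exact test on the corresponding 2×2 table. The sketch below is a minimal example with hypothetical counts and a hypothetical function name (`enrichment_pvalue`); it does not reproduce the gene sets of this study and does not include the multiple-testing correction.

```python
from math import comb

def enrichment_pvalue(n_universe, n_in_set, n_in_cluster, n_overlap):
    """One-sided over-representation p-value.

    Returns P(X >= n_overlap), where X is the number of annotated genes
    (n_in_set out of n_universe) found among n_in_cluster genes drawn
    without replacement.
    """
    total = comb(n_universe, n_in_cluster)
    largest_overlap = min(n_in_set, n_in_cluster)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_in_cluster - k)
        for k in range(n_overlap, largest_overlap + 1)
    )
    return tail / total

# Hypothetical counts: 1,000 expressed genes, 50 of which belong to a gene set;
# a cluster of 100 genes contains 15 members of that set (5 expected by chance).
p_enriched = enrichment_pvalue(1000, 50, 100, 15)

# Small case that can be checked by hand: 1 favorable draw out of C(4, 2) = 6.
p_small = enrichment_pvalue(4, 2, 2, 2)
```

Exact integer arithmetic is used for the binomial coefficients, so the result stays accurate for large gene universes. In practice the resulting p-values would then be adjusted across all tested gene sets.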
Identification of viral integration sites in the 24 h time point sample was performed as previously described
The absolute quantification of integrated HIV-1 copies was done by qPCR essentially as described in
Let
CD4+ T cells were isolated from two healthy blood donor buffy coats and stimulated using anti-CD3/anti-CD28 and IL-2 as described in
Total RNA was extracted using the Illustra RNAspin mini isolation kit (GE Healthcare). RNA (2 µg) was reverse transcribed using High-Capacity cDNA Reverse Transcription (Life Technologies). After cDNA purification (Invitek), DNA was quantified using Nanodrop-1000 (Nanodrop) and diluted to 5 ng/µl. Fourteen representative genes from upregulated and downregulated clusters were selected and quantified by qPCR using 10 ng cDNA, and commercially available Gene Expression Assays (Applied Biosystems,
Includes material and methods for modeling viral progression, material and methods for clustering of gene expression time courses, 11 supporting figures and 4 supporting tables.
(PDF)
We thank Marzanna Künzli and Sirisha Aluri from the Functional Genomics Center Zurich for high throughput sequencing, Nirav Malani and Frederic Bushman for integration site genome mapping, and Jacques Fellay, David Gfeller and Paul McLaren for discussion. We want to remember our dear colleague and friend Marek Fischer, PhD, who contributed to this work and who died in December 2010. The following reagent was obtained through the NIH AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, NIH: pNL4-3-deltaE-EGFP (no. 11100) from Haili Zhang, Yan Zhou, and Robert Siliciano