Skip to main content
Advertisement
  • Loading metrics

The impact of HTLV-1 expression on the 3D structure and expression of host chromatin

  • Hiroko Yaguchi,

    Roles Conceptualization, Investigation, Methodology, Writing – original draft, Writing – review & editing

    Affiliation Department of Infectious Diseases, Faculty of Medicine, Imperial College London, London, United Kingdom

  • Anat Melamed,

    Roles Data curation, Formal analysis, Methodology, Software, Writing – review & editing

    Affiliation Department of Infectious Diseases, Faculty of Medicine, Imperial College London, London, United Kingdom

  • Saumya Ramanayake,

    Roles Investigation, Methodology, Writing – review & editing

    Affiliation Department of Infectious Diseases, Faculty of Medicine, Imperial College London, London, United Kingdom

  • Helen Kiik,

    Roles Investigation, Methodology, Writing – review & editing

    Affiliation Department of Infectious Diseases, Faculty of Medicine, Imperial College London, London, United Kingdom

  • Aviva Witkover,

    Roles Investigation, Methodology, Writing – review & editing

    Affiliation Department of Infectious Diseases, Faculty of Medicine, Imperial College London, London, United Kingdom

  • Charles R. M. Bangham

    Roles Conceptualization, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing

    c.bangham@imperial.ac.uk

    Affiliation Department of Infectious Diseases, Faculty of Medicine, Imperial College London, London, United Kingdom

Abstract

A typical HTLV-1-infected individual carries >104 different HTLV-1-infected T cell clones, each with a single-copy provirus integrated in a unique genomic site. We previously showed that the HTLV-1 provirus causes aberrant transcription in the flanking host genome and, by binding the chromatin architectural protein CTCF, forms abnormal chromatin loops with the host genome. However, it remained unknown whether these effects were exerted simply by the presence of the provirus or were induced by its transcription. To answer this question, we sorted HTLV-1-infected T-cell clones into cells positive or negative for proviral plus-strand expression, and then quantified host and provirus transcription using RNA-seq, and chromatin looping using quantitative chromosome conformation capture (q4C), in each cell population. We found that proviral plus-strand transcription induces aberrant transcription and splicing in the flanking genome but suppresses aberrant chromatin loop formation with the nearby host chromatin. Reducing provirus-induced host transcription with an inhibitor of transcriptional elongation allows recovery of chromatin loops in the plus-strand-expressing population. We conclude that aberrant host transcription induced by proviral expression causes temporary, reversible disruption of chromatin looping in the vicinity of the provirus.

Author summary

The human T cell leukemia virus HTLV-1 causes an aggressive leukemia or lymphoma in ~5% of people infected with the virus, and a further 1–4% develop a chronic inflammatory disease that leads to progressive paralysis of the legs. HTLV-1 is a retrovirus, like HIV, and these viruses insert themselves–as a ‘provirus’—into the DNA of the T lymphocytes that they infect: this is largely why the viruses are very difficult to eradicate. The provirus remains dormant for most of the time, but it can be reactivated when the T cell meets a change in temperature or pH, or a number of other conditions.

We recently discovered that HTLV-1, when it is integrated into the host DNA, changes the 3D structure of the genome in the infected cell, and interferes with the normal function of the host genes that lie near HTLV-1 in the genome. What we have now found is that the change in the 3D structure of the genome is caused simply by the presence of the provirus, but that when the provirus is reactivated the 3D structural changes are temporarily removed, and nearby host genes can be abnormally activated by the provirus.

Introduction

The 3D structure of chromatin is important in many biological processes, including transcription regulation, DNA replication, cell cycle and differentiation. The CCCTC-binding factor (CTCF) is a key player in chromatin loop formation [1,2]. CTCF binds a non-palindromic 20-nucleotide DNA motif at ~50,000 sites in the human genome [2]; two CTCF molecules bound to different genomic sites can dimerize, forming the boundary of chromatin loops extruded through the cohesin complex. Chromatin looping plays a critical part in gene regulation by regulating the contacts between specific enhancers and promoters.

Human T cell leukemia virus type 1 (HTLV-1, also known as human T-lymphotropic virus type 1), mainly infects CD4+ T-cells in vivo. Most carriers remain asymptomatic, but in 5% to 10% of the infected individuals, HTLV-1 infection leads to either an aggressive T-cell malignancy known as adult T-cell leukemia/lymphoma (ATL), or a chronic progressive neuro-inflammatory condition called HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP; referred to hereafter as HAM) [3]. HTLV-1 persists in the host mainly by clonal proliferation of infected cells, and a typical host carries >104 long-lived HTLV-1-infected T-cell clones [4]: each clone can be distinguished by the unique integration site of the single-copy provirus in the host genome [5].

The HTLV-1 provirus has two strands (Fig 1). The plus strand, transcribed from the 5′ LTR, encodes the structural proteins, the transcriptional transactivator protein Tax, the regulator of mRNA splicing and transport Rex, and minor accessory proteins. The minus strand, transcribed from the 3′ LTR, encodes the regulatory protein HBZ [6].

thumbnail
Fig 1. Structure and expression of the HTLV-1 provirus in the host genome.

(A) Definitions regarding classification of the direction of transcription. Here we use ‘same sense’ to denote transcription from the same strand of the host genome as the proviral plus strand (e.g. tax gene). We refer to flanking host transcription upstream and downstream of 5′LTR-driven plus-strand proviral expression respectively as (a) same (plus) sense, 5′ side of the provirus and (b) same sense, 3′ side of the provirus. Similarly, we refer to transcription upstream and downstream of 3′LTR-driven minus-strand proviral expression respectively as (c) antisense, 3′ side of the provirus and (d) antisense, 5′ side of the provirus. In this study we focus on the effects of plus-strand expression. (B) Diagram of HTLV-1 proviral genome and splicing pattern of the regulatory genes tax (encoded in plus strand) and HBZ (encoded in minus strand); red boxes represent exons. The 4C viewpoint containing the CTCF binding site (BS) is shown.

https://doi.org/10.1371/journal.ppat.1011716.g001

Single-cell heterogeneity in HTLV-1 proviral expression has been shown in naturally HTLV-1-infected T cell clones [79]. At a given time, a small proportion of cells express intense bursts of tax, the proportion varying between the clones [7] (Tables 1 and 2). By contrast, the minus strand is expressed in approximately 50% of circulating infected T cells at a given time [8]. Each HTLV-1-infected T-cell clone has its own pattern of proviral expression, that is, the frequency, intensity and duration of the transient transcriptional burst of the proviral plus-strand [1012]. Since the HTLV-1 proviral sequence varies little within the host, these clone-specific differences are thought to be largely due to the unique genomic integration site of the provirus; other factors may include the antigen specificity and epigenetic modifications of the host cell.

thumbnail
Table 2. Tax expression of HTLV-1-infected T cell clones.

https://doi.org/10.1371/journal.ppat.1011716.t002

Disorganization of 3D chromatin structure can cause diseases by rewiring interactions between genes and regulatory elements [13], and certain mutations in CTCF are linked with human disease [1416]. We previously showed that the HTLV-1 provirus contains a CTCF binding site (BS): the provirus creates novel loops with the host genome, inducing transcriptional deregulation in the host genome flanking the provirus [10,17]. However, it remained unknown whether these effects were exerted simply by the presence of the provirus or by its transcription.

To answer this question, we sorted naturally-infected HTLV-1+ T cell clones to obtain subsets of provirus-expressing cells and non-expressing cells, and analysed the effects of proviral expression on chromatin looping, using quantitative circular chromosome conformation capture (q4C) [10], and on host transcription, using RNA-seq.

Results

HTLV-1 proviral plus-strand expression reduces chromatin looping between the provirus and the host genome

We hypothesized that the chromatin loops between the provirus and host genome region are regulated by expression of the provirus. To test this hypothesis, we focused on the proviral plus-strand, which is expressed in intense intermittent transcriptional bursts [7]. We used the HTLV-1 transcriptional transactivator protein Tax as a marker of plus-strand expression. T cell clones naturally infected with HTLV-1, each of which has a unique integration site (Table 1), were crosslinked with paraformaldehyde, flow-sorted into Tax+ and Taxpopulations (S1 Fig), and subjected to q4C assay, using a fragment of the proviral sequence containing its CTCF binding site as the q4C viewpoint [10] (Fig 1B).

In the Taxpopulation of clones 3.60 and HA1, we identified q4C peaks (long-range chromatin loops between the provirus and the host genome) (Fig 2A): the identified peaks often overlapped CTCF binding sites, as previously reported in the study using unsorted T cells [10].

Unexpectedly, the frequency of the chromatin loops identified in Taxcells was much lower in Tax+ cells, although the technical peak at the viewpoint (VP), which is always present in a successful 4C-seq assay [18], remained in Tax+ cells (Fig 2A).

thumbnail
Fig 2. HTLV-1 plus-strand expression results in fewer chromatin loops (number of q4C peaks) between provirus and host genome.

(A) q4C profiles of Taxand Tax+ cells of two different clones (3.60 and HA1). For each clone, the top panel depicts the q4C profile in the 5′ and 3′ host genome flanking the provirus (two biological duplicates), quantified as the normalized frequency of ligation events in overlapping windows (window width 10 kb, step 1 kb). On the horizontal axis, positive values denote positions extending from the 3′ LTR side of the provirus; negative values denote positions 5′ of the 5′LTR. VP–viewpoint in q4C (proviral integration site). Diamonds mark the positions of reproducible chromatin contact sites identified by the peak calling algorithm. CTCF track–open arrowheads denote positions of CTCF-binding sites (BS); the filled arrowhead denotes the CTCF-BS in the provirus. Genes track shows RefSeq protein-coding genes in the flanking host genome. The q4C profiles of remaining clones are shown in S2 and S3 Figs. (B) Number of identified peaks in non-expressing (Tax- or GFP-) and expressing (Tax+ or GFP+) subsets isolated from 6 clones: total (all peaks) and (C) peaks with or without a CTCF binding site.

https://doi.org/10.1371/journal.ppat.1011716.g002

Tax protein is a transcriptional transactivator that stimulates transcription of both HTLV-1 proviral genes and many host genes. To exclude the possibility that the anti-Tax antibody used in intracellular staining altered chromatin looping, we performed q4C assays using T-cell clones transduced with a reporter construct which expresses a modified EGFP with a half-life of ~2h (d2EGFP-TBX4B) when stimulated by Tax protein [11]. The GFP signal intensity was positively correlated with Tax expression [12]. Provirus-expressing (GFP+) cells and non-expressing (GFP) cells were flow-sorted after fixation in 1% paraformaldehyde and subjected to the q4C assay. Consistent with the results from intracellular Tax staining (Fig 2A), GFP+ cells also showed fewer chromatin loops compared with GFP cells (S2 Fig), and the 4C profiles of GFP+ d2EGFP-TBX4B cells were consistent with those of sorted Tax+ TBX4B cells. We conclude that the lower frequency of chromatin looping observed in Tax+ cells was associated with proviral transcription and was not caused by staining Tax protein. Also, the observation of similar changes in looping frequency in both the transduced and non-transduced cells indicates that transduction per se did not affect the results of q4C analysis.

A decrease in the number of q4C peaks at distant chromatin contacts in the host genome was observed in provirus plus-strand-expressing subsets in all T cell clones investigated (Figs 2B, S2 and S3), in regions either with or without CTCF binding sites (Fig 2C).

To corroborate these observations by an independent technique, 3C-qPCR was performed to quantify the frequency of the interaction between the provirus and the two host regions where the principal 4C peaks were identified in the Taxpopulation of clone 11.63 (S3 Fig), using specific primers designed to amplify the junction between the provirus and the host genome (S4A Fig and S1 Table). The results showed a reproducible significant decrease in chromatin contact frequency between the provirus and both peaks in Tax+ cells (S4C and S4D Fig), although there was no difference in the contact frequency observed within the provirus (S4B Fig).

Next, to quantify the difference in chromatin loop formation between provirus-expressing and non-expressing cells, peak regions were defined in the non-expressing cell population, and peaks were then sought at the same sites in the provirus-expressing cell population. (Fig 3A) At each site, we then compared the normalized peak height of the q4C profile in non-expressing cells with that in provirus-expressing cells. The results show that HTLV-1 plus-strand expression resulted in a significant reduction in the frequency of these chromatin loops, especially on the 3′ side of the integration site (Fig 3B). This reduction was observed in both regions with and without CTCF binding sites (Fig 3C).

thumbnail
Fig 3.

HTLV-1 plus-strand expression results in a significant reduction in contact frequency (q4C peak height) with the host genome (A) Schematic to show quantification of change in peak frequency with plus-strand expression. First, reproducible q4C peaks were identified in the non-expressing (Tax-negative) cells, as previously described [10]. Then, peaks were sought (using the same algorithm) in each corresponding genomic location in the plus-strand-expressing cell population. (B) Normalized peak height of q4C peaks identified in all clones analysed, respectively on the 5′ and 3′ sides of the provirus. Peak height is defined as the maximum number of ligation events per region (normalised to total ligation events in a sample) calculated for each peak region. p = 0.027 and p = 9.7 * 10−13 for upstream and downstream regions, respectively (unpaired two-tailed Wilcoxon test. (C) Normalized peak height of q4C peaks identified in all clones analysed, comparing peaks that contain a CTCF site and those without CTCF sites. Peak height was significantly greater in non-expressing cells, both in peaks with a CTCF site and those without (p = 0.0018 and p = 1.8 * 10−13, respectively, unpaired two-tailed Wilcoxon test).

https://doi.org/10.1371/journal.ppat.1011716.g003

HTLV-1 proviral plus-strand expression drives host transcription downstream in cis

Throughout this work, we refer to the direction of flanking host transcription relative to the orientation of the provirus. Thus, host transcription upstream (5′) of the 5′LTR, in the same sense as the proviral plus-strand, is denoted as ‘same-sense, 5′ side of provirus’ (Fig 1 arrow a), and transcription in the same sense 3′ of the 3′LTR as ‘same-sense, 3′ side of provirus’ (Fig 1 arrow b). Similarly, we refer to host transcription in the same sense as the proviral minus-strand as ‘antisense, 3′ side of provirus’ (Fig 1 arrow c) and ‘antisense, 5′ side of provirus’ (Fig 1 arrow d).

Next, we asked whether the aberrant host transcription is associated with proviral transcription or simply with the presence of the provirus, regardless of transcription. The landscape of host transcription flanking the provirus has been investigated in ATL cases [19,20], in which there is usually a dominant single malignant clone. However, there are typically widespread epigenetic and transcriptional abnormalities in malignant cells, and expression of the proviral plus-strand is frequently lost in ATL clones [21]. We previously reported abnormal in cis host transcription near the integration site in naturally-infected, non-malignant HTLV-1-positive T-cell clones [10], but since only a fraction of cells express the proviral plus-strand at a given instant, we could not distinguish whether the abnormal host transcription was associated with proviral plus-strand expression, or whether there was constitutive activation irrespective of the plus-strand burst, perhaps by the proviral enhancer.

RNA isolated from fresh, unfixed cells is required for optimal RNA-seq results. However, intracellular staining of Tax protein requires prior fixation. Therefore, to elucidate whether the aberrant host transcription was induced by proviral plus-strand transcription, we used d2EGFP-transduced clones (11.63, 11.50 and TBX4B) (see Table 1) [11], from which GFP+ (Tax-expressing) and GFP(Tax-non-expressing) cells were isolated by live-cell sorting; total cellular RNA was extracted and subjected to a stranded RNA-seq analysis. The proviral plus-strand was highly transcribed in all 3 clones in GFP+ cells but not in GFPcells (Fig 4A and 4B), confirming that GFP+ cell-sorting enriched a cell population with active HTLV-1 plus-strand transcription.

thumbnail
Fig 4. Proviral and host transcription and splicing in live-sorted T cell clones.

(A) RNA-seq analysis of HTLV-1 proviral expression in live-sorted d2EGFP clones (TBX4B, 11.50 and 11.63). (B) Tax expression measured by qPCR with primers specific for tax mRNA or 18S ribosomal RNA (18S rRNA). qPCR plots are expression values normalized to 18S rRNA. Data represent a mean of two biological replicates; error bars are SEM. AU—arbitrary units. (C) Host RNA expression 30kb on either side of the proviral integration site. On the horizontal axis, positive values denote positions extending from the 3′ LTR side of the provirus; negative values denote positions 5′ of the 5′LTR. Each row shows the transcription density (normalized RNA-seq read count) flanking that genomic position in the clone indicated at the right-hand side. In each case, transcription orientation and positions are shown relative to the integrated provirus. Read density shown in blue shows transcription in the same orientation as the proviral plus-strand (same sense); red shows transcription in the opposite sense to the proviral plus-strand (antisense). (D) Identification of splice sites of viral-host fusion transcripts in d2EGFP-TBX4B clone cells. Coverage tracks of same sense transcription (blue) and antisense transcription (red) in Integrative Genomics Viewer (IGV). Exons of PNPLA3 in the 3′ side of the integration site are highlighted in yellow. (E) Fusion transcripts between an HTLV-1 plus-strand major splice donor (red, proviral exon H1 or H2) and the canonical splice acceptor site in the host PNPLA3 gene (blue, PNPLA3 exon 3) were identified in GFP+ (HTLV-1 plus-strand-expressing) cells. To identify splice sites of fusion transcripts, reads were aligned to a reference genome (hg19) containing the HTLV-1 provirus (AB513134) genome in the TBX4B clone integration site at chr22:44323198. Fusion transcripts are shown with fused sequences.

https://doi.org/10.1371/journal.ppat.1011716.g004

Host transcription in the same sense on the 3′ side of the provirus near the clone’s respective proviral integration site was greater in GFP+ cells than in GFPcells in all 3 clones (Fig 4C). For example, in clone d2EGFP-TBX4B, HTLV-1 is integrated between exon 2 and exon 3 of the PNPLA3 gene (Fig 4D), which is not normally expressed in T cells (Fig 4C). The PNPLA3 gene (same-sense, 3′ side of the provirus) was highly transcribed in the GFP+ population, but was not transcribed in GFPcells of clone d2EGFP-TBX4B, or in two other clones (d2EGFP-11.50 and d2EGFP-11.63) regardless of Tax expression, suggesting that the observed increase in PNPLA3 mRNA did not result from Tax-mediated transactivation (Fig 4C). Aberrant transcription in the same-sense on the 5′ side of the provirus was also seen in two clones (11.50 and TBX4B) at a low level (Fig 4C and 4D).

In addition to splicing between the 3′LTR and the host genome (antisense, 5′ side of provirus), which has also been reported in ATL cases [19], in plus-strand-expressing cells we found that the exons of 5′LTR (exon H1/exon H2 in Fig 1) were spliced out to form not only normal viral mRNAs but also viral-host fusion transcripts, by fusion to downstream host plus-strand transcripts. For example, in the GFP+ subset of d2EGFP-TBX4B cells, proviral exons H1 or H2 were found to be fused to exon 3 of PNPLA3 (Fig 4D and 4E). The observed events always occurred between a canonical HTLV-1 splice donor and a canonical host splice acceptor. Such splicing events were observed even when no host gene was present in the flanking region (S5 Fig). We then reanalysed the RNA-seq data obtained previously [11] and found a novel transcript in clone Timer-3.60 cells expressing the plus-strand, formed by splicing out of the proviral plus-strand exon H1 and fusion to novel host exons (S5B Fig). These splicing events again followed the canonical GT|AG mRNA processing rule. Fusion was observed between a host splice acceptor and proviral splice donor up to 120 kb away. By contrast, splicing was not observed between a host splice donor and an HTLV-1 splice acceptor; although low-level host transcription in the same-sense on the 5′ side of the provirus was induced by proviral plus-strand expression in two clones (Fig 4C and 4D).

Treatment of T cell clones with an inhibitor of transcriptional elongation allows recovery of chromatin loops in the Tax+ population

The observation of upregulated host transcription near the integration site and decreased chromatin loops in the provirus-expressing cell populations raised the question whether provirus-induced host transcription disrupts chromatin loop formation between the provirus and host genome. To test this hypothesis, we treated the infected cells with flavopiridol, an inhibitor of transcriptional elongation. Flavopiridol was reported to inhibit both elongation and readthrough transcription induced by influenza virus [22]. Treatment of cells with 1 nM flavopiridol for 1.5 hours did not affect Tax protein expression (S6 Fig) so the cells could still be sorted for Tax protein. Flavopiridol treatment also did not reduce tax mRNA expression significantly. However, the level of aberrant host transcription in the same sense, 3′ side of the provirus was significantly decreased (Fig 5A and 5B).

thumbnail
Fig 5. Treatment of T cell clones with a transcriptional elongation inhibitor allows recovery of chromatin loop formation in the Tax+ population.

(A) After treatment with 1 nM flavopiridol (FP) for 1.5 hrs, total RNA was extracted from clone 11.63 and subjected to RT-qPCR for tax and three regions in the 3′ flanking host genome, respectively a: +188 bp, b: +535 bp and c: +3,198 bp from the 3′ end of the provirus. (B) Relative expression intensity (normalized to 18s rRNA) of tax and the host genome at positions a, b and c. Data are mean ± SEM. (N = 3). * P<0.05 (paired t-test). (C) q4C profiles of mock-treated Tax- cells (top track) and Tax+ cells (middle track), and flavopiridol (FP)-treated Tax+ cells (bottom track). Diamonds mark the positions of reproducible chromatin contact sites identified by the peak calling algorithm. Open arrowheads denote positions of CTCF-BS; the filled arrowhead denotes the CTCF-BS in the provirus. Gene track shows RefSeq protein-coding genes in the flanking host genome.

https://doi.org/10.1371/journal.ppat.1011716.g005

After flavopiridol treatment, the cells were stained intracellularly, sorted for Tax protein, and then subjected to q4C assay. While the frequency of chromatin loops between the provirus and the host genome was lower in Tax+ cells than in Taxcells (as in Fig 1 above), chromatin loops remained in Tax+ cells when they were treated with flavopiridol (Fig 5C). We conclude that inhibition of transcription elongation by flavopiridol reduced host transcription and allowed the preservation and restoration of chromatin loops.

Discussion

The results presented here lead to two main conclusions. First, q4C analysis of sorted T cell clones for plus-strand expression revealed that the chromatin loops formed between the provirus and the immediately flanking host genome are reduced or lost during plus-strand proviral transcription. Second, plus-strand proviral transcription causes (1) aberrant host transcription on the same-strand, chiefly on the 3′ side of the provirus and to a lesser extent on the 5′ side; (2) fusion transcripts between provirus and host RNA; the 5′LTR (exon H1/exon H2) was fused to host exons in the same sense on the 3′ side of the provirus, although intra–HTLV-1 splicing remained intact.

In clone TBX4B we observed that transcriptional activation of a host gene more distant (1.4 Mb) from the provirus was associated with expression of the wild-type provirus, but not with the provirus in which the CTCF binding site was knocked out (S7 Fig). We postulate that, while the nearby chromatin loops are diminished or lost during HTLV-1 proviral plus-strand transcription, more long-range CTCF-dependent loops can remain intact. This result (S7 Fig) suggests that the long-range enhancer effect of HTLV-1 requires the CTCF binding to the provirus and active viral transcription. Further investigation will be needed to clarify this mechanism.

The chromatin loops formed by the provirus reported here and previously [10] are made with adjacent regions of the host genome. The frequency of these loops declines rapidly within ~5 Mb of the provirus. These observations are consistent with the size distribution of normal chromatin loops in the human genome [23], and with the mechanism of loop extrusion by which these loops are formed [24]. It is known that contacts can also be made with more distant locations on the genome, including other chromosomes; however, such distant contacts are typically much less frequent, although certain trans-chromosomal contacts may be evolutionarily conserved [25].

Aberrant read-through transcription and splicing is a known feature of retroviral transcription [26]. Previous studies reporting RNA-seq analysis in ATL cases [19,20] showed aberrant antisense expression in the host genome flanking the 5′ side of the provirus; the 3′LTR sequence was spliced out and fused to host exons on the antisense strand on the 5′ side of the provirus. However, aberrant host transcription in the same sense as the proviral plus-strand (either 5′ or 3′ to the provirus), and splicing between the 5′LTR and host exons in the same sense on the 3′ side of the provirus, were rarely observed. In the non-malignant clones in the present study, although viral-host transcripts in the minus-strand were sometimes observed, plus-strand transcription always dominated.

We observed splicing between the 5′LTR and host exons in the plus-strand on the 3′ side of the provirus (Figs 4D, 4E and S5), even when there was no identified host gene in either strand. The observation that proviral-host chimeric RNA splicing events can activate cryptic host exons was recently reported in HIV-1-infected Jurkat T cell clones carrying reporter proviruses integrated into the introns of three cancer-related genes [27]. Our observations show that HTLV-1 plus-strand expression has the potential to disrupt host cell functions by creating novel transcripts, with potential biological functions, under the transcriptional control of the HTLV-1 5′LTR. To explore this possibility, further research is needed to test the stability and activity of these aberrant mRNAs.

Whereas frequent splicing was observed between a proviral donor and host acceptor, we did not observe splicing between a host donor and proviral acceptor, even if the host genome 5′ side region was transcribed. This observation implies that any same-sense host transcription on the 5′ side of the provirus stops at the integration site, whereas LTR-driven transcription can result in read-through into the host genome in the same sense on the 3′ side of the provirus.

CTCF, which binds to the HTLV-1 provirus, can regulate mRNA splicing: further work is required to test the possibility that CTCF bound to the provirus causes aberrant splicing in the flanking host genes.

The results reported here show that the aberrant host transcription induced by the HTLV-1 provirus differs from the host gene transcriptional landscape at the HIV-1 integration site [27]. No aberrant host transcription was observed in the antisense strand in HIV-1-infected cells. By contrast with HIV-1, HTLV-1 can induce host transcription in the same sense on the 5′ side of the provirus (Fig 4C), whereas both proviruses induce strong same-sense transcription on the 3′ side of the provirus. Whereas downstream transcription can be at least partially explained as readthrough transcripts from the provirus, same-sense transcription on the 5′ side of the provirus in an HTLV-1+ clone cannot be explained by readthrough from the 5′LTR. Two observations suggest that this same-sense transcription on the 5′ side of the provirus is due to an enhancer effect of the HTLV-1 5′LTR. First, the aberrant same-sense transcription was seen only in GFP+ cells (Fig 4C and 4D); second, the transcription correlated with the level of plus-strand proviral transcription in timer protein reporter clones (S8 Fig).

An unexpected finding in this study was that aberrant loop formation between the provirus and host chromatin is suppressed by proviral plus-strand transcription. Since CTCF appears to remain bound to the provirus during expression of tax [8], the observed decrease in chromatin looping cannot be attributed to loss of CTCF binding. The observations that proviral transcription increases aberrant transcription and splicing in the flanking host genome (Figs 4 and S5) and that reducing proviral-expression-induced host transcription with an inhibitor of transcriptional elongation allows recovery of chromatin loops in the Tax+ population (Fig 5) suggest that aberrant host transcription induced by the proviral expression disrupts chromatin looping. This inference is consistent with the report by Heinz et al. (2018) [22] that during influenza A virus (IAV) infection the IAV non-structural protein 1 (NS1) induces global inhibition of transcription termination of highly transcribed genes and causes readthrough transcription for hundreds of kilobases, resulting in disruption of chromatin interactions.

Chromatin looping between the HTLV-1 provirus and the host genome is mediated by CTCF binding to the provirus [14]. We postulated that CTCF binding confers a selective advantage on the provirus; however, the mechanism of this putative advantage is not known. At least three non-mutually-exclusive possibilities can be identified. First, through its function as an epigenetic barrier, CTCF may regulate epigenetic modification of the provirus [28] Second, CTCF (again through its barrier function) might block unwanted activation of plus-strand proviral transcription by the recently reported intragenic enhancer [29]. Third, CTCF may control the intranuclear position of the chromatin containing the provirus, which is correlated with selective survival of HTLV-1-infected T cell clones in vivo [30].

Methods

Ethics statement

All donors gave written informed consent in accordance with the Declaration of Helsinki to donate blood samples to the Communicable Diseases Research Tissue Bank, approved by the UK National Research Ethics Service (15/SC/0089).

Cells

The details of HTLV-1-infected T-lymphocyte clones used in this study are shown in Table 1. All clones were derived as previously described [11] from peripheral blood mononuclear cells (PBMCs) of donors attending the National Centre for Human Retrovirology (NCHR) at Imperial College Healthcare NHS Trust, St Mary’s Hospital, London. The identification of genomic insertion sites by LMPCR was described elsewhere [31]. The cells were maintained in RPMI-1640 (Sigma, R0883) supplemented with L-glutamine, penicillin+streptomycin and 20% fetal bovine serum (Gibco, 10500–064) in 5% CO2 at 37°C. IL-2 (Miltenyi Biotec, 130-097-745) was supplemented (100 unit/ml) into the culture twice a week. The integrase inhibitor raltegravir (Selleck Chemicals, MK-0518) was used at 10 μM throughout the culture to prevent secondary infection.

3C and q4C assay of sorted cells

HTLV-1- infected T cell clones were stained with LIVE/DEAD Fixable Near-IR Dead Cell Stain kit (Invitrogen, L34976) to enabling gating on live cells and then crosslinked in phosphate-buffered saline (PBS) containing 1% formaldehyde for 10 min at room temperature. Then the Tax protein was stained intracellularly with anti-Tax-AF647 (clone LT-4; 0.4 μg/ml), using a Foxp3 staining kit (eBioscience, 00-5523-00) and viable Tax+ and Tax- cells were sorted (S1 Fig). For the Tax expression reporter clones (d2EGFP clones), the cells were stained with LIVE/DEAD Fixable Near-IR Dead Cell Stain kit, crosslinked as above, and viable GFP+ cells and GFP- cells were sorted. After sorting, q4C was performed [10].

q4C analysis was carried out as previously described [10]. Reads spanning 100 to 150 bp were inspected for quality using FastQC, and filtered for correct amplification using the NlaIII restriction sequence and preceding 4 bases (total 8 bases) as an identifying barcode for the q4C viewpoint using Cutadapt. Reads where the first NlaIII site was incompletely digested (therefore containing an additional NlaIII fragment) were further trimmed by Cutadapt. Finally, Trimgalore was used to trim low quality reads (<20) and remove the Ilumina adaptor sequence, keeping reads of minimum 30 b in length. Trimmed reads were aligned to a combined reference of human (hg19) and viral (AB513134) genomes using Bowtie2. Ligation sites were quantified from aligned reads using Perl and R scripts. Ligation sites were quantified in windows of 10 kb across the chromosomes in order of base coordinates. Peaks were called as described previously [10] using a three-state hidden Markov model, requiring agreement between both samples (biological replicates).

Real-time 3C qPCR was described elsewhere [17]. Primer pairs and probe (S1 Table) were used to detect long-range chromatin contacts between the provirus and host genome region at Peak 1 (34.5 kb downstream of the provirus) and Peak 2 (68 kb downstream of the provirus) of chromosome 19. As control, we used a primer pair and Taqman probe to detect the contacts of two regions in the provirus, and a 101 bp sequence from the provirus was amplified to normalize the qPCR using internal control primer sets and probe (S1 Table). Data analysis was done with LinRegPCR software (version 2014.5). Six replicates were performed on each library, using a QuantStudio 7 Flex Real-Time PCR System utilizing TaqMan Gene Expression Master Mix (Applied Biosystems).

RNA-seq and RT-qPCR

d2EGFP clones were stained with LIVE/DEAD, and live cells were sorted with a BD FACSAria lll cell sorter under containment level 3 (CL3) conditions, as described elsewhere [11]. Total RNA was extracted from the sorted cells using an miRNeasy kit (Qiagen) and RNA-seq libraries were prepared using the Ribo-Zero Plus rRNA Depletion kit and TruSeq Stranded mRNA HT Sample Prep Kit and sequenced with the NovaSeq6000 (150 bp paired-end reads).

RNA-seq reads were inspected for quality using FastQC, and trimmed using Trimgalore to remove low-quality reads and Ilumina adapter sequences. Trimmed reads were aligned against a combined reference of human (hg19) and viral (AB513134) genomes using GSNAP v. 2019-06-10. Read coverage was counted using Bedtools coverage after split to exons using Bedtools bamtobed tool using the resolution denoted in each figure.

Reads were aligned to a combined reference of human (hg19) and viral (AB513134) inserted into the integration site at chr22:44323198, chr19:33829548 and chr04:70567285 to identify splice sites of fusion transcripts in TBX4B, 11.63 and 3.60, respectively.

RNA-seq of timer protein reporter clones was described elsewhere [11]. RNA-seq of TBX4B clones containing wild-type CTCF Binding site (BS) in the provirus and mutant (Mut) clone containing a mutated CTCF-BS were performed as described elsewhere [10].

For RT-qPCR, first-strand cDNA was synthesized with First Strand cDNA Synthesis Kit using random primers. Primers used are shown in S1 Table. Real-time qPCR was then performed in 6 replicates on each library, using a QuantStudio 7 Flex Real-Time PCR System utilizing Fast SYBR Green Master Mix (Applied Biosystems).

Supporting information

S1 Table. DNA sequence of primers and probes used in 3C-qPCR and RT-qPCR.

https://doi.org/10.1371/journal.ppat.1011716.s001

(TIFF)

S1 Fig. Flow sorting of Taxand Tax+ cells.

HTLV-1-infected T cell clone 11.63 was stained for live cells, crosslinked in 1% formaldehyde, stained intracellularly for Tax protein, and flow-sorted to isolate Taxand Tax+ subsets (see Methods).

https://doi.org/10.1371/journal.ppat.1011716.s002

(TIFF)

S2 Fig. Flow sorting of provirus-expressing and non-expressing cells by an independent technique produced similar q4C profiles.

(A) q4C profiles of Tax(upper panel) and Tax+ (lower panel) cells sorted from clone TBX4B after intracellular staining of Tax. (B) q4C profile of non-expressing (GFP-) (upper panel) and provirus-expressing (GFP+) (lower panel) cells isolated from d2EGFP-TBX4B clones, selected by GFP signal (without Tax staining).

https://doi.org/10.1371/journal.ppat.1011716.s003

(TIFF)

S3 Fig. Transcription of the proviral plus-strand is accompanied by a loss of contacts between the provirus and flanking host chromatin.

(A) q4C profiles of Tax(upper panel) and Tax+ (lower panel) cells from clone 11.63. (B) q4C profiles of non-expressing (GFP) (upper panel) and provirus-expressing (GFP+) cells (lower panel) from clone d2EGFP-11.50. Vertical axes show the normalized frequency of chromatin contacts between the provirus and the host genome.

https://doi.org/10.1371/journal.ppat.1011716.s004

(TIFF)

S4 Fig. Quantitative 3C (3C-qPCR) analysis confirmed decreased frequencies of chromatin looping.

(A) q4C profile of Taxcells of clone 11.63. The technical peak seen in the q4C viewpoint (VP) and two of the main peaks (Peak 1) and (Peak 2) identified in the output of the Taxfraction of clone 11.63. (B) As control, the frequency of chromatin interactions was quantified by 3C-qPCR on sorted Taxand Tax+ cells, using a primer pair and Taqman probe to detect the contacts between two regions: VP and another region in the provirus (S1 Table). There was no significant difference between Tax+ and Taxcells (combined p value = 0.932, Fisher’s method of combining p values). (C and D) Primer pairs and probe were used to detect long-range chromatin contacts between the provirus and host genome region at Peak 1 (C) or Peak 2 (D). Results of 3C-qPCR of two biological replicates (rep) are shown. Peak1 contact frequency was significantly higher in Taxcells than in Tax+ cells (combined p value 0.012, Fisher’s method). Peak2 contact frequency was significantly higher in Tax- cells than in Tax+ cells (combined p value 0.000607, Fisher’s method).

https://doi.org/10.1371/journal.ppat.1011716.s005

(TIFF)

S5 Fig. Fusion transcripts between HTLV-1 provirus and host genome.

(A) Identification of splice sites of fusion transcripts in the plus-strand-expressing cells of clone d2EGFP-11.63 and (B) in timer protein reporter clone Timer-3.60. Plus-strand fusion transcripts between HTLV-1 exon1 (H1) and same sense, 3′ side host gene exon (blue) or novel host exons (green) are shown with fused sequences.

https://doi.org/10.1371/journal.ppat.1011716.s006

(TIFF)

S6 Fig. Treatment with flavopiridol for 1.5 hrs did not alter Tax protein expression.

Clone 11.63 cells were treated with 1nM flavopiridol for 1.5 hrs and then stained for Live/Dead and then Tax protein.

https://doi.org/10.1371/journal.ppat.1011716.s007

(TIFF)

S7 Fig. Expression of distant host genes correlates with expression of tax.

(A) Normalized mRNA read counts of two genes that lie >1.4 Mb from the provirus in clone Timer-TBX4B, in the four successive phases of the HTLV-1 plus-strand transcriptional burst: DN–double negative (HTLV-1 silent); blue–early burst; DP double-positive (mid-burst); red–late burst. Results of two independent experiments are shown. Expression of both SMC1B and RIBC2 closely followed the trajectory of the HTLV-1 burst in clone Timer-TBX4B, but not in the unrelated HTLV-1-infected clone Timer-3.60. Data from [11]. (B) Knockout of the CTCF binding site in the provirus in clone Timer-TBX4B (middle panel) abolished the transcription of both SMC1B and RIBC2 observed in the wild-type clone (lower panel). Results of two independent experiments are shown. Neither gene was expressed in an unrelated HTLV-1-infected clone ED. The results suggest that maintenance of a CTCF-dependent chromatin loop between the host genome and the provirus is required for the burst of transcription of these distant genes associated with the HTLV-1 plus-strand burst.

https://doi.org/10.1371/journal.ppat.1011716.s008

(TIFF)

S8 Fig. RNA-seq analysis of HTLV-1 proviral expression in live-sorted cells of clone Timer-TBX4B.

Cells were sorted into four populations based on the fluorescence of the Timer protein, DN–double negative (HTLV-1 silent); blue–early burst; DP double-positive (mid-burst); red–late burst. [11].(A) Coverage tracks in IGV of plus strand HTLV-1 provirus transcription and (B) host gene PNPLA3 transcription. (C) Transcription in PNPLA3 exons 1 and 2 (note range on vertical axis 0 to 500).

https://doi.org/10.1371/journal.ppat.1011716.s009

(TIFF)

Acknowledgments

We thank Parisa Amjadi from the CL3 Cell Sorting Facility at The Centre for Immunology and Vaccinology at Imperial College London for cell sorting. We are grateful to Tomas Fitzgerald at the European Bioinformatics Institute and Aris Aristodemou for helpful discussion. We thank Laurence Game at MRC London Institute of Medical Sciences for q4C sequencing, and Oxford Genomics Centre for cDNA library preparation and RNA sequencing.

References

  1. 1. Baranello L, Kouzine F, Levens D. CTCF and cohesin cooperate to organize the 3D structure of the mammalian genome. Proc Natl Acad Sci U S A. 2014;111(3):889–90. Epub 2014/01/09. pmid:24398527; PubMed Central PMCID: PMC3903212.
  2. 2. Phillips JE, Corces VG. CTCF: master weaver of the genome. Cell. 2009;137(7):1194–211. Epub 2009/07/01. pmid:19563753; PubMed Central PMCID: PMC3040116.
  3. 3. Bangham CRM, Matsuoka M. Human T-cell leukemia virus types 1 and 2. In: Howley PM, Knipe DM, editors. Fields Virology. RNA Viruses. 3. 7th ed: Wolters Kluwer; 2023. p. 527–57.
  4. 4. Laydon DJ, Sunkara V, Boelen L, Bangham CRM, Asquith B. The relative contributions of infectious and mitotic spread to HTLV-1 persistence. PLoS Comput Biol. 2020;16(9):e1007470. Epub 2020/09/18. pmid:32941445; PubMed Central PMCID: PMC7524007.
  5. 5. Cook LB, Rowan AG, Melamed A, Taylor GP, Bangham CR. HTLV-1-infected T cells contain a single integrated provirus in natural infection. Blood. 2012;120(17):3488–90. Epub 2012/09/08. pmid:22955925; PubMed Central PMCID: PMC3482858.
  6. 6. Kulkarni A, Bangham CRM. HTLV-1: Regulating the Balance Between Proviral Latency and Reactivation. Front Microbiol. 2018;9:449. Epub 2018/04/05. pmid:29615991; PubMed Central PMCID: PMC5867303.
  7. 7. Billman MR, Rueda D, Bangham CRM. Single-cell heterogeneity and cell-cycle-related viral gene bursts in the human leukaemia virus HTLV-1. Wellcome Open Res. 2017;2:87. Epub 2017/10/25. pmid:29062917; PubMed Central PMCID: PMC5645716.
  8. 8. Miura M, Miyazato P, Satou Y, Tanaka Y, Bangham CRM. Epigenetic changes around the pX region and spontaneous HTLV-1 transcription are CTCF-independent. Wellcome Open Res. 2018;3:105. Epub 2019/01/08. pmid:30607369; PubMed Central PMCID: PMC6305241.
  9. 9. Mahgoub M, Yasunaga JI, Iwami S, Nakaoka S, Koizumi Y, Shimura K, et al. Sporadic on/off switching of HTLV-1 Tax expression is crucial to maintain the whole population of virus-induced leukemic cells. Proc Natl Acad Sci U S A. 2018;115(6):E1269–E78. Epub 2018/01/24. pmid:29358408; PubMed Central PMCID: PMC5819419.
  10. 10. Melamed A, Yaguchi H, Miura M, Witkover A, Fitzgerald TW, Birney E, et al. The human leukemia virus HTLV-1 alters the structure and transcription of host chromatin in cis. Elife. 2018;7. Epub 2018/06/26. pmid:29941091; PubMed Central PMCID: PMC6019074.
  11. 11. Kiik H, Ramanayake S, Miura M, Tanaka Y, Melamed A, Bangham CRM. Time-course of host cell transcription during the HTLV-1 transcriptional burst. PLoS Pathog. 2022;18(5):e1010387. Epub 2022/05/17. doi: pmid:35576236.
  12. 12. Ramanayake S, Moulding DA, Tanaka Y, Singh A, Bangham CRM. Dynamics and consequences of the HTLV-1 proviral plus-strand burst. PLoS Pathog. 2022;18(11):e1010774. Epub 2022/11/29. pmid:36441826; PubMed Central PMCID: PMC9731428.
  13. 13. Anania C, Lupianez DG. Order and disorder: abnormal 3D chromatin organization in human disease. Brief Funct Genomics. 2020;19(2):128–38. Epub 2020/02/07. pmid:32025693; PubMed Central PMCID: PMC7115703.
  14. 14. Boyling A, Perez-Siles G, Kennerson ML. Structural Variation at a Disease Mutation Hotspot: Strategies to Investigate Gene Regulation and the 3D Genome. Front Genet. 2022;13:842860. Epub 2022/04/12. pmid:35401663; PubMed Central PMCID: PMC8990796.
  15. 15. Spielmann M, Lupianez DG, Mundlos S. Structural variation in the 3D genome. Nat Rev Genet. 2018;19(7):453–67. Epub 2018/04/26. pmid:29692413.
  16. 16. Norton HK, Phillips-Cremins JE. Crossed wires: 3D genome misfolding in human disease. J Cell Biol. 2017;216(11):3441–52. Epub 2017/09/01. pmid:28855250; PubMed Central PMCID: PMC5674879.
  17. 17. Satou Y, Miyazato P, Ishihara K, Yaguchi H, Melamed A, Miura M, et al. The retrovirus HTLV-1 inserts an ectopic CTCF-binding site into the human genome. Proc Natl Acad Sci U S A. 2016;113(11):3054–9. Epub 2016/02/29. pmid:26929370; PubMed Central PMCID: PMC4801255.
  18. 18. Krijger PHL, Geeven G, Bianchi V, Hilvering CRE, de Laat W. 4C-seq from beginning to end: A detailed protocol for sample preparation and data analysis. Methods. 2020;170:17–32. Epub 2019/07/29. pmid:31351925.
  19. 19. Kataoka K, Nagata Y, Kitanaka A, Shiraishi Y, Shimamura T, Yasunaga J, et al. Integrated molecular analysis of adult T cell leukemia/lymphoma. Nat Genet. 2015;47(11):1304–15. Epub 2015/10/06. pmid:26437031.
  20. 20. Rosewick N, Durkin K, Artesi M, Marcais A, Hahaut V, Griebel P, et al. Cis-perturbation of cancer drivers by the HTLV-1/BLV proviruses is an early determinant of leukemogenesis. Nat Commun. 2017;8:15264. Epub 2017/05/24. pmid:28534499; PubMed Central PMCID: PMC5457497.
  21. 21. Takeda S, Maeda M, Morikawa S, Taniguchi Y, Yasunaga J, Nosaka K, et al. Genetic and epigenetic inactivation of tax gene in adult T-cell leukemia cells. Int J Cancer. 2004;109(4):559–67. Epub 2004/03/03. pmid:14991578.
  22. 22. Heinz S, Texari L, Hayes MGB, Urbanowski M, Chang MW, Givarkes N, et al. Transcription Elongation Can Affect Genome 3D Structure. Cell. 2018;174(6):1522-36.e22. Epub 2018/08/23. pmid:30146161; PubMed Central PMCID: PMC6130916.
  23. 23. Kempfer R, Pombo A. Methods for mapping 3D chromosome architecture. Nat Rev Genet. 2020;21(4):207–26. Epub 2019/12/19. pmid:31848476.
  24. 24. Davidson IF, Peters JM. Genome folding through loop extrusion by SMC complexes. Nat Rev Mol Cell Biol. 2021;22(7):445–64. Epub 2021/03/27. pmid:33767413.
  25. 25. Lohia R, Fox N, Gillis J. A global high-density chromatin interaction network reveals functional long-range and trans-chromosomal relationships. Genome Biol. 2022;23(1):238. Epub 2022/11/11. pmid:36352464; PubMed Central PMCID: PMC9647974.
  26. 26. Coffin JM, Hughes SH, Varmus HE. The Interactions of Retroviruses and their Hosts. In: Coffin JM, Hughes SH, Varmus HE, editors. Retroviruses. Cold Spring Harbor (NY)1997.
  27. 27. Liu R, Yeh YJ, Varabyou A, Collora JA, Sherrill-Mix S, Talbot CC Jr., et al. Single-cell transcriptional landscapes reveal HIV-1-driven aberrant host gene transcription as a potential therapeutic target. Sci Transl Med. 2020;12(543). Epub 2020/05/15. pmid:32404504; PubMed Central PMCID: PMC7453882.
  28. 28. Cheng X, Joseph A, Castro V, Chen-Liaw A, Skidmore Z, Ueno T, et al. Epigenomic regulation of human T-cell leukemia virus by chromatin-insulator CTCF. PLoS Pathog. 2021;17(5):e1009577. Epub 2021/05/22. pmid:34019588; PubMed Central PMCID: PMC8174705.
  29. 29. Matsuo M, Ueno T, Monde K, Sugata K, Tan BJY, Rahman A, et al. Identification and characterization of a novel enhancer in the HTLV-1 proviral genome. Nat Commun. 2022;13(1):2405. Epub 2022/05/04. pmid:35504920; PubMed Central PMCID: PMC9065021.
  30. 30. Melamed A, Fitzgerald TW, Wang Y, Ma J, Birney E, Bangham CRM. Selective clonal persistence of human retroviruses in vivo: Radial chromatin organization, integration site, and host transcription. Sci Adv. 2022;8(17):eabm6210. Epub 2022/04/30. pmid:35486737; PubMed Central PMCID: PMC9054021.
  31. 31. Gillet NA, Malani N, Melamed A, Gormley N, Carter R, Bentley D, et al. The host genomic environment of the provirus determines the abundance of HTLV-1-infected T-cell clones. Blood. 2011;117(11):3113–22. Epub 2011/01/14. pmid:21228324; PubMed Central PMCID: PMC3062313.