The HTLV-1 viral oncoproteins Tax and HBZ reprogram the cellular mRNA splicing landscape

Viral infections are known to hijack the transcription and translation of the host cell. However, the extent to which viral proteins coordinate these perturbations remains unclear. Here we used a model system, the human T-cell leukemia virus type 1 (HTLV-1), and systematically analyzed the transcriptome and interactome of its key effectors, the viral oncoproteins Tax and HBZ. We showed that Tax and HBZ target both distinct and common transcription factors. Unexpectedly, we also uncovered a large set of interactions with RNA-binding proteins, including the U2 auxiliary factor large subunit (U2AF2), a key cellular regulator of pre-mRNA splicing. We discovered that Tax and HBZ perturb the splicing landscape by altering cassette exons in opposing manners, with Tax inducing exon inclusion while HBZ induces exon exclusion. Among Tax- and HBZ-dependent splicing changes, we identified events that are also altered in adult T-cell leukemia/lymphoma (ATLL) samples from two independent patient cohorts, and in well-known cancer census genes. Our interactome mapping approach, applicable to other viral oncogenes, has identified spliceosome perturbation as a novel mechanism coordinated by Tax and HBZ to reprogram the transcriptome.


Introduction
The ability of a retrovirus to transform its host cell was originally attributed to integration of retroviral DNA into the host cell's genome. This integration allowed the discovery of cellular oncogenes and related cellular signaling pathways such as the SRC [1], EGFR [2], MYC [3], RAS [4], and PI3K pathways [5]. However, no universal model has been developed to explain oncogenic transformation as a phenotypic result of retroviral integration. The only transforming retrovirus identified in humans to date is the human T-cell leukemia virus type 1 (HTLV-1), which causes adult T-cell leukemia/lymphoma (ATLL). ATLL has a long latency period of approximately 20-60 years, which suggests the occurrence of rare and complex genomic changes during disease progression [6,7]. Proviral integration sites for HTLV-1 are enriched in cancer driver genes, resulting in altered transcription of those genes [8]. Additional genomic changes have also been observed in distant genes that play a role in various mechanisms of T-cell signaling [6,7]. However, the key initial drivers of ATLL are the viral proteins Tax and HBZ, which can independently induce leukemia in transgenic mouse models [9,10].
Studies of Tax and HBZ have been summarized in several reviews [11][12][13][14][15]. These data have been compiled into a KEGG pathway (hsa05166) which highlights the ability of Tax and HBZ to interfere with at least one component of each of the twelve signaling pathways regulating the three cancer core processes: cell fate, cell survival, and genome maintenance [16]. The effects of Tax and HBZ are mediated primarily via protein-protein interactions, and positive or negative transcriptional regulation [14,17]. Tax and HBZ often act in opposing directions in order to control the host's immune response and sustain long-term malignant transformation [18,19].
Our molecular understanding of genomic and transcriptomic deregulation following HTLV-1 infection has come from studying ATLL patient samples [6,20,21]. Because Tax and HBZ show different expression kinetics during ATLL progression [22], it has remained challenging to systematically analyze the relative contribution of each viral protein to reprogramming the host cell's transcriptome and proteome. Here, we carried out a systematic identification of the interactome networks between Tax/HBZ and cellular regulators of gene expression. We then measured the effects of Tax and HBZ on gene expression at both the transcriptional and post-transcriptional levels. Integration of these interactome and transcriptome datasets provides mechanistic insights into HTLV-1 infection-associated alternative splicing events and leukemogenesis.

A comparative interactome of Tax and HBZ with cellular host proteins
Previous studies have shown that the Tax and HBZ viral proteins control viral gene expression by competing for binding to key transcription factors of the CREB/ATF pathway and to the coactivators CBP and p300 (S1A and S1B Fig), complexes that are also specifically targeted by other viral oncoproteins, including high-risk human papillomavirus (HPV) E6 proteins [23]. In the present study, we aimed to provide an unbiased map of protein-protein interactions (PPIs) established by the Tax and HBZ viral oncoproteins with cellular gene expression regulators, including transcription factors (TFs) and RNA-binding proteins (RBPs) (S1C Fig). Our rationale is that exploring more broadly the similarities and differences of the Tax and HBZ interactomes with TFs and RBPs would provide global and specific insights into how viral oncogenes cooperate in the initiation and maintenance of cancer.
Based on Gene Ontology and literature curation, we first established a library that covers 3652 ORFs encoding 2089 transcriptional and 1827 post-transcriptional regulators, including known DNA- and RNA-binding proteins (S1D Fig). Secondly, we assembled a mini-library of different Tax  We then tested binary interacting pairs between viral products and human TFs and RBPs using our well-established binary interactome mapping strategy, which employs primary screening by yeast two-hybrid (Y2H) assays, retesting by Y2H, and validation using an orthogonal protein complementation assay [24][25][26][27]. This interactome search space, encompassing ~95,000 binary combinations (S1F Fig), allowed identification of 53 and 116 cellular partners of Tax and HBZ, respectively (Fig 1A, S1 Data). Interestingly, we observed a highly significant overlap between gene expression regulators, either TFs or RBPs, interacting with Tax and HBZ (Fig 1A, 25 shared interactors, Fisher test: p < 2.2e-16). We next used 90 interacting pairs in a validation experiment using an orthogonal assay, the Gaussia princeps protein complementation assay (GPCA) [28]. The validation rate was 83% (Fig 1B and 1C), demonstrating the high quality of our dataset (Y2H_2020), which represents an increase of 32% and 51.6% in known TFs and RBPs interacting with Tax, respectively (Fig 1D). Compared to Tax, fewer interactions were available in the literature for HBZ, and our result represents a substantial increase of 65.5% and 93.6% in TFs and RBPs interacting with HBZ, respectively (Fig 1D).
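The overlap statistic quoted above can be approximated with a hypergeometric tail probability, which is what a one-sided Fisher exact test on the corresponding 2x2 table computes. The following is a minimal sketch, not the authors' analysis: it assumes the background universe is the 3652 ORFs in the library (the text does not state which background was used), and the function name `overlap_pvalue` is illustrative.

```python
from math import comb


def overlap_pvalue(universe: int, set_a: int, set_b: int, shared: int) -> float:
    """One-sided hypergeometric tail: P(overlap >= shared) when a set of size
    set_a is drawn at random from `universe` items, set_b of which are marked."""
    total = comb(universe, set_a)
    upper = min(set_a, set_b)
    tail = sum(
        comb(set_b, k) * comb(universe - set_b, set_a - k)
        for k in range(shared, upper + 1)
    )
    return tail / total


# Assumed background: all 3652 ORFs screened (an assumption, see lead-in).
p = overlap_pvalue(universe=3652, set_a=53, set_b=116, shared=25)
expected = 53 * 116 / 3652  # overlap expected by chance alone
print(f"expected overlap = {expected:.2f}, p = {p:.3g}")
```

Under this assumed background, fewer than two shared partners would be expected by chance, so 25 shared interactors is far in the tail; a different choice of background would change the exact value.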
We expanded our catalog of systematically determined binary Tax and HBZ interactions with host proteins (Y2H_2020) with high quality interactions reported in the literature (Lit_2020) (Figs 1D and S3). The union of Tax-host and HBZ-host interactomes contains 258 and 160 interacting partners, respectively (S3 Fig and S1 Data). Of interest, TFs represent 38% and 59%, while RBPs account for 16% and 38% of Tax and HBZ host interactors, respectively (S3 Fig and S1 Data).
To classify Tax and HBZ interacting partners into functional categories, we examined their distribution across the MSigDB hallmark gene sets [29] (Fig 1E). We focused on five categories of specific gene sets (Immune, Proliferation, Signaling, Pathway and Cellular component) and obtained an overview of the functional distribution of Tax and HBZ partners. The most significantly enriched gene set signatures were "Proliferation" and "Signaling pathways", for both Tax and HBZ (Fig 1E). While HBZ appears to specifically target the IL-6-JAK-STAT3 pathway, cancer-related gene products of the PI3K-AKT-mTOR, TGF-β and IL-2-STAT5 pathways are more enriched in the Tax interactome (Fig 1E). Further highlighting the impact of Tax and HBZ in the initiation and maintenance of cancer, these viral proteins exhibit significantly more PPIs (at least 20 times) than the average "degree" (number of interactors) of known cancer gene products (Fig 1F), or other tumor virus proteins, but comparable to highly connected cancer gene products such as CREB3L1 [24], or HPV viral proteins E6 and E7 [23].

Fig 1 (partial caption). ... (B) or HBZ (C) and their interacting partners identified in Y2H. Y-axis shows normalized luciferase ratios (NLR). Bar graphs represent the mean ± SD. Positive interactions are indicated by NLRs above the dashed red line. (D) Graph showing the number of RBPs and TFs interacting with Tax and HBZ in different datasets. "Overlap" means interactions already known in the literature; "Union" means the combined list of interactions from this study and the literature. (E) As in (D), but showing the number of coding genes that are part of a GSEA hallmark category. (F) Percentages of cancer gene products interacting with Tax and HBZ. See also S1, S2 and S3 Figs and S1 Data. https://doi.org/10.1371/journal.ppat.1009919.g001

PLOS PATHOGENS
The viral proteins Tax and HBZ control host RNA metabolism
Altogether, these results provide strong evidence that Tax and HBZ perturb the host cell through both shared and distinct associations with transcriptional and post-transcriptional regulators.

A comparative analysis of transcriptomic changes associated with Tax and HBZ expression
Based on the observation that Tax and HBZ target numerous host DNA- and RNA-binding proteins, it is anticipated that a significant number of host gene expression changes, indispensable for ATLL pathogenesis, are driven by these interactions. We generated a homogeneous inducible cellular system, which consists of two Jurkat T-cell lines, Jurkat-iTax and Jurkat-iHBZ, expressing either Tax or HBZ from a doxycycline-inducible promoter (Fig 2A). We then performed high-throughput RNA sequencing (RNA-seq) of the two cell lines and analyzed differentially expressed genes (DEGs) and alternative splicing events (ASEs) associated with Tax or HBZ expression (Fig 2A). We confirmed the induction of Tax and HBZ (Figs 2B, S4A and S4B), which are associated with increased expression of their common up- or down-regulated target genes GATA3 (Fig 2B) [32,33]. Comparative analysis with control cells revealed 1453 and 1014 genes that were differentially expressed upon inducing Tax or HBZ expression, respectively (FDR-adjusted p < 0.01 and |log2 fold-change| ≥ 1, Fig 2C and 2D and S2 Data). Consistent with its well-known function as a transcriptional activator [34,35], we found that Tax expression caused up-regulation of 885 genes and down-regulation of 568 genes (Fig 2C). In contrast, expression of HBZ was associated with up-regulation of 397 genes and down-regulation of 617 genes (Fig 2D). Several studies found that HBZ can act as a transcriptional repressor [36][37][38][39][40][41], in agreement with our finding that 61% of the differentially regulated genes in Jurkat-iHBZ were down-regulated. This finding is also supported by our confocal and transmission electron microscopy observations showing that HBZ-expressing cells have rounder and more regular nuclear speckles (S4G and S4H Fig), a feature often associated with reduced Pol II-mediated transcription [42][43][44].
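The DEG thresholds quoted above (FDR-adjusted p < 0.01 and |log2 fold-change| ≥ 1) can be expressed as a simple filter. This is only an illustrative sketch: the `GeneResult` record, the `classify` function, and the toy gene values are hypothetical and are not taken from S2 Data or from the authors' pipeline. The last lines recompute the proportion of down-regulated genes reported for Jurkat-iHBZ.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    log2_fold_change: float
    fdr: float


def classify(result: GeneResult, fdr_cutoff: float = 0.01, lfc_cutoff: float = 1.0) -> str:
    """Return 'up', 'down' or 'unchanged' using the thresholds quoted in the text."""
    if result.fdr >= fdr_cutoff or abs(result.log2_fold_change) < lfc_cutoff:
        return "unchanged"
    return "up" if result.log2_fold_change > 0 else "down"


# Hypothetical values for illustration only (not from S2 Data).
toy = [
    GeneResult("GENE_A", 2.3, 0.0004),   # passes both thresholds, up
    GeneResult("GENE_B", -1.0, 0.002),   # exactly at the fold-change cutoff, down
    GeneResult("GENE_C", 3.1, 0.2),      # large change but not significant
    GeneResult("GENE_D", 0.4, 0.00001),  # significant but small change
]
calls = {g.gene: classify(g) for g in toy}
print(calls)

# Proportion of down-regulated DEGs reported for Jurkat-iHBZ: 617 of 397 + 617.
hbz_down_fraction = 617 / (397 + 617)
print(f"{hbz_down_fraction:.0%}")
```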
Gene Set Enrichment Analysis [45] identified TNF-α signaling via NF-κB as a pathway enriched in both Tax- and HBZ-expressing cells (Fig 2E and 2F). In contrast, inflammatory response was specifically up-regulated in Tax-expressing cells, whereas cell cycle G2M checkpoint genes were specifically up-regulated in HBZ-expressing cells (Fig 2E and 2F and S2 Data). Interestingly, the extent of co-regulation of genes by both viral proteins was highly significant (28% for Tax and 41% for HBZ, empirical P < 0.0001). These include 405 genes whose expression was altered in the same direction (149 up-regulated and 256 down-regulated) and only 12 genes whose expression was altered in opposite directions (S2 Data). Despite their differential expression in vivo, our analysis of transcriptomic changes confirms the notion highlighted above that the Tax and HBZ viral proteins drive a number of overlapping molecular perturbations.

Impact of Tax and HBZ expression on cellular gene alternative splicing events
Analyses of Tax and HBZ host cell perturbations have predominantly focused on transcriptional effects. However, from our interactome data, the interacting proteins annotated as "post-transcriptional regulators" represent 54% and 44% of Tax and HBZ partners involved in gene expression regulation, respectively (S3 Fig). To assess the impact of Tax and HBZ on the cellular splicing landscape, we statistically computed splicing events using the rMATS software (Fig 3A, [46]). We focused on 5 types of alternative splicing: Skipped Exons (SE) or cassette

regulated by Tax and HBZ expression (102 inclusion and 65 exclusion events in both Tax and HBZ expressing cells, Fig 3D). Taken together our analysis shows that although Tax and HBZ have globally opposite effects on the host splicing landscape, they share some common target exons that are similarly affected.
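The distinction between inclusion and exclusion events, and the identification of exons affected in the same direction by both proteins, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' rMATS workflow: the cutoffs (`fdr_cutoff`, `min_delta`), the sign convention (a positive difference in inclusion level meaning more inclusion in cells expressing the viral protein), the function names, and all exon values are hypothetical.

```python
def classify_exon(delta_psi: float, fdr: float,
                  fdr_cutoff: float = 0.05, min_delta: float = 0.1) -> str:
    """Label a cassette-exon event from its inclusion-level difference.

    Assumed convention: delta_psi > 0 means the exon is more included
    in cells expressing the viral protein than in control cells.
    """
    if fdr >= fdr_cutoff or abs(delta_psi) < min_delta:
        return "unchanged"
    return "inclusion" if delta_psi > 0 else "exclusion"


def shared_same_direction(calls_a: dict, calls_b: dict) -> dict:
    """Exons called in both conditions with the same direction of change."""
    shared = {}
    for exon, call in calls_a.items():
        if call != "unchanged" and calls_b.get(exon) == call:
            shared[exon] = call
    return shared


# Hypothetical (delta_psi, FDR) pairs, for illustration only.
tax_events = {
    "exon_1": (0.35, 0.001),
    "exon_2": (-0.22, 0.01),
    "exon_3": (0.05, 0.0001),
    "exon_4": (0.40, 0.002),
}
hbz_events = {
    "exon_1": (0.28, 0.003),
    "exon_2": (-0.30, 0.02),
    "exon_4": (-0.25, 0.004),
}
tax_calls = {exon: classify_exon(*values) for exon, values in tax_events.items()}
hbz_calls = {exon: classify_exon(*values) for exon, values in hbz_events.items()}
common = shared_same_direction(tax_calls, hbz_calls)
print(common)
```

In this toy set, `exon_4` changes in opposite directions in the two conditions and is therefore excluded from the shared list, which mirrors the situation described in the text where global effects are opposite but a subset of exons is similarly affected.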

Splicing targets of Tax and HBZ are enriched for cancer-related genes
We validated the rMATS analysis of 10 Tax- and 13 HBZ-dependent splicing events by qRT-PCR (10/10 for Tax, 12/13 for HBZ, Fig 4A), and performed Gene Ontology (GO) enrichment analysis. Splicing targets were significantly enriched for several GO terms among HBZ-regulated genes, whereas no significant enrichment was detected for Tax splicing targets. For HBZ, enriched categories were mostly related to RNA regulation, especially RNA splicing (S3 Data). We further investigated the functions of specific spliced genes in cells expressing Tax or HBZ by examining their distribution across the MSigDB hallmark gene sets built by GSEA [29] (Fig 4B). A significant overlap was also found between genes corresponding to Tax or HBZ pre-mRNA splicing targets and known cancer census genes (ncg.kcl.ac.uk and cancer.sanger.ac.uk) (P = 0.02247 for Tax and P = 0.001742 for HBZ). Pre-mRNAs of 33 and 63 cancer genes were identified as splicing targets of Tax and HBZ, respectively (Fig 4C, S3 Data). Among these, 10 were similarly deregulated by Tax and HBZ (CHCHD7, EIF4A2, NF2, POLG, PTPRC, UBR5, ABI1, BCLAF1, FLNA and NSD1). These include PTPRC, coding for CD45, a transmembrane protein tyrosine phosphatase, which is known to be alternatively spliced upon T-cell differentiation and induces a switch from naive (CD45RA) to memory T-cells [48][49][50][51].
Another interesting observation is that only a small number of genes presenting ASEs (23 and 29 for Tax and HBZ, respectively) were also differentially expressed (Fig 4D and 4E, S3 Data), as previously observed [47]. Although co-transcriptional processing is a widespread mechanism for many genes in different organisms [52][53][54], our result suggests that splicing events and transcription are not functionally coupled for the majority of genes regulated by Tax or HBZ.

ATLL-specific splicing events validated in independent cohorts partially overlap with Tax- and HBZ-driven splicing
To determine whether ASEs detected in Tax- or HBZ-expressing cells could be relevant for HTLV-1 infection and leukemogenesis, we interrogated RNA-seq data obtained from two independent cohorts. In the first cohort, referred to as "the Japanese cohort", we analyzed 35 ATLL samples, 3 samples from HTLV-1 asymptomatic carriers and 3 samples from healthy volunteers [6]. Using rMATS v3.2.1 [46], we detected 4497 ASEs (in 2343 genes) between HTLV-1 carriers and healthy volunteers, while 9715 events (in 2737 genes) were differentially affected in the ATLL patients compared to healthy volunteers (Fig 5A). Among these, cassette exons and mutually exclusive exons accounted for a large majority of ASEs (Fig 5A), as observed for cells expressing Tax or HBZ proteins (Fig 3B). Interestingly, we observed a significant overlap between ASEs detected in HTLV-1 carriers and ATLL patients (1637 ASEs in 1238 genes, SE P ≈ 0, A3'SS P = 8.026e-112, A5'SS P = 4.14e-118, MXE P = 1.40e-181, RI P = 5.01e-19), suggesting that some ASEs may initiate disease progression and persist during ATLL. Similar to our observations in Jurkat cells expressing Tax, alternative splicing of cassette exons (SE events) was skewed towards inclusion for both HTLV-1 carriers and patients with ATLL (Fig 5B). The most significant GO-term enrichment for genes presenting ASEs in ATLL patients was RNA binding (GO:0003723). However, other categories were also significantly enriched, such as cadherin binding, viral process and immune system process (S4 Data). Thirty-one and 42 ASEs observed in cells expressing Tax or HBZ, respectively, were also present in patients with ATLL. Those events occurred on pre-mRNAs of 56 genes, including the well-known cancer-related genes PTPRC, IKBKB and HRAS, and genes coding for the transcription factors ILF2, ATF2 and EYA3 (Fig 5C and 5D).
We found that a highly significant number of the same genes were affected by ASEs in both the Afro-Caribbean and the Japanese cohorts (Fig 7A, S3 Data), while some differences were also observed (compare Figs 5C, 5D, 6C and 6D). We identified genes for which ASEs were observed in both the Afro-Caribbean cohort and the Jurkat cells expressing Tax or HBZ, including cancer genes EIF4A2, IKBKB, LZTR1, STAT6 (for Tax); CARS, MDM2, IKZF1 (for HBZ); and EWSR1, MUTYH, and PTPRC (for both Tax and HBZ) (Fig 6C and 6D). Several splicing events regulating the inclusion/exclusion of one or more of these exons were detected in both cohorts, thus representing validated molecular biomarkers of ATLL. In addition, we observed similar ASEs for each of the affected exons in Tax/HBZ Jurkat cells and patient samples. Notably, several ASEs in the PTPRC pre-mRNA coding for the pan-leukocyte marker CD45 detected in Tax- and HBZ-expressing cells were also identified in patients with ATLL. These ASE changes affected PTPRC exons 4, 5, 6 and 7. More specifically, we observed a trend for exons 5, 6 and 7 to be excluded, while exon 4 was more included (Figs 7B-7E and S6A-S6D), suggesting that, compared to classical activated and memory T-cells [50], ATLL cells exhibit a specific alternative splicing pattern on PTPRC pre-mRNA. Those exons are important for the production of diverse CD45 isoforms and therefore, the inclusion of exon 4 may represent a major mechanism used by Tax and HBZ to control T-cell activation. Other examples include ATF2 and EYA3, which present premature stop codons on spliced exons; and MSM01, which has a Nuclear Localization Signal (NLS) affected by splicing events (S5 Fig). Altogether, our results demonstrate a global impact of Tax and HBZ expression on alternative splicing. Strikingly, Tax and HBZ targets show different splicing patterns in HTLV-1-infected individuals and ATLL patients, further emphasizing their global opposing effects on the deregulation of host genes.

Identification of RNA splicing-specific roles for Tax and HBZ proteins
Tax and HBZ interact with 35 and 47 proteins, respectively, involved in RNA catabolic processes (Fig 8A, yellow), RNA export (Fig 8A, light red), RNA processing (Fig 8A, blue) and/or RNA translation (Fig 8A, green). RNA processing factors interacting with Tax and HBZ are categorized into pre-mRNA processing and splicing factors (Fig 8B). To further explore the physiological roles of Tax- and HBZ-dependent effects on host mRNA splicing, we first performed motif enrichment analysis of alternative splicing events detected in Japanese patients with ATLL. Regulated exons and part of their flanking introns were screened for RNA-binding motifs using the MEME suite [57]. We found significant enrichment for 31 RNA-binding motifs, including those of the complementary factor for APOBEC-1 (A1CF), an RNA-binding protein regulating metabolic enzymes via alternative splicing [58], identified here as an HBZ partner, and the U2 small nuclear ribonucleoprotein particle (snRNP) auxiliary factor (U2AF) large subunit U2AF65 (also called U2AF2), identified here as a Tax partner (Figs 8C and S7A and S1 Data).
The U2AF complex is a well-established essential component of the spliceosome assembly pathway [59][60][61]. U2AF65 forms a heterodimer with U2AF35 that recognizes the 3' splice sites (3'SS) of introns [60]. U2AF65 binds to the polypyrimidine tract (PY tract) of the intron and induces the recruitment of the U2 snRNP [60,61]. We performed co-immunoprecipitation assays and confirmed an interaction between Tax and endogenous U2AF65 (Fig 8D). This interaction was dependent on the presence of RNA (Fig 8E). Using the GPCA assay [28], we confirmed direct Tax and U2AF65 interaction, and the absence of binding between Tax and U2AF35 (Fig 8F). However, Tax expression increased the formation of the U2AF35-U2AF65 heterodimer (Fig 8G), suggesting that the interaction interfaces of Tax/U2AF65 and U2AF35/U2AF65 may be different, and that Tax does not disrupt, but rather stabilizes, the U2AF complex. As a control for specificity, we did not detect any interaction between the U2AF subunits and HBZ (S7B Fig).
Previous studies have shown that a more stable U2AF35/U2AF65 complex could favor the recognition of less conserved, sub-optimal PY tract sequences containing a lower proportion of thymidine (T) [59,62]. To determine whether there is a global effect of the Tax/U2AF65/U2AF35 interaction on alternative splicing events, we inspected the 30 bp upstream of all 3'SS of cassette exon events, whether altered or not upon Tax expression. We found a significant reduction of T content in the vicinity of the 3'SS of more included exons, indicating that exons with less conserved PY tracts are more included following Tax expression (Fig 8H). To further evaluate the interplay between Tax and U2AF65, we used shRNA to generate a Jurkat T-cell line with reduced U2AF65 expression (Fig 8I). We analyzed 6 SE events detected in Jurkat-iTax cells and, as shown in Fig 8J-8O, knockdown of U2AF65 affected splicing of all exons and induced higher inclusion levels for 5 out of 6 SE events.
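The measurement described above, the T content of the 30 bp immediately upstream of a 3' splice site, can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the 0-based coordinate convention, and the two toy sequences (one T-rich, one T-poor) are hypothetical.

```python
def upstream_window(sequence: str, exon_start: int, window: int = 30) -> str:
    """Return the `window` nucleotides immediately upstream of the exon.

    `exon_start` is the 0-based index of the first exonic nucleotide,
    so the window ends at the 3' splice site.
    """
    if exon_start < window:
        raise ValueError("not enough upstream sequence for the requested window")
    return sequence[exon_start - window:exon_start]


def t_fraction(seq: str) -> float:
    """Fraction of T (or U) nucleotides in a sequence."""
    if not seq:
        raise ValueError("empty sequence")
    normalized = seq.upper().replace("U", "T")
    return normalized.count("T") / len(normalized)


# Toy sequences (hypothetical): 4 nt of padding, a 30 nt intronic window, then exon.
strong_tract = "T" * 20 + "CCCTTTCCAG"  # T-rich polypyrimidine tract, 30 nt
weak_tract = "ACGT" * 7 + "AG"          # T-poor window, 30 nt
strong_seq = "GGGG" + strong_tract + "ATGGCC"
weak_seq = "GGGG" + weak_tract + "ATGGCC"

strong_t = t_fraction(upstream_window(strong_seq, exon_start=34))
weak_t = t_fraction(upstream_window(weak_seq, exon_start=34))
print(f"T-rich window: {strong_t:.2f}, T-poor window: {weak_t:.2f}")
```

Comparing the distribution of this fraction between more included exons and unaffected exons, with an appropriate statistical test, would correspond to the kind of comparison summarized in Fig 8H.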

In conclusion, although other RNA processing factors interacting with Tax (Fig 8A and 8B) are likely to influence the U2AF heterodimer and therefore splice site recognition, our results demonstrate that a large part of Tax-driven SE events are U2AF65-dependent, suggesting that the stabilization of the U2AF complex by Tax potentially drives the initial steps of transcriptome diversification in HTLV-1 leukemogenesis.

Discussion
Splicing events, producing multiple mRNA and protein isoforms, participate in proteome diversity and contribute to phenotypic differences among cells [63]. Splicing programs are often altered in cancer cells, and systematic quantification of splicing events in tumors has led to the identification of cancer-specific transcripts that are translated into divergent protein isoforms participating in oncogenic processes [63]. In the context of infectious diseases, it is well known that viruses exploit the host splicing machinery to compensate for their small genomes and expand the viral proteome [64]. As a consequence, interactions with host RNA splicing factors have been reported for a number of viruses, including human immunodeficiency virus [65], influenza A virus [66], herpes simplex virus [67], Epstein-Barr virus [68], reovirus [69], human papillomaviruses [70,71], human adenovirus [72], and human parvovirus [70]. However, there is very limited understanding of the direct or indirect effects of viral products in regulating host RNA splicing.
Tax and HBZ, two HTLV-1-encoded proteins that are major drivers of ATLL, have been known for many years to hijack host gene expression programs. However, to date, they have been considered exclusively as transcriptional regulators, acting at the level of mRNA synthesis. In HTLV-1-positive cells, the HBZ antisense transcript is consistently expressed, while Tax-dependent sense transcription occurs in a burst-like manner, allowing HTLV-1 transmission to naïve T-cells [22]. We thus performed high-throughput binary interactome mapping to identify novel interacting partners of Tax and HBZ and generated a homogeneous expression model based on independent induction of Tax and HBZ expression in a Jurkat T-cell line. Jurkat is an established T-cell acute lymphoblastic leukemia cell line [73], which, like any in vitro culture system, carries genomic and phenotypic differences when compared to primary T cells isolated from healthy donors or HTLV-1-infected peripheral blood mononuclear cells. Although our Jurkat experimental system may be limited in capturing the in vivo physiological effects of Tax and HBZ in primary cells, it does allow for a systematic comparative analysis for identifying Tax- and HBZ-dependent events contributing to the dysregulation of cellular functions.
First, we systematically identified, for the Tax and HBZ viral proteins, shared and distinct human interacting partners implicated in gene expression regulation. Although we have not interrogated post-translational modification-dependent interactions, this map is the first to be reported for HTLV-1 and constitutes a valuable resource for in-depth analysis of Tax and HBZ molecular functions. Our data suggest that both viral proteins interfere almost equally with all steps of mRNA life, including splicing processes, to reprogram the host cell transcriptome. Second, we used a Jurkat T-cell line model to identify potential Tax and HBZ splicing targets, which were validated in primary cells isolated from two independent cohorts of HTLV-1 asymptomatic carriers and ATLL patients [6,8,55,56]. These two cohorts are unbalanced in the number of control samples (purified CD4+ T-cells) from healthy individuals compared to HTLV-1-positive samples, and their genetic and epidemiological heterogeneity could affect our results. However, analyzing these two cohorts separately allowed us to draw three conclusions: (i) in both cohorts, skipped exons (SE) and mutually exclusive exons (MXE) accounted for a large majority of ASEs (compare Figs 5A and 6A), (ii) a number of ASEs from both cohorts were also identified in Jurkat cells expressing either Tax or HBZ genes (compare Figs 5C and 6C) and (iii) we have identified common ASEs in both patient cohorts that partially overlap with Tax- or HBZ-driven splicing events (Fig 7A). Interestingly, Tax- and HBZ-dependent splicing events affected 33 and 63 genes, respectively, that are included in the Catalogue of Somatic Mutations in Cancer (COSMIC) as part of the cancer gene census [74]. One particularly interesting example is the PTPRC gene encoding CD45, a critical regulator of immune cell development [48]. The relative inclusion of exons 4 to 7 of PTPRC was modified in Tax- and HBZ-expressing cells, as well as in ATLL patients' samples (Fig 7). These alternative splicing events on the pre-mRNA of the PTPRC gene have been shown to lead to the expression of different CD45 isoforms [75][76][77] dictating immune cell development [48]. However, the mechanisms regulating CD45 isoform expression are not well understood. Protein kinase C (PKC) has been shown to induce PTPRC exon exclusion in a T-cell model [78], which correlates with previously shown activating mutations in PKC genes [6] and exclusion of exons 5, 6 and 7 in ATLL samples reported here (Fig 7). In order to propose an ATLL-specific CD45 isoform as a potential diagnostic tool, future studies will be needed to address (i) mRNA and protein isoform expression at the single-cell level during disease progression, (ii) identification of specific extracellular ligands and possibly (iii) segregation of the functional interplay between Tax and HBZ in regulating PTPRC splicing.
Other examples from this study include well-described tumor-promoting genes such as SRSF2, DNMT3A, ATM, BRCA1 and PKM [63], for which Tax- and HBZ-dependent alternative splicing events are now associated with ATLL for the first time. Information about specific splicing isoforms of these genes would be beneficial to our understanding of ATLL biology, including perturbation of metabolic pathways. For instance, PKM pre-mRNA exhibits MXE changes in HTLV-1 carriers (S3 Data). The encoded enzyme, pyruvate kinase PKM, is essential in glycolytic ATP production [79]. Furthermore, we found that pre-mRNAs whose splicing is modified following HBZ expression are significantly enriched in metabolic processes (S3 Data), and showed that HBZ interacts with A1CF, an RNA-binding protein that regulates metabolic enzymes via alternative splicing [58], and for which we observed binding motif enrichment in ATLL samples (Fig 8C). We propose that HBZ, via interactions with RNA-processing factors, controls metabolic pathways leading to the maintenance of infected cells in a host microenvironment with low glucose concentration and high hypoxia, as recently described [80]. It will be interesting to explore whether inhibitors/activators of HBZ interactions with specific splicing factors targeting metabolic pathways may reveal novel therapies for ATLL. Lastly, this study identified genes such as IKZF1-5, CLK1, FRS2, HNRNPD, HSPH1, ITGAE, MKI67, RTKN2 and SAT1 that could be used as potential biomarkers for ATLL because they are shared between our study using Jurkat cells and a dataset from untransformed infected CD4+ T-cell clones and ATLL samples [20,81].

Concluding remarks
Just as the identification of major transcription factors interacting with Tax and HBZ (CREB/ATF, AP-1, β-catenin, Smad and NF-κB) revolutionized our understanding of the mechanisms underlying HTLV-1 pathogenesis, our finding that both viral proteins are also able to manipulate post-transcriptional events provides an excellent opportunity for an in-depth analysis of less-explored gene expression events (RNA catabolism, export, splicing and translation).
Of broad interest, our work is a contribution toward the cancer genome atlas exploration that has already revealed alternative splicing events across 32 cancer types and highlighted the U2AF complex as a key player in cancer transcriptome diversification [82,83]. As shown here, the retroviral proteins Tax and HBZ are excellent tools to analyze transcriptional dynamics in cancer cells, beyond the evaluation of mutated genes. We advance the hypothesis that the viral protein Tax could reprogram initial steps of the T-cell transcriptome by hijacking the U2AF complex function. Through interaction with Tax, there is also a possibility of recruitment of the U2AF complex to regulate splicing of HTLV-1 pre-mRNAs. In fact, the U2AF complex targets the 3' splice site of ~88% of protein-coding transcripts [82], making it a perfect target for a transforming oncovirus.

Contact for reagents and resources
Further information and requests for resources and reagents (Table 1) should be directed to and will be fulfilled by the Lead Contact, Dr. Jean-Claude Twizere (email: jean-claude.twizere@uliege.be).
Plasmids and cell lines generated in this study are available upon request and approval of the Material Transfer Agreement (MTA) by the University of Liege.

Bacterial strains
Chemically competent DH5α E. coli cells were used for all bacterial transformation in this study. Post transformation, cells were cultured in LB (25 g/l) supplemented with antibiotics (50 μg/ml of ampicillin, spectinomycin or kanamycin) and incubated at 37˚C for 24 hours.

Yeast two-hybrid assay
Details of the Y2H screening method are described elsewhere [26,27]. Y2H assays were performed with 19 distinct fragments of the Tax protein and 9 fragments of the HBZ protein cloned in the CEN plasmids pDEST-AD-CYH2 (AD vector) and pDEST-DB (DB vector). AD and DB vectors were transformed into MATa Y8800 and MATα Y8930 Saccharomyces cerevisiae strains, respectively [27]. A total of 3652 human ORFs, corresponding to human transcription factors and RNA-binding proteins, were obtained from the human ORFeome v7.1 collection. For Y2H screening, we pooled MATa Y8800 yeast strains containing human and viral AD-ORFs and mated them against MATα Y8930 yeast clones containing individual human or viral DB-ORFs. After selection of mated yeasts on medium lacking leucine, tryptophan and histidine, the identity of the interacting protein pairs was determined by sequencing the corresponding ORF clone. A counter-selection on medium containing cycloheximide was performed simultaneously in the screens to identify false positives [27,88]. All interacting pairs identified were then retested in pairwise screens under the same conditions. Interacting pairs confirmed in this second screen were considered positive. Interaction networks were constructed with Cytoscape v3.5.1 [89].
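The scoring logic described above can be sketched as simple set operations. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function name, the representation of pairs as tuples, and the example identifiers are all hypothetical, and growth on cycloheximide medium is assumed to flag a pair as a false positive.

```python
# Hypothetical sketch of Y2H hit scoring; names and example pairs are illustrative.

def score_y2h(primary_hits, cyh_growers, retest_hits):
    """Return the (DB, AD) pairs considered positive.

    primary_hits: pairs growing on medium lacking Leu, Trp and His
    cyh_growers:  pairs also growing on cycloheximide medium
                  (assumed here to mark false positives)
    retest_hits:  pairs that scored again in the pairwise retest
    """
    candidates = set(primary_hits) - set(cyh_growers)
    return candidates & set(retest_hits)


primary = [("Tax_frag1", "ORF_A"), ("Tax_frag1", "ORF_B"), ("HBZ_frag2", "ORF_C")]
cyh = [("Tax_frag1", "ORF_B")]          # flagged by counter-selection
retest = [("Tax_frag1", "ORF_A"), ("Tax_frag1", "ORF_B")]
positives = score_y2h(primary, cyh, retest)
```

In this toy example only one pair survives both filters: one candidate is removed by the counter-selection and another fails the pairwise retest.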

Generation of Jurkat cell lines with inducible Tax or HBZ expression
The viral Tax gene and the FLAG-tagged HBZ gene were cloned into the pLenti-CMVtight-Blast-DEST (w762-1) vector from Addgene (#26434) with the Gateway cloning system. Production of lentiviral vectors (2nd generation) and generation of the Jurkat Tet3G iTax and Jurkat Tet3G iHBZ cell lines were performed by the GIGA Viral Vectors Platform at the University of Liège with the 3rd generation Tet-On inducible gene expression system (Clontech-Takara). The plasmids pLVX-Tet3G, pLenti6 Tight Tax and pLenti6 Tight Flag-HBZ were transfected separately into the Lenti-X 293T cell line (Clontech-Takara), each with a lentiviral packaging mix. This mix contained a packaging construct carrying the gag, pol and rev genes (psPAX2) and an Env plasmid expressing the vesicular stomatitis virus envelope glycoprotein G (VSV-G). The lentiviral supernatants were harvested, concentrated and titrated by qRT-PCR (Lentiviral Titration Kit LV900 from AbmGood) before being used for transduction. Jurkat cells (5 x 10^5 cells/ml) were co-transduced with the lentiviral vector pLVX-Tet3G at a multiplicity of infection (MOI) of 30 and with pLenti6 Tight-Tax or -Flag-HBZ at an MOI of 16. For this transduction step, protamine sulfate (MP Biomedicals) was used at 8 μg/ml according to the manufacturer's instructions. Subsequently, the cells were centrifuged at 800 x g for 30 minutes at 37°C. The pellet was resuspended in RPMI-1640 containing 10% FBS. After 72h, cells were cultivated in cell culture medium containing blasticidin (10 μg/ml) and hygromycin B (400 μg/ml) to select transduced cells expressing the gene of interest (Tax or Flag-HBZ) and the rtTA3 gene, until the non-transduced cells (negative control) had died, as determined by Trypan Blue staining.

Cell culture, transfections and treatments
Induction of Tax or Flag-HBZ in Jurkat-iTax or -iHBZ cells was performed by treating cells with doxycycline hydrochloride (1 μg/ml, Fisher Scientific) for 48h. HEK293T and HeLa cells were transfected using polyethylenimine (PEI25K, Polysciences; 1 mg/ml stock) at a 2:1 ratio relative to the plasmid concentration. HEK293T and HeLa cell lines were maintained in DMEM (Gibco) supplemented with 10% FBS, while Jurkat cell lines were maintained in RPMI (Gibco) also supplemented with 10% FBS. Cell lines were regularly checked for mycoplasma contamination.

RNA extraction, RT-PCR and RT-qPCR
Total RNA was extracted from cell pellets according to the manufacturer's protocol (NucleoSpin RNA kit from Macherey-Nagel), and cDNA was obtained by reverse transcription with random primers using the RevertAid RT Reverse Transcription Kit from Thermo Fisher. One or half a microgram of total RNA was used to make cDNA, which was diluted 100-fold before PCR amplification with specific primers using Taq polymerase (Thermo Fisher). PCR products were migrated by SDS-PAGE and revealed with GelStar (Lonza) under UV light. Bands were quantified using ImageJ software [84]. ASEs detected in Jurkat cells were validated by quantifying the bands obtained after PCR amplification of regulated exons. The primers target the flanking exons and allow detection of inclusion or exclusion of the alternatively spliced exon. PSI (Percent Spliced In) was calculated from the quantified intensities of the inclusion and exclusion PCR bands. Differential PSI (ΔPSI) was calculated as the difference between the PSI value of the Tax-expressing condition (+Dox) and that of the CTRL condition (-Dox). Quantitative PCR (qPCR) reactions were performed with iTaq Universal SYBR Green Supermix (BioRad) in triplicate on a LightCycler 480 instrument (Roche). The ΔΔCt method was used to analyze relative target mRNA levels with GAPDH as an internal control. All primers used in the study are presented in S4 Data.
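The two quantifications described above can be sketched as follows. This is an illustration under assumptions rather than the authors' scripts: PSI is taken as inclusion / (inclusion + exclusion), ΔPSI is taken as the induced (+Dox) value minus the control (-Dox) value, the ΔΔCt fold change assumes roughly 100% amplification efficiency, and all function names and numeric values are hypothetical.

```python
# Minimal sketch; definitions, sign convention and example values are assumptions.

def psi(inclusion, exclusion):
    """Percent Spliced In from the intensities of the two PCR bands."""
    total = inclusion + exclusion
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return inclusion / total


def delta_psi(psi_plus_dox, psi_minus_dox):
    """Assumed convention: induced (+Dox) minus control (-Dox)."""
    return psi_plus_dox - psi_minus_dox


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_ctrl, ct_ref_ctrl):
    """Fold change by the 2^(-ddCt) method, with a reference gene such as GAPDH."""
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(d_ct_treated - d_ct_ctrl)


# Hypothetical band intensities (arbitrary ImageJ units) and Ct values.
dpsi = delta_psi(psi(600, 400), psi(300, 700))
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
```

With these made-up numbers the exon would be more included after induction, and the target transcript would be four-fold more abundant than in the control.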

Immunofluorescence and confocal microscopy
HeLa cells transfected with GFP-HBZ or an empty plasmid were grown on glass coverslips for 24h. Cells were washed in PBS and fixed with 4% paraformaldehyde (PFA) for 15 min. After washing with PBS, cells were permeabilized in PBS with 0.1% Triton X-100 for 5 min. Cells were then incubated in blocking solution (PBS with 4% BSA) for 1h before incubation with anti-SC35 primary antibody (Abcam) overnight at 4°C. Samples were then incubated with Alexa568-conjugated secondary antibodies (Thermo Fisher) and further incubated for 10 min with DAPI (Thermo Scientific, in PBS) before washing and mounting with Mowiol. All images were acquired with a Nikon A1 confocal microscope and processed with ImageJ.

Transmission electron microscopy
HEK293 cells inducible for HBZ expression (HEK293 iHBZ) were generated similarly to the Jurkat-iTax and -iHBZ cell lines, except that the plasmid pLenti CMV rtTA3 was used instead of the plasmid pLVX-Tet3G. HBZ expression was induced, or not, in HEK293 iHBZ cells for 48 hours with doxycycline before cells were fixed for 90 minutes at 4°C with 2.5% glutaraldehyde in a Sörensen 0.1 M phosphate buffer (pH 7.4) and post-fixed for 30 min with 2% osmium tetroxide. After dehydration in graded ethanol, samples were embedded in Epon. Ultrathin sections obtained with a Reichert Ultracut S ultra-microtome were contrasted with uranyl acetate and lead citrate. Observations were made with a Jeol JEM-1400 transmission electron microscope at 80 kV.

Cloning and plasmids
Viral clones were amplified by PCR using specific primers flanked at the 5' end with attB1.1 and attB2.1 Gateway sequences and were inserted into pDONR223 by Gateway cloning (Invitrogen). Human ORFs (encoding U2AF65, SNRPA, eIF4A3) were retrieved from the human ORFeome v7.1 or the human ORFeome v8.1 (http://horfdb.dfci.harvard.edu). Inserts in pDONR223 were then transferred into appropriate destination vectors by LR cloning. A list of Tax and HBZ clones is available in S4 Data.

Cell lysates, immunoprecipitation, immunoblotting and antibodies
For immunoprecipitations (IPs), HEK293T cells were harvested and lysed in IPLS (Immunoprecipitation Low Salt: 50 mM Tris-HCl pH 7.5, 0.5 mM EDTA pH 8, 0.5% NP-40, 10% glycerol, 120 mM NaCl, complete Protease Inhibitor (Roche)). RNase A treatment (Thermo Fisher Scientific, 10 μg/ml at 37°C for 30 min) was performed on cleared lysates when indicated. Supernatants were incubated with anti-FLAG M2 agarose beads (Sigma-Aldrich), and the beads were then washed with IPLS (incubation times and number of washes depended on the interactions tested). For semi-endogenous IPs, rabbit anti-Tax antibody [90] was incubated overnight with cell lysates. Afterwards, Protein A/G PLUS-Agarose beads (Santa Cruz) were incubated with lysates for 2h and beads were washed 3 times with IPLS. Beads were resuspended in 2x SDS loading buffer and boiled. Samples were then analyzed by SDS-PAGE and western blotting and revealed with an ECL detection kit (GE Healthcare Bio-Sciences) according to standard procedures.

Protein Complementation Assay (PCA)
ORFs encoding Tax, HBZ and their interacting partners were cloned in destination vectors containing the GLucN1 and GLucN2 fragments of the Gaussia princeps luciferase. HEK293T cells were seeded in 24-well or 96-well plates and transfected with 500 ng or 200 ng of the appropriate constructs (GLucN1 + GLucN2), respectively. After 24h, cells were washed with PBS and lysed using the manufacturer's lysis buffer (Renilla luciferase kit, Promega). Ten to 20 μl of lysate were then used to quantify luminescence in a Centro LB 960 luminometer (Berthold).

RNA-seq data analysis
Libraries were prepared with the Illumina TruSeq Stranded mRNA sample prep kit, and paired-end sequencing (2 x 75 bp) was performed on the Illumina NextSeq 500 system by the Genomics platform at GIGA, University of Liege. Three replicates were made for each condition analysed (Jurkat iTax +Dox, Jurkat iTax -Dox, Jurkat iHBZ +Dox and Jurkat iHBZ -Dox). Sequence reads were aligned to the human genome hg19 (UCSC) using STAR [85]. Differential expression analysis was performed with DESeq2 [86] on read counts from STAR quantMode. Genes were considered significantly up- or down-regulated if their base 2 logarithm fold change was >1 or <-1 and their adjusted p-value was <0.01. Analysis of Alternative Splicing Events (ASEs) was performed with the rMATS software (v3.2.1 for the Japanese cohort, iTax and iHBZ, and v4.1.1 for the Afro-Caribbean cohort), using reads mapped with STAR that spanned the exon junctions of ASEs [46]. rMATS reports differential ASEs by calculating the difference in PSI between two conditions, namely ΔPSI, ranging from -1 to +1. For instance, SE events where exons were more excluded had a ΔPSI <0, while exons that were more included had a ΔPSI >0. Genes with low expression levels were removed from the output by filtering out genes with a TPM <1. TPM was calculated using Salmon (v0.9.1 for the Japanese cohort, iTax and iHBZ, and v1.4.0 for the Afro-Caribbean cohort). ASEs were further filtered to retain only events with a ΔPSI >0.1 or <-0.1 and an FDR <0.05. Sashimi plots were generated using rmats2sashimiplot.
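The thresholds stated above can be expressed as two small predicates. The sketch below only illustrates the cut-offs on plain dictionaries. The field names (`log2fc`, `padj`, `dpsi`, `fdr`, `tpm`) are hypothetical and do not correspond to actual DESeq2, rMATS or Salmon column names, and the example records are invented.

```python
# Sketch of the filtering thresholds; field names and records are hypothetical.

def is_de_gene(rec, lfc=1.0, alpha=0.01):
    """Differentially expressed: |log2 fold change| > 1 and adjusted p < 0.01."""
    return abs(rec["log2fc"]) > lfc and rec["padj"] < alpha


def classify_ase(rec, min_dpsi=0.1, max_fdr=0.05, min_tpm=1.0):
    """Return 'inclusion', 'exclusion', or None if the event is filtered out."""
    if rec["tpm"] < min_tpm or rec["fdr"] >= max_fdr:
        return None  # low-expression gene or non-significant event
    if rec["dpsi"] > min_dpsi:
        return "inclusion"
    if rec["dpsi"] < -min_dpsi:
        return "exclusion"
    return None  # change too small to be retained


events = [
    {"id": "ev1", "dpsi": 0.25, "fdr": 0.001, "tpm": 12.0},
    {"id": "ev2", "dpsi": -0.30, "fdr": 0.010, "tpm": 5.0},
    {"id": "ev3", "dpsi": 0.40, "fdr": 0.200, "tpm": 8.0},   # fails FDR
    {"id": "ev4", "dpsi": 0.50, "fdr": 0.001, "tpm": 0.4},   # fails TPM
    {"id": "ev5", "dpsi": 0.05, "fdr": 0.001, "tpm": 9.0},   # fails delta-PSI
]
calls = {e["id"]: classify_ase(e) for e in events}
```

Returning a label rather than a boolean keeps the direction of the change, which is what distinguishes exon inclusion from exon exclusion in the downstream comparison of Tax and HBZ.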
RNA-seq data from the Japanese patient cohort were previously described (Kataoka et al., 2015). For the purpose of this study, we selected: 35 ATLL patient samples (17 acute, 11

Gene enrichment analysis
Gene enrichment analysis on differentially expressed genes was performed with GSEA 3.0 (Gene Set Enrichment Analysis) in pre-ranked mode using the hallmark gene sets from the Molecular Signatures Database [29]. GO enrichment analysis was performed on alternatively spliced genes with GOrilla [87], using all detected ASEs as the background. Other enrichment analyses were performed by hypergeometric tests or by calculating empirical p-values in R.
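As an illustration of the hypergeometric tests mentioned above, the upper-tail probability can be computed directly from binomial coefficients. The study ran these tests in R; the Python sketch below is only a worked example of the same statistic, with a hypothetical function name and toy numbers.

```python
from math import comb


def hypergeom_enrichment_p(k, K, n, N):
    """P(X >= k) for X ~ Hypergeometric.

    N: size of the background (e.g. all detected genes)
    K: background genes carrying the annotation of interest
    n: size of the tested gene list
    k: tested genes carrying the annotation
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Toy example: 10 background genes, 3 annotated, 4 drawn, 2 or more annotated.
p = hypergeom_enrichment_p(k=2, K=3, n=4, N=10)
```

For this toy case the tail is (3 x 21 + 1 x 7) / 210 = 1/3, which can be checked by hand.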

RNA-Binding motif enrichment analysis
Sequences of regulated exons detected by rMATS (SE events only) and up to 200 bp of their flanking introns were retrieved using bedtools v0.9.1. Control sequences consisted of detected ASEs with low ΔPSI values and high FDR. The FASTA files generated were then interrogated for known RNA-binding motifs from the literature [91] using the AME software from the MEME Suite 5.0.2 with default settings [57]. Motifs with an adjusted p-value <0.05 were considered significant.
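The interval arithmetic behind "up to 200 bp of flanking introns" can be sketched as below. This is an assumption-laden illustration, not the commands used in the study: coordinates are taken as 0-based, half-open (BED-like) and in genomic order, the flank is assumed to be clipped at the neighbouring exon when the intron is shorter than 200 bp, and the function and argument names are hypothetical.

```python
# Sketch: windows covering a regulated exon and up to `flank` bp of each
# flanking intron. Names and the clipping rule are assumptions.

def exon_with_flanks(exon_start, exon_end,
                     upstream_exon_end, downstream_exon_start, flank=200):
    """Return (upstream_intron_window, exon, downstream_intron_window)."""
    if not (upstream_exon_end <= exon_start < exon_end <= downstream_exon_start):
        raise ValueError("inconsistent exon coordinates")
    up = (max(upstream_exon_end, exon_start - flank), exon_start)
    down = (exon_end, min(downstream_exon_start, exon_end + flank))
    return up, (exon_start, exon_end), down


# Long upstream intron (full 200 bp kept), short downstream intron (clipped to 50 bp).
windows = exon_with_flanks(1000, 1100, 500, 1150)
```

For genes on the minus strand the two intron windows would swap roles (the window at lower coordinates is then the downstream intron), which matters when motifs are analysed separately on each side of the exon.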
