Human T-cell leukemia virus type 1 infects multiple lineage hematopoietic cells in vivo

Human T-cell leukemia virus type 1 (HTLV-1) infects mainly CD4+CCR4+ effector/memory T cells in vivo. However, it remains unknown whether HTLV-1 preferentially infects these T cells or this virus converts infected precursor cells to specialized T cells. Expression of viral genes in vivo is critical to study viral replication and proliferation of infected cells. Therefore, we first analyzed viral gene expression in non-human primates naturally infected with simian T-cell leukemia virus type 1 (STLV-1), whose virological attributes closely resemble those of HTLV-1. Although the tax transcript was detected only in certain tissues, Tax expression was much higher in the bone marrow, indicating the possibility of de novo infection. Furthermore, Tax expression of non-T cells was suspected in bone marrow. These data suggest that HTLV-1 infects hematopoietic cells in the bone marrow. To explore the possibility that HTLV-1 infects hematopoietic stem cells (HSCs), we analyzed integration sites of HTLV-1 provirus in various lineages of hematopoietic cells in patients with HTLV-1 associated myelopathy/tropical spastic paraparesis (HAM/TSP) and a HTLV-1 carrier using the high-throughput sequencing method. Identical integration sites were detected in neutrophils, monocytes, B cells, CD8+ T cells and CD4+ T cells, indicating that HTLV-1 infects HSCs in vivo. We also detected Tax protein in myeloperoxidase positive neutrophils. Furthermore, dendritic cells differentiated from HTLV-1 infected monocytes caused de novo infection to T cells, indicating that infected monocytes are implicated in viral spreading in vivo. Certain integration sites were re-detected in neutrophils from HAM/TSP patients at different time points, indicating that infected HSCs persist and differentiate in vivo. This study demonstrates that HTLV-1 infects HSCs, and infected stem cells differentiate into diverse cell lineages. 
These data indicate that infection of HSCs can contribute to the persistence and spread of HTLV-1 in vivo.


Introduction
Human T-cell leukemia virus type 1 (HTLV-1) is the causal agent of adult T-cell leukemia-lymphoma (ATL) and inflammatory diseases including HTLV-1 associated myelopathy/tropical spastic paraparesis (HAM/TSP) [1][2][3][4]. HTLV-1 is a unique retrovirus in that it is transmitted only by cell-to-cell infection [5][6][7]. Free HTLV-1 virions are poorly infectious, whereas the virus is transmitted efficiently through cell-to-cell contact [8,9]. Therefore, HTLV-1 induces proliferation of infected cells to increase the chance of transmission [10][11][12]. There are two different ways to increase the number of HTLV-1-infected cells in vivo: proliferation of infected cells (mitotic division) and de novo infection [7]. Mitotic division is thought to predominate during chronic infection with this virus.
HTLV-1 is a member of the primate T-cell leukemia virus type 1 (PTLV-1) group, which contains simian T-cell leukemia virus type 1 (STLV-1) [13]. Based on phylogenetic analyses, HTLV-1 is thought to be derived from STLV-1 by interspecies transmission [14]. Old World monkeys are infected with STLV-1 while New World monkeys are not infected [15]. It was reported that the seroprevalence of STLV-1 in Japanese macaques (JMs) was high [16]. We have reported that STLV-1 induces clonal proliferation of CD4+ T cells in vivo, and development of T-cell lymphoma was observed in an STLV-1 infected JM [17]. STLV-1 encodes Tax in the plus strand and STLV-1 bZIP factor (SBZ) in the minus strand. STLV-1 Tax and SBZ possess similar functions to HTLV-1 Tax and HTLV-1 bZIP factor (HBZ). Therefore, the STLV-1 infected JM is a good model for HTLV-1 infection [17]. However, the frequency of expression of viral genes in various organs and tissues in vivo remains unknown.
The receptors for HTLV-1 are glucose transporter 1 (GLUT-1) and neuropilin 1, which are expressed on various types of cells [18]. Therefore, this virus can infect different types of cells in vitro [19][20][21][22]. However, the HTLV-1 provirus is mainly detected in CD4+ T cells, in particular CADM1+CCR4+CD45RO+ T cells, in vivo [23][24][25][26]. This suggests that HTLV-1 either modulates the immunophenotype of T cells, or preferentially infects this subpopulation. Since the cellular receptors of this virus do not absolutely define the specificity of target cells, it is possible that hematopoietic stem cells (HSCs) are also infected by HTLV-1. Although Tax expressing cells were found in the bone marrow [27], previous studies reported that HTLV-1 did not infect HSCs in ATL patients [28,29].
In this study, we analyzed expression of the tax and SBZ genes in various organs and tissues of STLV-1 infected JMs. We found that expression of SBZ was generally higher than that of tax, while tax was highly expressed in the peripheral blood and bone marrow, suggesting that the infectious replication cycle of STLV-1 occurs in the bone marrow. To explore the possibility that HTLV-1 infects HSCs, we analyzed integration sites of HTLV-1 in the different hematopoietic cells. The same integration sites of HTLV-1 proviruses were detected in neutrophils, monocytes, B cells, CD8+ T cells and CD4+ T cells in HAM/TSP patients, suggesting that HTLV-1 infects HSCs. This study uncovers a new aspect of HTLV-1 infection and spread in vivo.

Expression of viral genes in various organs and tissues
To explore in vivo viral gene expression of virus-infected cells, we analyzed the proviral loads (PVLs) and transcripts of the tax and SBZ genes in STLV-1-infected JMs. To reduce contamination of peripheral blood lymphocytes in organs and tissues, three monkeys were perfused (JM1, JM2 and JM3). PVL was presented as the percentage of infected cells in total cells. The limit of detection was 100 copies per sample as described in Materials and Methods. The PVL varied between tissues and organs although it was high in peripheral blood, lymph nodes and spleen (S1 Table), indicating that infected cells are abundant in lymphatic tissues and lymphocytes. Next, we quantified tax and SBZ transcripts per provirus of these tissues and organs. The expression levels of tax and SBZ were measured by real-time PCR with the ddCt algorithm using an STLV-1 infected cell line, Si-2, as a reference. In order to compare the expression of tax and SBZ in each sample, the absolute amount of them in Si-2 was determined by the standard curve method, and the expression values in all samples were normalized to the expression in Si-2 cells. Details of this calculation are described in Materials and Methods. In general, the level of SBZ expression was much higher than that of tax. SBZ was expressed in most tissues and organs, although the level of expression was variable (Fig 1). However, tax transcripts were detected in very limited tissues and organs. In particular, the tax transcript was highly expressed in peripheral blood and bone marrow.

Multi-lineage hematopoietic cells are infected by HTLV-1
It has been reported that Tax-expressing cells were abundant in the bone marrow of HAM/TSP patients [27]. Likewise, this study showed that higher Tax expression was found in the bone marrow cells of STLV-1-infected JMs (Fig 1). Since Tax is essential for viral replication and transmission, the presence of Tax-expressing cells suggests that de novo infection of HSCs with HTLV-1 occurs in the bone marrow. To address this question, we analyzed Tax expression in bone marrow cells of two STLV-1-infected JMs (JM4 and JM5) and an uninfected JM (JM6). Twenty-four hours after removal of CD8+ T cells from the bone marrow cells, Tax expression was measured by flow cytometry. As shown in Fig 2, both CD3+ and CD3− bone marrow cells expressed Tax. Tax positive cells were also found in CD4− or CD8− cells. These data indicate that non-T cells are infected by STLV-1. On the other hand, Tax expression was not detected in a non-infected monkey (JM6: S2 Fig). Further analyses suggested that stem cells (CD4−CD34+), myeloid cells (CD4dimCD33+ or CD4−CD33+) and B cells (CD4−CD19+) express Tax in vivo (Fig 2 and S2 Fig). These data indicate the possibility that not only T cells but also non-T cells are infected by STLV-1 in bone marrow.
In view of the similarity between STLV-1 and HTLV-1, we speculated that HTLV-1 also infects hematopoietic precursor cells in bone marrow. To test this possibility, we analyzed the genomic integration sites of HTLV-1 in peripheral blood mononuclear cells (PBMCs) and neutrophils of a HAM/TSP patient (HAM/TSP#1) using high-throughput sequencing.
Contamination with infected T cells is a serious problem when identifying the integration sites of provirus in various hematopoietic cells. Since neutrophils are abundant in the peripheral blood, the level of T-cell contamination in isolated neutrophils is low. Contamination of isolated neutrophils with lymphocytes was assessed morphologically, and the percentages of lymphocytes were 0.2-0.9% (0.92% for HAM/TSP#1, 0.20% for HAM/TSP#2, and 0.26% for HAM/TSP#3).
We observed certain integration sites in both PBMCs and neutrophils (HAM/TSP#1) (S2 Table), suggesting that HTLV-1 infects HSCs in vivo. However, since most proviruses are present in T cells in vivo, the risk of T-cell contamination cannot be excluded in this experiment. To examine further the possibility of HTLV-1 infection in HSCs, we isolated various lineages of hematopoietic cells (CD4+ T cells, CD8+ T cells, B cells, monocytes, and neutrophils) from two HAM/TSP patients (HAM/TSP#2 and #3) and an HTLV-1 carrier, and then  Table). The PVLs of HTLV-1 in each of these lineages are shown in S4 Table. To avoid cross-contamination of sequences between samples, we analyzed each sample on a separate chip using the Ion PGM machine. The observation that a given proviral integration site is present at a higher abundance in non-T cell lineages than in CD4+ or CD8+ T cells argues against the possibility of T-cell contamination. Further, the presence of such integration sites in different cell lineages suggests that HTLV-1 infects HSCs. Tables 1-3 show the 15 most abundant clones in each cell lineage from two HAM/TSP patients and an HTLV-1 carrier. We repeatedly identified identical integration sites in different cell types, suggesting that HTLV-1 infects HSCs in the bone marrow and that the infected HSCs subsequently differentiate in vivo. The purity of each cell type was not perfect, raising the question of whether certain detected integration sites were derived from contaminating T cells. However, certain integration sites that were frequently observed in neutrophils, B cells or monocytes were rarely detected in CD4+ T cells. Conversely, certain integration sites observed in high-abundance CD4+ T cell clones were not detected in cells of other lineages (Tables 1, 2 and 3). These data suggest that non-T cells and HSCs are infected with HTLV-1. All integration site data in all lineage cells are summarized in S5 Table.
Next, we analyzed the proportion of HTLV-1 infected cells that share integration sites with cells of other lineages and are therefore likely derived from infected HSCs. The percentages of infected cells sharing integration sites with other lineage cells were generally high in neutrophils, monocytes and B cells (Fig 3), suggesting that these cells were infected as precursor cells in the bone marrow. A substantial number of CD4+ T cells (16.0% for HAM/TSP#2, 16.7% for HAM/TSP#3 and 35.9% for an HTLV-1 carrier) possessed integration sites observed in other hematopoietic cells. This indicates that some HTLV-1 infected CD4+ T cells are derived from infected HSCs.
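The proportion described above can be illustrated with a minimal Python sketch. It assumes that each lineage is summarized as a mapping from an integration site identifier to an abundance value (for example, the number of distinct shear sites); the function name, the data layout and the toy values are hypothetical and are not taken from the study.

```python
def shared_site_percent(clones_by_lineage):
    """Percent of infected cells per lineage whose integration site
    is also observed in at least one other lineage.

    clones_by_lineage: {lineage: {site_id: abundance}} (hypothetical layout).
    """
    result = {}
    for lineage, clones in clones_by_lineage.items():
        # Collect every site seen in any lineage other than the current one.
        other_sites = set()
        for other, other_clones in clones_by_lineage.items():
            if other != lineage:
                other_sites.update(other_clones)
        total = sum(clones.values())
        shared = sum(n for site, n in clones.items() if site in other_sites)
        result[lineage] = 100.0 * shared / total if total else 0.0
    return result


# Toy data for illustration only (not from the paper).
toy = {
    "CD4": {"chr1:100:+": 84, "chr2:200:-": 16},
    "Neu": {"chr2:200:-": 3, "chr3:5:+": 1},
}
percent = shared_site_percent(toy)
```

Weighting by abundance rather than counting sites means that one large shared clone contributes more than many small unshared ones, which matches a statement about the percentage of infected cells rather than the percentage of clones.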

HTLV-1 infection in neutrophils and monocytes
These data indicate that HTLV-1 infects HSCs in vivo. To confirm the presence of HTLV-1 infection in neutrophils, we tried to detect Tax protein in neutrophils using immunofluorescent staining. Tax protein was detected in the neutrophils from HAM/TSP patients along with myeloperoxidase (Fig 4A), which confirmed HTLV-1 infection of neutrophils.
It has been reported that HTLV-1 infected DCs spread virus to T cells via a virological synapse [30]. When monocytes that were infected in the bone marrow differentiate in vivo, the resulting infected DCs may subsequently disseminate the virus. To check this possibility, we differentiated monocytes from HAM/TSP patients to DCs using GM-CSF and IL-4 in the presence of azidothymidine (AZT), and the differentiated DCs were then co-cultured with Jurkat cells stably transfected with a plasmid that encodes the tandem dimer Tomato (tdTomato) under the control of the Tax responsive element (JET WT35). Differentiation to DCs was confirmed by expression of CD11c and CD209, and loss of CD14 expression (S3

Discussion
It has been reported that HTLV-1 can infect various types of cells in vitro [19][20][21][22]. Furthermore, the provirus was detected in various hematopoietic cells in vivo [32]. However, it remains uncertain whether HTLV-1 infects HSCs in vivo. It was thought that HTLV-1 infects mature lymphocytes, macrophages and dendritic cells in the periphery. Indeed, previous studies reported that HSCs were not infected by HTLV-1 [28,29]. In these studies, HSCs were isolated from patients with ATL, in which most of the HTLV-1-infected cells were leukemic cells that frequently do not express Tax [3]. On the other hand, Tax was relatively highly expressed in peripheral blood of HAM/TSP patients, and Tax-expressing cells were also found in the bone marrow of HAM/TSP patients [27,33]. These observations raise the possibility that bone marrow is a reservoir of HTLV-1 [34]. It has been reported that CD4+ memory T cells specific for cytomegalovirus, tetanus toxoid, measles, mumps and rubella are enriched in the bone marrow [35]. It is possible that such memory T cells are infected by HTLV-1 and express Tax in the bone marrow. Tax is an essential protein for HTLV-1 replication, and the observed Tax expression suggests that de novo infection occurs in the bone marrow in HAM/TSP patients. Indeed, we present the first evidence that HTLV-1-infected hematopoietic cells of different lineages have the same integration sites in vivo, indicating that this virus infects HSCs.
It is critical to show that these commonly identified integration sites are not derived from contamination with HTLV-1-infected lymphocytes during separation of each cell type. It is almost impossible to completely exclude contaminating lymphocytes from isolated cells in every case, because CD4+ T cells predominate among infected cells. However, it is unlikely that all of the integration sites identified in different cell lineages were derived from contaminating cells, for the following reasons. First, approximately 90% of HTLV-1 provirus is detected in CD4+ T cells [23]. Therefore, the contaminating cells are likely to be CD4+ T cell clones with high abundance. However, the abundance of a given integration site in non-CD4+ T cells was frequently higher than that in CD4+ T cells (Tables 1-3). These data indicate that HTLV-1 infects HSCs in vivo. However, the percentage of infected cells derived from infected HSCs might be over- or underestimated. As shown in S6 Table, integration sites identified in neutrophils had frequently been found only in CD4+ T cells one year earlier, suggesting that the frequency of infected cells derived from infected HSCs is underestimated. At the same time, contamination with infected CD4+ T cells might cause overestimation of this frequency, especially when only a single cell with a given integration site was found in a non-T cell lineage.
Is HTLV-1 infection of HSCs beneficial for this virus? Viral transmission requires expression of viral antigens (Env, Gag, Pol, Tax, and Rex) to form viral particles. Therefore, cytotoxic T lymphocytes tend to attack infected cells during transmission. In this regard, HTLV-1-infected cells that differentiate from HSCs reduce the need to express viral antigens in vivo. Decreasing viral replication in this way might be a strategy of HTLV-1 to avoid immune attack by the host. After transmission via breast-feeding or sexual intercourse, it is speculated that HTLV-1 infects many T cells, as shown in cows infected with bovine leukemia virus [36]. During this stage, some infected T cells might migrate into the bone marrow. It is possible that hypoxic conditions in the bone marrow enable infected T cells to express Tax, which causes de novo infection [37]. This scenario should be analyzed in future studies.
It is intriguing that a fraction of infected CD4+ T cells appear to be derived from infected HSCs, suggesting that infected pre-T cells in the bone marrow migrate to the thymus and differentiate to CD4+ and CD8+ T cells. It has been well recognized that HTLV-1-infected cells and ATL cells possess specific surface markers including CD4, CD25, CCR4, and CADM1 [24-26,38,39]. There are two possible scenarios. First, HTLV-1 targets this specific subpopulation. Second, viral proteins modulate phenotypes of infected cells. Our finding that HTLV-1 infected HSCs can differentiate to mature CD4+ T cells in vivo suggests that viral proteins convert infected cells to cells with specific markers, which supports the second hypothesis. It has been reported that HBZ induces expression of Foxp3 while Tax suppresses its expression [40,41]. Recently, we have reported that HBZ induces expression of CCR4, T cell immunoglobulin and ITIM domain (TIGIT), and PD-1, which are expressed on ATL cells and HTLV-1-infected cells [42]. Thus, HBZ is considered to control the immunophenotype of infected cells and ATL cells during differentiation from HSCs. Analyses of integration sites at different time points demonstrated that identical integration sites were frequently detected in neutrophils and other lineage cells (Fig 5), indicating that HTLV-1 infected HSCs can persist in vivo for at least one year. It is noteworthy that approximately half of the observed integration sites in neutrophils were detected in other lineage cells one year earlier. These findings suggest that most infected HSCs persist in vivo. It has been reported that the risk of cancer is influenced by the number of stem cell divisions [43]. If HTLV-1-infected HSCs survive for a long time, persistent HTLV-1 infection in HSCs might predispose to leukemogenesis by HTLV-1.
It has been shown that HTLV-1 infects DCs, which likely transmit the virus to T cells [21,30,44]. This study reveals that at least some HTLV-1 infected monocytes are derived from infected HSCs in vivo. It is thought that DCs derived from infected monocytes efficiently transmit virus to T cells through virological synapses formed between DCs and T cells. De novo infection requires expression of Tax and other viral proteins. However, infected monocytes derived from HSCs do not need to express viral proteins until transmission occurs at the periphery. This strategy might therefore enable infected cells to evade the host immune responses in vivo.
In this study, we demonstrate that HTLV-1 infects HSCs, which then differentiate to multiple lineage hematopoietic cells in vivo. This study suggests that HTLV-1-infected HSCs form a persistent reservoir of HTLV-1 infection, which has implications for viral propagation and possibly leukemogenesis.
To obtain whole blood, bone marrow aspirates and organs from Japanese macaques (Macaca fuscata), four animals were euthanized with pentobarbital (50 mg/kg) [17]. Appropriate procedures were used to reduce potential distress, pain and discomfort. Before sampling of organs, monkeys were perfused with phosphate buffered saline (PBS) to remove contaminating PBMCs from solid organs.

Ethics statement
Blood samples from adult patients with HAM/TSP and a HTLV-1 carrier were collected after the written informed consent was obtained in accordance with the Declaration of Helsinki. These experiments were approved by the Institutional Ethics Committee of Kyoto University (approval number G311).
Six Japanese monkeys (Macaca fuscata) were used for this study. All monkeys were supplied from colonies in the Primate Research Institute. The monkeys were reared in outdoor group cages with wooden toys provided as environmental enrichment. They were fed with apple, potato and commercial monkey diet, and had access to water ad libitum. Each animal had its own health record from birth, with a yearly health checkup. Blood samples were obtained from the macaques under ketamine anesthesia with medetomidine, followed by administration of its antagonist atipamezole at the end of the procedure. At euthanasia, ketamine anesthesia was followed by injection of pentobarbital sodium at a dose of ≥25 mg/kg. The animals were then perfused with phosphate buffered saline (PBS) to remove contaminating blood cells from solid organs before necropsy for this study.  [45]. They were cultured with RPMI 1640 medium supplemented with 10% FBS, antibiotics and G418 (250 μg/mL) for selection.

Detection of cell-to-cell infection by JET WT35 cells
JET WT35 cells (5 × 10^4) were co-cultured with 1 × 10^4 cells of either the HPB-ATL-2 or the CEM cell line in 12-well plates in the presence of either DMSO, azidothymidine (AZT) (5 μM), or raltegravir (RAL) (5 μM). DCs (1.5 × 10^5) that were differentiated from monocytes of HAM/TSP patients were co-cultured with 1.5 × 10^5 JET WT35 cells. After 48 hours, the total number of tdTomato positive cells was counted. Images represent an overlay of the differential interference contrast and tdTomato channels.

Proviral load
Proviral load was measured by real-time PCR as previously described [17,23]. Briefly, the copy numbers of the pX region and the RAG1 gene in genomic DNA were quantified. HTLV-1 proviral load was calculated by the relative quantification method using TL-Om1, whose proviral load is 100%, as a reference. STLV-1 proviral load was calculated by the absolute quantification method using plasmid DNA that contains the STLV-1 sequence. We used serially diluted plasmid DNAs (the limit of detection is 100 copies) for the standard curve. Therefore, we defined proviral loads lower than 100 copies as under detection level (UD) in S1 Table. The sequences of primers for RAG1 and pX were reported before [17], and newly constructed ones were as follows: tax primer (human) 5'-GAAGACTGTTTGCCCACCACC-3' (sense) and 5'-TGAGGGTTGAGTGGAACGGA-3' (anti-sense); pX probe (human) 5'-CACCCGTCACGCTAACAGCCTGGCAA-3'. The reaction conditions were 50˚C for 2 minutes, 95˚C for 10 minutes and 45 cycles of 15 seconds at 95˚C followed by 60 seconds at 60˚C.
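As an illustration of the absolute quantification step, the following Python sketch converts pX and RAG1 copy numbers into a percentage of infected cells. It assumes one provirus copy per infected cell and two RAG1 copies per diploid cell; neither assumption is stated explicitly in the text, and the function name is hypothetical. Only the 100-copy limit of detection is taken from the description above.

```python
LIMIT_OF_DETECTION = 100  # copies per sample, as stated in the text


def proviral_load_percent(px_copies, rag1_copies):
    """Return proviral load as percent infected cells, or None if under detection.

    Assumptions (not from the source): one provirus per infected cell
    and two RAG1 copies per cell.
    """
    if rag1_copies <= 0:
        raise ValueError("RAG1 copy number must be positive")
    if px_copies < LIMIT_OF_DETECTION:
        return None  # reported as UD (under detection level)
    cell_count = rag1_copies / 2.0
    return 100.0 * px_copies / cell_count


# Hypothetical copy numbers for illustration only.
pvl = proviral_load_percent(px_copies=500, rag1_copies=20000)
```

Returning `None` for values below the limit keeps undetectable samples distinct from samples with a genuinely measured load of zero.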

Quantitative analysis of viral gene expression
Total RNA was extracted using Trizol reagent (Thermo Fisher Scientific). The tissues of JMs were treated with RNAlater (Thermo Fisher Scientific) to prevent RNA degradation. Reverse transcription was performed using random primers and SuperScript III reverse transcriptase (Thermo Fisher Scientific). The transcripts of SBZ and those of STLV-1 tax were measured by real-time PCR. GAPDH mRNA was measured as an internal control. The primers and probes for GAPDH were previously described [46]. Others were as follows: stax primers, 5'-ATCCCGTGGAGGCTCCTC-3' (sense) and 5'-CCAAATACGTAGACTGGGTATCCAT-3' (anti-sense); stax probe, 5'-ACCAACACCATGGCCCACTTCCC-3'; SBZ primers, 5'-AGAGCGCAACTCAACCGG-3' (sense) and 5'-GCAGGGAACAGGTAAACATCG-3' (anti-sense); SBZ probe, 5'-TGGATGGCGGCCTCAGGGCC-3'. The sequences of the GAPDH primers and probe for JMs were the same as those for humans. The amplification conditions were 50˚C for 2 min, 95˚C for 10 min, and 45 cycles of 95˚C for 15 sec and 60˚C for 1 min. To compare the expression level of tax with that of SBZ in each sample, we normalized the values of tax and SBZ per infected cell. Briefly, the relative expression levels of SBZ and tax were quantified by the ddCt method using Si-2, which is an STLV-1-infected cell line, as a reference sample. Next, we determined the absolute copy numbers of tax and SBZ transcripts in Si-2, and found that the ratio of tax to SBZ transcripts was 24.7:1. To normalize the expression levels of SBZ and tax in primary JM tissues, the value of tax was multiplied by 24.7 (S1 Fig).
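The normalization described above can be sketched in Python as follows. The ddCt formula (2 to the power of minus ddCt, with GAPDH as the internal control and Si-2 as the reference sample) is the standard form of the method; the function names are hypothetical, and expressing the proviral load as a fraction of infected cells before dividing is an assumption, since the text only says that values were normalized per infected cell.

```python
TAX_TO_SBZ_RATIO_IN_SI2 = 24.7  # tax:SBZ transcript ratio in Si-2, from the text


def relative_expression(ct_target, ct_gapdh, ct_target_ref, ct_gapdh_ref):
    """ddCt relative expression of a target gene versus the Si-2 reference."""
    delta_sample = ct_target - ct_gapdh       # dCt in the tissue sample
    delta_ref = ct_target_ref - ct_gapdh_ref  # dCt in Si-2
    return 2.0 ** -(delta_sample - delta_ref)


def expression_per_infected_cell(rel_expr, pvl_percent, gene):
    """Scale tax onto the SBZ scale and divide by the infected-cell fraction.

    Assumption: proviral load is supplied as percent infected cells.
    """
    if gene not in ("tax", "SBZ"):
        raise ValueError("gene must be 'tax' or 'SBZ'")
    if pvl_percent <= 0:
        raise ValueError("proviral load must be positive")
    scale = TAX_TO_SBZ_RATIO_IN_SI2 if gene == "tax" else 1.0
    return rel_expr * scale / (pvl_percent / 100.0)


# Hypothetical Ct values for illustration only.
rel_tax = relative_expression(ct_target=25.0, ct_gapdh=20.0,
                              ct_target_ref=24.0, ct_gapdh_ref=20.0)
tax_per_cell = expression_per_infected_cell(rel_tax, pvl_percent=50.0, gene="tax")
```

Multiplying tax by 24.7 places both genes on a common scale in which SBZ in Si-2 equals 1, so that tax and SBZ values from the same tissue can be compared directly.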

Differentiation of monocyte to dendritic cells in vitro
Monocytes from HAM/TSP patients were isolated using positive selection with BD IMag systems (BD Bioscience). Then, monocytes were cultured in AIMV medium (Thermo Fisher Scientific) supplemented with 5% human AB serum, IL-4 (10 ng/ml) and GM-CSF (10 ng/ml). To avoid de novo infection of HTLV-1, raltegravir (10 μM) or azidothymidine (5 μM) was added. After culture with the antiviral drugs for 5 days, cells were washed with RPMI supplemented with 10% FBS. Cells were then co-cultured with JET WT35 in RPMI 1640 medium supplemented with 10% FBS and antibiotics without G418. The JET WT35 cells are indicator Jurkat cells stably transfected with a plasmid that encodes tdTomato under the control of the Tax responsive element. After 48 hours, tdTomato expression in co-cultured cells was observed with the EVOS FL fluorescence microscope (Thermo Fisher Scientific, 20× objective lens), and images were acquired with the CCD camera and built-in software of the microscope.

Immunofluorescent staining
Cells were fixed using 2% paraformaldehyde for 15 min and attached to poly-D-lysine coated glass or FRONTIER-coated slides (Matsunami-glass) by centrifugation. Fixed cells were permeabilized with 0.2% Triton X-100 for 15 min, blocked by overnight incubation in blocking solution (10% Blocking One and 5% Normal Goat Serum in PBS, both from Nacalai Tesque, Japan) at 4˚C, and then incubated with anti-Tax antibody (clone: MI73) (1:1000) for 3 days at 4˚C [46]. After incubation, cells were gently washed five times with PBS, treated with secondary antibody (1:500 dilution, Alexa Fluor 488-conjugated goat anti-mouse IgG, Abcam) for 1 h at room temperature, and subsequently rinsed five times with PBS. After re-blocking as described above, myeloperoxidase was stained to identify neutrophils. Cells were incubated with anti-myeloperoxidase antibody (1:200 dilution, Abcam) for 1 h at room temperature, rinsed five times with PBS, incubated with secondary antibody (1:500 dilution, Goat Anti-Rabbit IgG H&L (Alexa Fluor 568), Abcam) for 1 h at room temperature, and washed five times with PBS. Then they were mounted with ProLong Gold antifade reagent with DAPI (Thermo Fisher Scientific) for staining of the nuclei. Images were observed using a confocal microscope (FV1000-D IX81, Olympus, 40× objective lens) at room temperature and obtained with FV10-ASW software. Brightness and contrast were adjusted with ImageJ software (National Institutes of Health, http://imagej.nih.gov/ij/index.html).

High throughput sequencing of provirus integration sites
Integration sites were amplified with ligation-mediated PCR, and high-throughput sequencing was performed as previously reported with some modifications using the Ion Torrent Personal Genome Machine (Ion PGM, Thermo Fisher Scientific) or MiSeq (Illumina) [17,47]. Genomic DNA was extracted with the phenol/chloroform method and sheared by sonication with a Covaris M220 instrument (Covaris). After end-repair and linker ligation, nested PCR was performed to amplify the integration sites using primers specific for viral and linker sequences. Amplicons were size-selected with E-Gel SizeSelect Agarose Gel (Thermo Fisher Scientific) to generate libraries for Ion PGM. Templates were prepared with the Ion PGM Hi-Q OT2 Kit or Ion PGM Template OT2 400 Kit, and sequencing was then performed on the Ion 318 Chip Kit v2 using the Ion PGM Hi-Q Sequencing Kit or Ion PGM Sequencing 400 Kit (Thermo Fisher Scientific). For MiSeq, additional steps were needed after nested PCR. The TruSeq DNA PCR-Free Sample Prep Kit (Illumina) was used to ligate the MiSeq-specific adaptor according to the manufacturer's manual, but without fragmentation because the samples were already fragmented. The products of nested PCR were used as input DNA in that case. High-throughput sequencing was performed according to the manufacturer's instructions.

Bioinformatics
For Illumina paired-end sequencing, the obtained reads were trimmed with a quality threshold of 20 on the Phred scale and a minimum length of 20 bp for both reads of a pair, in order to remove low quality reads and short reads, using Trim Galore! (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). The end of the viral sequence ("TTTAGTACACA" was used as a marker) and that of the linker sequence ("TCGCTCTTCCGATCT" was used as a marker) were removed from Read1 and Read2. Trimmed reads were arranged so that Read1 started from the first base of the integration site and Read2 started from the end base of the shear site, and were aligned to the human genome reference (UCSC hg38) using the Burrows-Wheeler Aligner (BWA) [48]. The reads were filtered by mapping quality, and supplementary reads and unpaired reads were excluded with the SAMtools software package [49]. Minus strand sequences were converted into their complementary sequences to count the number of clones and PCR duplicates. Sequence similarity for the longest sequence in each integration site was evaluated with the program ClustalW (version 2) in order to remove twin integration sites arising from mismapping of some duplicates [50]. For each pair of clones with a high homology score (>85), the clone with the smaller number of shear sites was removed. When the number of shear sites was the same, we compared the total read number, which includes the number of PCR products in addition to the number of shear sites. Furthermore, both clones were excluded when the pair had the same number of shear sites and the same number of reads.
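The rule for resolving twin integration sites can be written as a small decision function. The Python sketch below assumes that the ClustalW homology score for a pair of clones has already been computed elsewhere and is passed in as a number; the `Clone` record and the function name are hypothetical, and only the threshold of 85 and the tie-breaking order come from the description above.

```python
from dataclasses import dataclass

HOMOLOGY_THRESHOLD = 85  # pairs scoring above this are treated as twins


@dataclass(frozen=True)
class Clone:
    site: str         # integration site identifier (hypothetical format)
    shear_sites: int  # number of distinct shear sites
    total_reads: int  # total reads, including PCR duplicates


def resolve_twin_pair(a, b, homology_score):
    """Return the list of clones to keep from a candidate twin pair."""
    if homology_score <= HOMOLOGY_THRESHOLD:
        return [a, b]  # not similar enough to be twins: keep both
    if a.shear_sites != b.shear_sites:
        return [a if a.shear_sites > b.shear_sites else b]
    if a.total_reads != b.total_reads:
        return [a if a.total_reads > b.total_reads else b]
    return []  # same shear sites and same reads: exclude both


# Hypothetical clones for illustration only.
major = Clone("chr1:1000:+", shear_sites=12, total_reads=340)
minor = Clone("chr7:5000:+", shear_sites=2, total_reads=15)
kept = resolve_twin_pair(major, minor, homology_score=92)
```

Dropping both clones in the fully tied case is conservative: when neither count distinguishes the true site from its mismapped twin, neither is reported.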
For Ion PGM single-end sequencing, data analysis was done as previously reported with some modifications [17]. When extracting the host genomic sequences, the viral 3' LTR sequence (TAGTACACA) and the linker sequence (AGATCGGAA) were regarded as tags and removed. After the reads were mapped to the human genome reference (UCSC hg38) by BWA, they were filtered by mapping quality. The length of the host genomic sequence was calculated from the start position and CIGAR string in the SAM files to count the number of clones and PCR duplicates. Sequence similarity was assessed as described above.
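The length calculation from the start position and CIGAR string can be illustrated as follows. This Python sketch sums the CIGAR operations that consume the reference (M, D, N, = and X, as defined in the SAM specification) to obtain the aligned length and the last aligned reference position. Treating that end position as the shear-site coordinate is an interpretation of the text, and the function name is hypothetical.

```python
import re

CIGAR_PATTERN = re.compile(r"(\d+)([MIDNSHP=X])")
REFERENCE_CONSUMING_OPS = frozenset("MDN=X")  # per the SAM specification


def reference_span(start, cigar):
    """Return (start, end, length) on the reference for one alignment.

    start: 1-based leftmost mapping position (SAM POS field).
    cigar: CIGAR string (SAM CIGAR field).
    """
    length = sum(
        int(count)
        for count, op in CIGAR_PATTERN.findall(cigar)
        if op in REFERENCE_CONSUMING_OPS
    )
    return start, start + length - 1, length


# Hypothetical alignment for illustration only.
span = reference_span(1000, "5S40M2D10M1I5M")
```

Soft clips and insertions are skipped because they add bases to the read without advancing along the reference, so they do not change where the host fragment ends.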

Supporting information
S1 Fig. Quantification and normalization of tax and SBZ expression by real-time PCR. (A) Transcripts of STLV-1 tax and SBZ in various tissues were quantified by real-time PCR. In order to compare the expression levels of SBZ and tax in each sample, we calculated their relative expression values as follows. To normalize the expression values of all samples, we used cDNA of Si-2, which is an STLV-1-infected cell line, as a reference sample. First, we determined the absolute copy numbers of SBZ and tax in Si-2 cDNA by the standard curve method. To draw the standard curves to quantify SBZ or tax transcripts, we generated plasmids containing fragments of the SBZ gene or the tax gene. The standard curves drawn with SBZ- and tax-encoding plasmids were quite similar, as shown in S1 Fig A, indicating that the PCR efficiency of SBZ was similar to that of tax. (B) Using these standard curves, we found that the copy numbers of tax and SBZ in Si-2 were 35.8 copies and 1.45 copies, respectively, showing that the expression of tax was 24.7 times higher than that of SBZ in Si-2 (the ratio of tax to SBZ was 24.7:1 in Si-2). (C) The relative expression levels of SBZ and tax in all JM tissues were quantified by the ddCt method using the Si-2 cDNA as a reference sample (SBZ in Si-2 was set to 1, and tax in Si-2 was set to 24.7 for normalization). Since the percentage of infected cells varied between tissues, the expression values of SBZ and tax were divided by the proviral load of each sample to reflect the expression levels of SBZ and tax per infected cell. An example of the normalized value is shown.