The authors have declared that no competing interests exist.
Conceived and designed the experiments: A. Moiani FM. Performed the experiments: A. Moiani ER JDS. Analyzed the data: A. Moiani A. Miccio MS DP GDB. Contributed reagents/materials/analysis tools: JDS CB. Wrote the paper: A. Moiani FM.
Moloney murine leukemia virus (MLV)-derived gamma-retroviral vectors integrate preferentially near transcriptional regulatory regions in the human genome, and are associated with a significant risk of insertional gene deregulation. Self-inactivating (SIN) vectors carry a deletion of the U3 enhancer and promoter in the long terminal repeat (LTR), and show reduced genotoxicity in pre-clinical assays. We report a high-definition analysis of the integration preferences of a SIN-MLV vector compared to a wild-type-LTR MLV vector in the genome of CD34+ human hematopoietic stem/progenitor cells (HSPCs). We sequenced 13,011 unique SIN-MLV integration sites and compared them to 32,574 previously generated MLV sites in human HSPCs. The SIN-MLV vector recapitulates the integration pattern observed for MLV, with the characteristic clustering of integrations around enhancer and promoter regions associated with H3K4me3 and H3K4me1 histone modifications, specialized chromatin configurations (presence of the H2A.Z histone variant) and binding of RNA Pol II. SIN-MLV and MLV integration clusters and hot spots overlap in most cases and are generated at a comparable frequency, indicating that the reduced genotoxicity of SIN-MLV vectors in hematopoietic cells is not due to a modified integration profile.
Retroviral integration is a non-random process, whereby the viral RNA genome, reverse transcribed into double-stranded DNA and assembled in pre-integration complexes (PICs), associates with the host cell chromatin and integrates through the activity of the viral integrase
Recent clinical studies have shown that transplantation of stem cells genetically modified by retroviral vectors may cure severe genetic diseases such as immunodeficiencies
In this study, we analyzed the integration profiles of an MLV vector and a SIN-MLV vector in the genome of a clinically relevant target cell population, cord blood-derived CD34+ hematopoietic stem/progenitor cells (HSPCs), by ligation-mediated PCR (LM-PCR) and high-throughput sequencing. We show that SIN-MLV and MLV vectors have very similar integration preferences, with the typical clustering around enhancer and promoter regions associated with specific histone modifications, specialized chromatin configurations and binding of RNA Pol II. Strikingly, SIN-MLV and MLV integration clusters and hot spots overlap in most cases and are generated at a similar frequency, indicating that the U3 enhancer has no role in targeting MLV PICs to the genome, at least in hematopoietic cells.
To generate a high-definition integration profile of SIN MLV integrations in human HSPCs, we transduced umbilical cord blood-derived CD34+ cells with a previously described SIN-MLV vector carrying a GFP expression cassette under the control of the human elongation factor 1α (EFS) promoter
To identify differences in the integration preferences of MLV and SIN-MLV in HSPCs, we first analyzed the relationship between integration sites and Known Genes (UCSC definition) in the human genome: integrations were annotated as TSS-proximal when occurring within ±2.5 kb of the transcription start site (TSS) of any Known Gene, intragenic when occurring inside a Known Gene >2.5 kb from the TSS, and intergenic in all other cases. Intergenic and intragenic integrations each accounted for <40% of sites for both MLV and SIN-MLV, while TSS-proximal integrations were 22.9% and 23.8%, respectively (
Distribution of the distance of SIN-MLV (green bars) and MLV (red bars) integration sites from the TSS of targeted genes at 2,500 bp (
| Dataset | Intergenic (%) | TSS-proximal (%) | Intragenic (%) | CpG islands (%) | CNCs (%) | Total hits |
|---|---|---|---|---|---|---|
| SIN-MLV | 38.0 | 23.8 | 38.1 | 12.4 | 7.7 | 13,011 |
| MLV | 38.2 | 22.9 | 38.8 | 12.8 | 8.0 | 32,574 |
| Random controls | 59.1 | 3.0 | 37.8 | 1.2 | 5.4 | 40,000 |
Percentage of intergenic, TSS-proximal or intragenic integration sites in the SIN-MLV, MLV and random control site datasets, together with the percentage of sites located within ±1,000 bp of at least one CpG island or conserved non-coding region (CNC).
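The annotation rule described above (TSS-proximal within ±2.5 kb of a TSS, intragenic inside a gene but farther than 2.5 kb from the TSS, intergenic otherwise) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the `Gene` class, the `classify_site` and `summarize` functions, and the example coordinates are hypothetical, and the sketch assumes that the TSS is the 5′ end of the gene on its own strand and that the TSS-proximal category takes priority over the intragenic one.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass

TSS_WINDOW = 2_500  # bp on either side of the TSS, as stated in the text


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record (not the UCSC Known Genes schema)."""
    chrom: str
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate
    strand: str  # "+" or "-"

    @property
    def tss(self) -> int:
        # Assumption: the TSS is the 5' end of the gene on its own strand.
        return self.start if self.strand == "+" else self.end


def classify_site(chrom: str, pos: int, genes: list[Gene]) -> str:
    """Label one integration site relative to the supplied gene set."""
    on_chrom = [g for g in genes if g.chrom == chrom]
    # TSS-proximal is checked first, so it wins over intragenic.
    if any(abs(pos - g.tss) <= TSS_WINDOW for g in on_chrom):
        return "TSS-proximal"
    if any(g.start <= pos <= g.end for g in on_chrom):
        return "intragenic"
    return "intergenic"


def summarize(sites: list[tuple[str, int]], genes: list[Gene]) -> dict[str, float]:
    """Return the percentage of sites falling in each category."""
    counts = Counter(classify_site(chrom, pos, genes) for chrom, pos in sites)
    total = len(sites)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Illustrative coordinates only; these are not data from the study.
example_genes = [
    Gene("chr1", 10_000, 20_000, "+"),
    Gene("chr1", 50_000, 60_000, "-"),
]
example_sites = [("chr1", 9_000), ("chr1", 15_000), ("chr1", 30_000), ("chr1", 58_000)]
example_summary = summarize(example_sites, example_genes)
```

The linear scan over all genes keeps the sketch short; a dataset of tens of thousands of sites would more realistically use an interval index or a sorted list of TSS positions.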
We previously reported that MLV integrations are strongly associated with histone modifications marking transcriptionally active promoters and enhancers, with the specialized H2A.Z histone variant, and with binding sites for RNA Pol II and transcription factors in both HSCs and T cells
Taken together, these data indicate that the absence of the U3 region of the LTR of the SIN-MLV vector causes no significant change in the integration preferences of MLV in the genome of human HSPCs.
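As a rough illustration of how similar the two profiles are, the percentages reported in the table above can be expressed as fold enrichment over the random control sites. The sketch below is only arithmetic on the published percentages, not a statistical test and not an analysis performed by the authors. The `fold_over_random` helper is hypothetical, and the assignment of table rows to datasets is inferred from the total hit counts (13,011 for SIN-MLV and 32,574 for MLV).

```python
# Percentages copied from the table; row labels are inferred from total hits.
TABLE = {
    "SIN-MLV": {"TSS-proximal": 23.8, "CpG islands": 12.4, "CNCs": 7.7},
    "MLV": {"TSS-proximal": 22.9, "CpG islands": 12.8, "CNCs": 8.0},
    "Random": {"TSS-proximal": 3.0, "CpG islands": 1.2, "CNCs": 5.4},
}


def fold_over_random(table: dict, control: str = "Random") -> dict:
    """Divide each dataset's percentage by the control percentage, per feature."""
    baseline = table[control]
    return {
        name: {feature: pct / baseline[feature] for feature, pct in row.items()}
        for name, row in table.items()
        if name != control
    }


enrichment = fold_over_random(TABLE)
for dataset, row in enrichment.items():
    for feature, fold in row.items():
        print(f"{dataset:8s} {feature:13s} {fold:.1f}-fold over random")
```

With these inputs both vectors come out at roughly 8-fold enrichment near TSSs and roughly 10-fold near CpG islands, consistent with the text's conclusion that the two integration profiles are closely matched.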
MLV and SIN-MLV integrations showed the same non-random, highly clustered distribution in the human genome, with integration hot and cold spots. Integration clusters were statistically defined as described
We then examined the integration clusters of both vectors in a number of individual loci characteristically hit at high frequency by MLV integration. Most of the loci highly targeted by MLV were also targeted by SIN-MLV. Whenever the number of integrations was sufficient, we observed a striking overlap between the integration hot spots of the two vectors within the same loci. For example, SIN-MLV integrations faithfully reproduced the MLV integration patterns in the LMO2, EVI2A/B, RUNX1, RUNX2, ZNF217-BCAS1, CD34, ELF1 and NFE2 loci, which were targeted at the same overall frequency and at the same hot spots within each locus (
Distribution of MLV (red) and SIN-MLV (green) integration clusters (horizontal solid bars) and integrations (vertical marks) in the LMO2, RUNX1, EVI2A/B, and ZNF217-BCAS1 loci as displayed by the UCSC Genome Browser. The base position feature at the top (scale bar and chromosome number) identifies the genomic coordinates of the displayed region.
Retroviruses select their target integration sites by tethering their PICs to the host cell chromatin through protein-protein interactions that appear to be specific for each retrovirus type
The peculiar characteristics of MLV integration, coupled with the strong transcriptional enhancer activity of the LTR U3 region, explain the high risk of insertional gene activation and genotoxicity observed in pre-clinical
Genotoxicity of retroviral vectors has many components, including the vector design, the nature of the target cell and the genetic background of the patient, all ultimately affecting the risk of a specific gene therapy approach
Human CD34+ HSPCs were purified from umbilical cord blood, pre-stimulated for 48 hours in serum-free Iscove modified Dulbecco medium supplemented with 20% FCS, 20 ng/ml human thrombopoietin, 100 ng/ml Flt-3 ligand, 20 ng/ml interleukin-6, and 100 ng/ml stem cell factor, as previously described
Genomic DNA was extracted from a pool of 2×10⁶ CD34+/GFP+ cells enriched by fluorescence-activated cell sorting, after a brief period in culture to dilute unintegrated vectors. 3′-LTR vector-genome junctions were amplified by LM-PCR adapted to the GS-FLX Genome Sequencer (Roche/454 Life Sciences) pyrosequencing platform, as previously described
Association between MLV and SIN-MLV integration sites and CpG islands, CNCs, PolII and histone modifications. Distribution of the distance of SIN-MLV (green bars) and MLV wt (red bars) integrations from the midpoint of CpG islands (A) or CNCs (B) in a 20 kb window. The y axis shows the percentage of the total number of CpG islands or CNCs located within ±50 kb of the integrations. The black line indicates the distribution of random control sites. (C) Distribution of epigenetic marks in a 10 kb window around vector integration sites (IS), shown for H3K27me3 (top panels), H2A.Z (middle panels) and PolII (lower panels) with respect to MLV integrations (left panels) or SIN-MLV integrations (right panels). See legend of
MLV and SIN-MLV integration sites and clusters in CD34+ HSPC-specific loci. Distribution of MLV (red) and SIN-MLV (green) integration clusters (horizontal solid bars) and integrations (vertical marks) in the CD34, ELF1, NFE2, RUNX2, MECOM and PRDM16 loci as displayed by the UCSC Genome Browser. The base position feature at the top (scale bar and chromosome number) identifies the genomic coordinates of the displayed region.
A.M. is a student of the Ph.D. Program in Cellular and Molecular Biology of the Vita-Salute San Raffaele University, Milan, Italy.