Fig 1.
20861 cells have multiple, rearranged copies of HPV16 at the integration site.
A, Southern blot analysis of 20861 and 20863 W12 cell lines. Genomic DNA was digested with either HindIII (H; does not cut HPV16 DNA) or BamHI (B; a single cut in HPV16 DNA). DNA fragments were separated by electrophoresis, transferred to membranes and probed with 32P-labeled HPV16 DNA. Arrows indicate relaxed circle (I), unit length linear (II), and supercoiled HPV16 genomes (III). Viral copy number was quantified by comparison to 10 pg linearized HPV16 DNA. The positions of BamHI and HindIII cleavage sites spanning the integrated locus (chr2:28,561,317–28,639,445; hg19) and expected band sizes from Southern blot are shown in B. B, Diagram of the integration locus of HPV16 in 20861 cells. APOT analysis showed that HPV16 was integrated into chr2 p23.2 and expresses an E6/E7 fusion transcript that splices to an exon located at 28,595,425–28,595,675 (hg19). The wavy line underneath the expanded viral genome represents the fusion transcript. Primer positions are denoted by black horizontal arrows. Green represents viral sequences, and blue represents host-derived sequences. Splice donor (viral) and acceptor (host) sites are indicated below.
Fig 2.
Cellular DNA at the integration region is co-amplified with HPV.
A, The locations of Brd4, HPV16 DNA, and cellular DNA flanking the chromosome 2 integration site were studied in W12 20861 and 20863 (extrachromosomal HPV16) cells, and in HPV-negative NIKS human keratinocyte cells, by combined immunofluorescence/fluorescence in situ hybridization. HPV16 DNA is shown in red, and the flanking cellular sequence is shown in green. Brd4 signal is in cyan and nuclei are counterstained with DAPI (blue). In comparable FISH experiments, about 70% of 20861 cells contained a large focus of signal from the cellular flanking sequence that colocalized with the HPV16 signal. B, Alignment of 20861 WGS data to the human reference genome (hg19) showed focal amplification of cellular sequence at the HPV16 integration site. Histograms represent depth of coverage of aligned reads (quality score threshold, 30). Amplified region is marked by a black horizontal bar; HPV16 genome and cellular genes are represented by green and blue horizontal arrows, respectively. C, WGS reads were aligned to the HPV type 16 isolate 16W12E (AF125673.1) reference genome. Histograms represent copy numbers of viral sequences (blue) and counts of discordant paired-end reads supporting insertional breakpoints (red). The scale (y-axis) of each plot was normalized to maximum read counts. HPV breakpoints are defined further in Table 1 and S1 Table.
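The y-axis normalization mentioned for panel C can be illustrated with a minimal sketch. It assumes each histogram is simply divided by its own maximum count so that both tracks share a 0 to 1 scale; the function name and the example counts are hypothetical and are not taken from the study.

```python
def normalize_to_max(counts):
    """Scale a list of per-bin read counts so the largest bin equals 1.0."""
    peak = max(counts, default=0)
    if peak == 0:
        # An empty or all-zero track stays at zero rather than dividing by zero.
        return [0.0 for _ in counts]
    return [c / peak for c in counts]


# Hypothetical per-bin values, for illustration only.
copy_track = [10, 40, 20]       # viral copy number per bin
breakpoint_track = [0, 5, 1]    # discordant read pairs per bin

norm_copy = normalize_to_max(copy_track)
norm_breaks = normalize_to_max(breakpoint_track)
```

Normalizing each track separately lets a sparse breakpoint track be drawn at the same height as a much deeper copy-number track, at the cost of hiding their absolute difference in read counts.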
Table 1.
Summary of WGS of W12 clonal cell lines.
Fig 3.
Analysis of the 20861 integration site by molecular combing.
A, Representative image of a DNA fiber containing the Type III HPV16 integration site in 20861 cells. The HPV16 DNA signal is shown in green, and the DNA backbone was visualized with a single-stranded DNA (ssDNA) antibody, shown in red. The map shows the breakpoint of HPV16 in the E2 ORF in the 20861 Type III integration site. B, Scatter plot showing the number of HPV16 genomes per individual DNA fiber. Average signal count per fiber and SD are shown, along with the total number of fibers counted. Data are from two replicates. C-D, Scatter plots showing measured HPV16 signal length (C), and measured length of the interspersing space between HPV16 signals from every fiber (D). Measurements were based on the conversion 1 μm = 2 kb. Average measured length in kb and SD are shown, along with the total number of fibers counted. Data are from two replicates. E, Two-color fiber-FISH in 20861 cells using DNA probes against HPV16 (shown in red) and the amplified cellular sequence identified from whole genome sequencing (shown in green) confirms the organization of the 20861 Type III integration site as predicted from sequence analysis. Data are from two biological replicates.
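The length conversion used in panels C and D (1 μm = 2 kb) and the reported mean and SD can be sketched as follows. This is an illustration only: the function names and the example measurements are hypothetical, and the use of the sample standard deviation is an assumption, since the caption does not say which SD estimator was used.

```python
from statistics import mean, stdev

KB_PER_UM = 2.0  # conversion stated in the caption: 1 um = 2 kb


def um_to_kb(length_um):
    """Convert a measured fiber length from micrometers to kilobases."""
    return length_um * KB_PER_UM


def summarize_lengths(lengths_um):
    """Return the count, mean (kb), and sample SD (kb) of measured lengths."""
    lengths_kb = [um_to_kb(x) for x in lengths_um]
    return {
        "n": len(lengths_kb),
        "mean_kb": mean(lengths_kb),
        "sd_kb": stdev(lengths_kb),  # assumption: sample SD (n - 1 denominator)
    }


# Hypothetical signal lengths in micrometers, for illustration only.
example_um = [3.0, 4.0, 5.0]
summary = summarize_lengths(example_um)
```

With these made-up inputs the converted lengths are 6, 8, and 10 kb, so a signal of about 4 μm would correspond to roughly one 8 kb HPV16 genome.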
Fig 4.
Transcription from the integrated locus in 20861 cells.
Alignment of RNA-seq reads to the human (hg19) and HPV type 16 isolate 16W12E (AF125673.1) reference genomes showed high expression of a fusion transcript encoding E6/E7 spliced into a cryptic acceptor (chr2: 28,595,675). Histograms represent depth of coverage of aligned reads. Amplified region is marked by a black horizontal bar; HPV16 genome and cellular genes are represented by green and blue horizontal arrows, respectively. Schematic of the viral-cellular fusion transcript is depicted in the lower panel; encoded viral nucleotides and splice sites are indicated. Data are averaged from three biological replicates.
Fig 5.
Super-enhancer markers are enriched over the integration site in 20861 cells.
A, Alignment of ChIP-seq reads to the human reference genome (hg19) in 20861 and 20863 cells. Data are averaged from two biological replicates. B, Alignment of ChIP-seq reads to human and HPV type 16 isolate 16W12E (AF125673.1) reference genomes showed enrichment of super-enhancer markers over both viral and cellular chromatin. Histograms represent depth of coverage of aligned reads. Amplified region is marked by a black horizontal bar; HPV16 genome and cellular genes are represented by green and blue horizontal arrows, respectively. Data are averaged from two biological replicates.
Fig 6.
Super-enhancer markers are enriched over the viral URR in 20861, but not in other integrated W12 sub-clones.
Chromatin immunoprecipitation (ChIP) was performed in 20861, 20863, 20831 and 20862 cells using antibodies against Brd4 and H3K27ac. A, Map of linearized HPV16 genome showing primer positions (denoted by black horizontal arrows) for the upstream regulatory region (URR), L1 and E2. B, ChIP DNA samples were analyzed by real-time qPCR using primers against target promoters, indicated in panel A. ChIP signals were expressed as the percentage of immunoprecipitated chromatin DNA relative to the total amount of input chromatin (% Input). CCND2 and FOSL1 were included as positive controls for super-enhancer loci; IGLL5 was included as a negative control for Brd4 binding in these cells. C, To account for variations in viral copy number between W12 cells, ChIP signals were expressed as binding per single-copy genome. Background signal at each locus (measured by no-antibody controls) was subtracted from corresponding ChIP signals. Average binding levels were calculated from three independent experiments. Error bars represent SD. Note that similar experiments were previously conducted on 20861 and 20863 cells (using different datasets) [22].
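The two normalizations described for panels B and C can be written out as a short sketch. It assumes the immunoprecipitated and input DNA amounts have already been quantified in the same units, and that background subtraction is applied to the % Input values before dividing by viral copy number; the function names and all numbers are hypothetical and do not come from the study.

```python
def percent_input(ip_amount, input_amount):
    """Express a ChIP signal as a percentage of total input chromatin."""
    if input_amount <= 0:
        raise ValueError("input_amount must be positive")
    return 100.0 * ip_amount / input_amount


def binding_per_copy(chip_pct, no_antibody_pct, viral_copies):
    """Subtract the no-antibody background, then divide by viral copy number.

    Assumption: both signals are % Input values measured at the same locus.
    """
    if viral_copies <= 0:
        raise ValueError("viral_copies must be positive")
    return (chip_pct - no_antibody_pct) / viral_copies


# Hypothetical values, for illustration only.
chip_signal = percent_input(ip_amount=2.0, input_amount=100.0)    # 2.0 % Input
background = percent_input(ip_amount=0.5, input_amount=100.0)     # 0.5 % Input
per_copy = binding_per_copy(chip_signal, background, viral_copies=30)
```

Dividing by copy number matters when comparing cell lines: a line with many viral genomes can show a high % Input at the URR even if binding per genome is low.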
Fig 7.
A cellular epithelial-specific enhancer is amplified in the 20861 super-enhancer.
A, The strong cellular H3K27ac peak shown in Fig 5A aligned with an epithelial-specific enhancer (boxed region) identified from ENCODE. The data from the seven cell lines are represented as colored peaks in the layered H3K27ac track and are highlighted according to the ChIP-seq profiles from the ENCODE/Broad Institute track. The amplified region is marked by a black horizontal bar; the HPV16 genome insertion point is marked by a green arrowhead. B, The H3K27ac ChIP-seq profiles from individual cell types are shown (GM12878, B-lymphocytes; H1-hESC, human embryonic stem cells; K562, human erythroleukemic cell line; HeLa-S3, cervical carcinoma; HepG2, hepatocellular carcinoma; HUVEC, human umbilical vein endothelial cells; CD14+, CD14-positive monocytes; HMEC, human mammary epithelial cells; HSMM, human skeletal muscle myoblasts; NH-A, normal human astrocytes; NHDF-Ad, normal human dermal fibroblasts-adult; NHEK, normal human epidermal keratinocytes; NHLF, normal human lung fibroblasts; MCF-7, mammary gland adenocarcinoma; HCT-116, colorectal carcinoma; PANC-1, pancreatic carcinoma). C, H2A.Z, H3K9ac, H3K4me1, H3K4me2, and H3K27ac tracks in NHEK cells from the ENCODE/Broad Institute track.