The authors have declared that no competing interests exist.
Conceived and designed the experiments: DAF ORR MP GA RY JAM DEH MV MPW LL JAD. Performed the experiments: DAF ORR AK RLJ GA RY LG CB YZ SV LL. Analyzed the data: DAF ORR MP AK RLJ GA RY MAC JAM DEH MEC LF LL JAD. Contributed reagents/materials/analysis tools: LG CB SV JC. Wrote the paper: DAF ORR MP GA RY YZ MAC JAM MEC LF LL JAD.
The small genome of polyomaviruses encodes a limited number of proteins that are highly dependent on interactions with host cell proteins for efficient viral replication. The SV40 large T antigen (LT) contains several discrete functional domains, including the LXCXE or RB-binding motif and the DNA binding and helicase domains, that contribute to the viral life cycle. In addition, the LT C-terminal region contains the host range and adenovirus helper functions required for lytic infection in certain restrictive cell types. To understand how LT affects the host cell to facilitate viral replication, we expressed full-length or functional domains of LT in cells, identified interacting host proteins and carried out expression profiling. LT perturbed the expression of p53 target genes and subsets of cell-cycle dependent genes regulated by the DREAM and the B-Myb-MuvB complexes. Affinity purification of LT followed by mass spectrometry revealed a specific interaction between the LT C-terminal region and FAM111A, a previously uncharacterized protein. Depletion of FAM111A recapitulated the effects of heterologous expression of the LT C-terminal region, including increased viral gene expression and lytic infection of SV40 host range mutants and adenovirus replication in restrictive cells. FAM111A functions as a host range restriction factor that is specifically targeted by SV40 LT.
Viruses have evolved numerous mechanisms to counteract host cell defenses to facilitate productive infection. Simian Virus 40 (SV40) replication depends on specific interactions between large T antigen (LT) and a wide variety of host cell proteins. Although the LT C-terminal region has no evident enzymatic activity, mutations or deletions of this region significantly reduce the ability of the virus to replicate in restrictive cell types. Here, we identified host proteins that bind to LT and determined that the LT C-terminal region binds specifically to FAM111A. This physical interaction was required for efficient viral replication and sustained viral gene expression in restrictive cell types. In addition, RNAi-mediated knockdown of FAM111A levels in restrictive cells restored lytic infection of SV40 host range mutants and human adenovirus. These results indicate that FAM111A plays an important role in viral host range restriction. Our study provides insights into the viral-host perturbations caused by SV40 LT and the interaction of viruses with host restriction factors.
SV40 large T antigen (LT) is a multifunctional viral protein that plays a central role in orchestrating productive viral infection as well as cellular transformation. Discrete regions of LT are required for binding to specific host proteins and provide specific functions. The LXCXE motif (residues 103–107) binds to the retinoblastoma family of proteins RB (RB1), p107 (RBL1) and p130 (RBL2) to promote cell cycle entry. The N-terminal J domain (residues 1–82) binds specifically to heat shock protein chaperone HSC70 (HSPA4) and contributes to efficient viral replication as well as inactivation of p107 and p130 growth suppressing activities
Viral host range is defined as the set of cells, tissues and species that a virus can productively infect. There are a wide variety of cellular host range restriction factors as well as counter strategies employed by viruses to overcome them. Sometimes virally encoded proteins bind directly to specific host proteins to overcome host range restriction. SV40 host range mutant viruses, all of which contain deletions or truncations in the C-terminal region of LT, express lower levels of mRNA and protein for early (LT) and late (VP1) genes compared to wild type virus and fail to support lytic infection in restrictive cell types
Here, we examined host interactome and transcriptome perturbations induced by full-length and discrete functional domains of LT. The resulting data provides a global view of LT-host cell interactions and highlights cellular pathways perturbed by the presence of LT. Notably, we identified FAM111A, a previously uncharacterized cellular protein that binds specifically to the C-terminal region of LT. We provide evidence that this interaction contributes to SV40 host range and adenovirus helper functions.
The C-terminal region of LT is required for efficient viral gene expression and replication in the African green monkey kidney (AGMK) CV-1P cell line
(A) CV-1P or U-2 OS cells were transfected with vector control (M), viral DNA encoding wild type SV40 (WT), host range mutant HR684 (HR), or HR plus HA-tagged LT C-TERM (HR+C). Ninety-six hours post-transfection, cell lysates were western blotted for LT, VP1, LT C-TERM (HA) and vinculin (VIN). (B) U-2 OS cells stably expressing vector only or LT C-TERM were transfected with HR684 viral DNA. Lysates were prepared at the indicated time points (hours) and blotted as in (A).
Proteomic analysis was not possible in AGMK cells because whole genome and proteome sequences were not available. Instead, we tested several human cell lines for the ability of the LT C-terminal region to increase levels of host range mutant viral genes. Increased levels of HR684 LT and VP1 were observed in U-2 OS but not in HeLa or T98G cells when co-expressed with LT C-TERM (
To examine the effect of the LT C-terminal region on LT and VP1 levels, HR684 viral DNA was transfected into U-2 OS cells that stably expressed the C-TERM construct or empty vector (
Discrete functional domains within the SV40 LT protein bind to diverse host cell proteins (
(A) Schematic representation of the SV40 LT protein. Functional domains including the J domain, the LXCXE or RB binding motif, the nuclear localization signal (NLS), the DNA binding domain (DBD), the bipartite p53 binding domain contained within the helicase domain, and the C-terminal host range (HR)/adeno-helper (AH) domain are depicted. Residue numbers indicate limits for LT functional domains. (B) Gene clusters that showed functional enrichment upon expression of full-length LT (T1) or LT fragments. The heatmap shows the expression of these genes in U-2 OS cells expressing T1 or T16 and IMR-90 cells expressing T1 relative to vector or GFP controls, respectively. Replicates were collapsed and genes hierarchically clustered (rows, genes; columns, experiments; red, induced from baseline; blue, repressed from baseline; white, unchanged from baseline). Enriched GO terms are listed adjacent to the numbered expression clusters and next to them are listed enriched gene sets in the cluster. In cluster C14 all transcripts are histones. (C) GSEA plots determining whether the expression of the defined gene sets (DREAM, B-Myb-MuvB, or p53) shows statistically significant, concordant differences between two biological states (T1 or T16 and vector control).
We determined the effects of full-length LT and various fragments on global gene expression. Cells expressing T1 and T16 constructs showed significant differential expression changes of multiple target genes compared to control. In contrast, cells expressing the N-terminal T6 and T8 constructs showed minimal changes in gene expression compared to control. To identify patterns of host transcriptional perturbation common across all comparisons between the set of LT constructs and controls, we applied model-based clustering to construct clusters from the 430 most frequently perturbed host genes (Table S1 in
To assess the biological significance of the expression profiles we applied gene set enrichment analysis (GSEA)
Given the effects of LT on cellular and viral gene expression, we sought to identify host proteins that bind to LT. We used multidimensional protein identification technology (MudPIT) to analyze preparative scale immunoprecipitations by mass spectrometry
(A) Network of associations of full-length SV40 LT (hexagon) with host proteins (circles) detected in at least three of five replicate affinity purification (AP)-MudPIT experiments. Host proteins reported to associate with LT in VirusMint are colored blue. Circle size is proportional to the number of times the association was observed. Solid lines (links) represent viral-host protein associations and dashed lines represent host-host associations reported in the BioGRID database. (B) Extracts from T98G cells transfected with HA-tagged p130 and LT were immunoprecipitated with anti-HA antibodies and the indicated proteins were detected by western blot. (C) Summary of AP-MudPIT analysis from full-length LT or LT fragments for the indicated host proteins. Relative abundance values (dNSAF as defined in the supplemental data) were averaged across 5 biological replicate analyses of T1 (full-length LT) affinity purification experiments. The number of times each host protein was identified in the biological replicates is shown. CT indicates C-TERM. (D) Summary of iTRAQ analysis. Estimates of protein stoichiometry relative to LT were based on reconstructed ion chromatogram (RIC) intensities of the most abundant peptides assigned to each protein. The number of biological replicates for each affinity purification experiment is indicated in parentheses in the header. ND indicates not detected.
Consistent with previous reports that the N-terminal J domain binds HSC70
We detected a previously unreported association of LT with the uncharacterized protein FAM111A (family with sequence similarity 111, member A; LOC63901; Gene ID: 63901). FAM111A was reproducibly detected in all five full-length LT (T1) replicates but not in the corresponding negative controls (
We also identified LT-associated proteins using iTRAQ stable isotope labeling (see Supplementary experimental procedures and Table S4 in
We tested FAM111A binding to LT in a yeast two-hybrid (Y2H) assay. The LT constructs T1, T16 and C-TERM bound to FAM111A either as bait or prey in Y2H, while T8 could not (
(A) LT fragments were tested for binding to full-length FAM111A by yeast two-hybrid (Y2H) in pairwise fashion. (B) Fragments of FAM111A were tested in pairwise Y2H analysis with full-length LT (T1). Numbers indicate residue position in human FAM111A.
Homologs of FAM111A exist in several mammalian species including mouse, rat, and rhesus monkey. FAM111A is also highly similar to its paralog human FAM111B (Gene ID: 374393) with 43% identity in the C-terminal 330 residues encompassing the peptidase domain and trypsin-like catalytic triad. To confirm that LT could bind to FAM111A in human U-2 OS cells, we performed immunoprecipitations with antibodies specific for FAM111A or FAM111B. An antibody for FAM111A detected a 70 kDa band that was reduced upon shRNA-mediated knockdown of FAM111A (
(A) Immunoprecipitations for FAM111A and LT with lysates from U-2 OS cells stably expressing full-length LT (LT) or vector control (V). Whole cell lysate of the U-2 OS cell line stably expressing shRNA-2 (sh) against FAM111A was used as a control for FAM111A antibody specificity and normal rabbit serum (IgG) as an immunoprecipitation control. The indicated proteins were detected by western blot analysis. (B) Immunoprecipitations for FAM111A with lysates from U-2 OS cells stably expressing the LT C-terminus (CT) or vector control (V). (C) Immunoprecipitations of FAM111A and FAM111B in U-2 OS cells expressing LT or mock (M). Levels of FAM111A, FAM111B, LT, and vinculin (VIN) were determined by western blot. (D) Immunoprecipitations of FAM111A from U-2 OS, BSC40, and CV-1P cells 48 hours post-infection with wild type SV40 (SV) or mock infected (M). (E) Immunoprecipitation of FAM111A in U-2 OS cells transfected with viral DNA encoding wild type SV40 (WT), K697R acetylation mutant (KR), T701A phosphorylation mutant (TA) and host range mutants HR684 (HR) and dl1066 (dl).
To extend our observations to AGMK cells, we immunoprecipitated LT and FAM111A from lysates prepared from CV-1P, BSC40 and U-2 OS cells infected with wild type SV40 virus (
We next examined how LT C-terminal mutations affect binding to FAM111A. The SV40 point substitution mutants T701A and K697R show wild-type host range activity, while the SV40 host range mutants HR684 and dl1066 cannot produce plaques in CV-1P cells
We sought to characterize FAM111A expression. Differential cellular extraction revealed that FAM111A was present in the nuclear and cytoplasmic fractions of U-2 OS cells (
(A) U-2 OS cells were fractionated and equal amounts of cytoplasmic (C) and nuclear (N) lysates were blotted with FAM111A antibodies. Tubulin and lamin serve as cytoplasmic and nuclear markers, respectively. (B) Box plot depicting the average expression of 79 genes in T98G cells with profiles similar to that of
Similar to the mRNA levels, FAM111A protein levels were lowest in serum starved T98G cells and increased as cells progressed towards the G2/M phase of the cell cycle (
Binding of LT to p53 and RB serves to inactivate their growth suppressing functions. By analogy, LT binding to FAM111A might serve to inactivate the host range restriction function of FAM111A, thereby promoting increased and sustained viral gene expression. If so, then expression of the SV40 LT C-terminal region should have the same effect on virus replication as reduced FAM111A expression. Cells expressing LT C-TERM showed eight-to-ten fold increases of early (LT) and late (VP1) viral transcripts from the HR684 viral DNA relative to cells without LT C-TERM (
(A) U-2 OS cells were co-transfected with host range viral DNA (HR684) and control siRNA (black bars), siRNA targeting FAM111A (white bars) or an expression vector for the C-terminus of LT (grey bars). Quantitative RT-PCR was performed 72 hours post-transfection to determine the expression levels of LT, VP1 and FAM111A (latter not shown) mRNA relative to actin. Error bars represent standard deviation from the mean. U-2 OS (B) and CV-1P (C) cell lines stably expressing two different shRNAs against FAM111A or vector control were generated and the amount of FAM111A RNA remaining (% FAM111A RNA) was confirmed by quantitative RT-PCR. Viral DNA encoding HR684 was transfected into the indicated cell lines and whole cell lysates were harvested at 48 and 96 hours post-transfection. (D) CV-1P cells stably expressing two different shRNAs against FAM111A or vector control were transfected with viral DNA and assayed for lytic infection by plaque assay. Plaques were counted 8 days after transfection. Results shown are the average of three independent experiments with standard deviation from the mean denoted by +/−. (E) Control or FAM111A shRNA depleted CV-1P cells were infected at a multiplicity of infection of three with either wild-type SV40 virus or the host range mutant dl1066 virus. Cells were freeze-thawed at the indicated time points to extract virus and the viral titer was determined in BSC40 cells. Results shown are the average of three independent experiments with standard deviation from the mean indicated.
To evaluate longer-term effects of FAM111A knockdown on viral gene expression, we generated U-2 OS and CV-1P cell lines stably expressing one of two different shRNAs (sh-1 or sh-2) specific for FAM111A or vector control. The reduction in FAM111A mRNA and protein expression mediated by sh-2 was slightly greater than that mediated by sh-1 in both human and monkey cells (
We examined the effects of depletion of FAM111A on lytic infection by host range mutant virus. DNA corresponding to wild-type SV40 or host range mutants HR684 and dl1066 was transfected into CV-1P cells expressing shRNAs targeting FAM111A. Wild-type SV40 was capable of inducing plaque formation in control CV-1P cells and in cells containing sh-1 or sh-2 against FAM111A (
A single burst assay quantified the effect of FAM111A depletion on virus yield in restrictive CV-1P cells. Cells expressing shRNAs targeting FAM111A or vector control were infected with wild-type SV40 or host range mutant virus, dl1066, at a multiplicity of infection (MOI) of 3. Virions were harvested at several intervals and quantified by plaque assay in permissive BSC40 cells. The wild-type SV40 virus yield was similar in the presence or absence of FAM111A. In contrast, the host range virus yield was negligible in control CV-1P cells but was comparable to wild-type SV40 virus yield when FAM111A was depleted with either sh-1 or sh-2 (
AGMK cells can support human adenovirus replication only when co-infected with SV40
(A) CV-1P cell lines stably expressing shRNA against FAM111A or vector control were mock infected or infected with Ad5 at the indicated dilutions. Ad5 infected cells were visualized using Adeno-X Rapid Titer (hexon protein, brown color). Results from a representative experiment and quantification of integrated density across two biological replicates are shown. (B) CV-1P cells stably expressing two different shRNAs against FAM111A or vector control were infected with Ad5 and assayed for lytic infection by plaque assay. Results from a representative experiment and a graph showing the average number of plaques in three biological replicates are shown.
The propensity of viruses to replicate in host cells depends on their ability to manipulate key host defenses. The multifunctional SV40 LT viral protein encodes discrete domains required for viral replication including origin DNA binding, helicase activity, and the ability to hijack critical host proteins. The LT C-TERM domain is necessary for evading host range restriction in AGMK cells. We demonstrate that the host protein, FAM111A, plays a critical role in restricting viral replication, and that the LT C-terminal region binds to FAM111A to overcome this effect.
Proteomic identification of LT associated proteins confirmed several known co-complex associations including p53 and RB (
SV40 LT interaction with p130 (RBL2) disrupts the DREAM complex leading to cell cycle entry and increased FAM111A expression. LT reduces the expression of p53-responsive genes. LT binding to FAM111A overcomes SV40 host range restriction and enables the adenovirus helper effect.
Most intriguingly, we identified an interaction of the LT C-terminal region with FAM111A and provided several lines of evidence that this interaction contributes to the host range function of LT. FAM111A binds specifically to LT, as demonstrated by two mass spectrometry approaches, Y2H analysis, and reciprocal co-immunoprecipitation. Furthermore, we demonstrated that expression of the LT C-terminal domain or depletion of FAM111A in restrictive CV-1P cells led to sustained viral gene expression and infectious virion formation by host range mutant SV40 viruses. This data strongly supports the model that FAM111A functions as a host range restriction factor that is specifically counteracted by binding to the C-terminal region of LT. In addition, FAM111A depletion enabled human Ad5 to replicate in AGMK cells consistent with the model that FAM111A contributes to viral host restriction.
The observation that host range mutant viruses can productively infect permissive BSC40 cells but not restrictive CV-1P cells suggests that these cells differ in a factor that determines cellular susceptibility to viral infection. However, levels of FAM111A were not appreciably different between CV-1P and BSC40 cells, and LT could bind to FAM111A in both cell types. It is possible that small differences in FAM111A levels or activity could affect viral replication or the efficiency of host range restriction. For example, we observed that
LT-mediated inhibition of FAM111A activity to promote viral replication is consistent with our observations that loss of FAM111A expression by RNAi-mediated knockdown rescues the host range phenotype. FAM111A is predicted to contain a trypsin-like serine peptidase domain. The conservation of the catalytic triad in the FAM111A primary sequence suggests that the protein may act as a specific peptidase. In a simple model, LT binding could inhibit the FAM111A peptidase activity. Although LT binds to the peptidase domain, there is no evidence that LT itself undergoes proteolysis or is a substrate of FAM111A and the exact role of FAM111A remains to be elucidated. It should be noted that several known LT-interacting proteins, including RB, p53, FBXW7 and CDC73, are
BSC40 (gift from J. Pipas, University of Pittsburgh), CV-1P
SV40 genomic DNA (strain 776) was cloned into the BamH1 site of pBluescript KS (Stratagene) for propagation in bacteria. Wild type LT cDNA was transiently expressed from the pSG5 vector. The C-terminal fragment of LT was transiently expressed from the pVAX1 expression vector (Invitrogen). The C-TERM construct contained LT residues 627–708 in frame with an N-terminal hemagglutinin (HA) epitope tag (YPYDVPDYA) and the SV40 nuclear localization signal (NLS) (SPKKKRKVED) cloned into the pWZL retroviral vector
siRNA oligonucleotides were purchased from Dharmacon. Lentiviral vectors (pGIPZ) with shRNA directed against FAM111A were obtained from Open Biosystems. The sequences of siRNA and shRNA are provided in Supplementary experimental procedures in
The following antibodies were used: LT mouse monoclonal antibodies PAb419 and PAb901
Whole cell lysates were prepared in EBC buffer (50 mM Tris-HCl [pH 8.0], 150 mM NaCl, 0.5% Nonidet P-40) supplemented with protease inhibitor cocktail set I (Calbiochem) and phosphatase inhibitor cocktail (Sigma). The Subcellular Protein Fractionation Kit for Cultured Cells (Thermo Scientific) was used for nuclear/cytoplasmic fractionation. Membranes were blocked and incubated with the appropriate primary antibody in TBS-T overnight at 4°C. Detection of proteins was performed with horseradish peroxidase-conjugated secondary goat antibody (Pierce) in TBS-T and enhanced chemiluminescence (Pierce).
For immunoprecipitations, whole cell lysate was incubated with antibodies and protein A-Sepharose beads overnight at 4°C. Immune complexes were washed four times with EBC and boiled in sample buffer.
See Supplementary Data.
Yeast two-hybrid matrix-style experiment with LT and FAM111A as bait or prey was essentially as previously described
Cells at 80% confluency were infected with wild type SV40 diluted in DMEM supplemented with 2% Fetal Clone-I serum (HyClone), penicillin and streptomycin for two hours. BSC40 and CV-1P were infected at a multiplicity of infection (MOI) of 0.125 and U-2 OS at MOI of 0.5. SV40 plaque assays were as previously described
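As an illustration of what the stated multiplicities of infection imply, the following minimal sketch computes the inoculum needed for a dish of cells and the expected fraction of cells that receive at least one infectious particle. It assumes that infectious particles distribute over cells according to Poisson statistics, a standard simplification that the protocol above does not state. The cell count of 1,000,000 and the function names are hypothetical; only the MOI values come from the text.

```python
import math


def inoculum_pfu(cell_count: int, moi: float) -> float:
    """Plaque-forming units needed to infect cell_count cells at the given MOI."""
    return cell_count * moi


def fraction_infected(moi: float) -> float:
    """Expected fraction of cells hit by at least one infectious particle.

    Assumes a Poisson distribution of particles per cell (an assumption of
    this sketch, not a statement from the methods).
    """
    return 1.0 - math.exp(-moi)


if __name__ == "__main__":
    # Hypothetical dish of 1e6 cells; MOI values are those given in the text.
    for label, moi in [("BSC40 / CV-1P", 0.125), ("U-2 OS", 0.5)]:
        pfu = inoculum_pfu(1_000_000, moi)
        print(f"{label}: {pfu:.0f} PFU, ~{fraction_infected(moi):.1%} of cells infected")
```

Under this assumption, low multiplicities such as these leave most cells uninfected in the first round, whereas the MOI of 3 used for the single burst assay infects the large majority of cells at once.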
U-2 OS cells were transfected with control siRNA and total RNA was extracted using TRIzol (Invitrogen) and purified in RNeasy columns (Qiagen). RNA integrity was determined using a Bioanalyzer (Agilent). Gene expression was assayed using Human Genome U133 Plus 2.0 arrays (Affymetrix) in a single batch. Microarray intensities were normalized using robust multi-array averaging (RMA) through the affy package in R/Bioconductor. Differential expression was determined using the limma package
To select genes for clustering, differential expression was tested between all pairwise comparisons and all genes whose expression changes were statistically significant in two or more comparisons were retained (p<0.05 after Benjamini-Hochberg correction for multiple testing). Next, all genes that were differentially expressed in any T1, T6, T8 or T16-expressing cells compared to the vector control cells were adjoined to the previous set of genes. This resulted in a final set of 430 unique HUGO gene symbols. The expression profile of each gene was determined by taking the median expression levels of all probesets annotated to that gene. All the profiles were mean-centered and scaled by the standard deviation before using the mclust package to cluster the genes
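The selection and scaling steps described above can be sketched as follows. This is a simplified illustration in plain Python, not the R/Bioconductor code used for the analysis: the function names are hypothetical, the use of the sample standard deviation is an assumption (the text says only "standard deviation"), and the step that adjoins genes differentially expressed relative to vector control is omitted for brevity.

```python
from statistics import mean, median, stdev


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def recurrent_genes(adjusted_by_comparison, alpha=0.05, min_hits=2):
    """Genes with adjusted p < alpha in at least min_hits pairwise comparisons.

    adjusted_by_comparison maps comparison name -> {gene: adjusted p-value}.
    """
    hits = {}
    for adjusted in adjusted_by_comparison.values():
        for gene, p in adjusted.items():
            if p < alpha:
                hits[gene] = hits.get(gene, 0) + 1
    return {gene for gene, n in hits.items() if n >= min_hits}


def gene_profile(probeset_rows):
    """Collapse probesets annotated to one gene by taking the per-sample median."""
    return [median(column) for column in zip(*probeset_rows)]


def standardize(profile):
    """Mean-center a profile and scale it by its (sample) standard deviation."""
    mu, sd = mean(profile), stdev(profile)
    return [(x - mu) / sd for x in profile]
```

Standardizing each profile before clustering means that genes are grouped by the shape of their response across constructs rather than by their absolute expression level.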
Previous microarray profiling of IMR90 normal human fibroblasts transduced with either GFP or SV40 LT was incorporated into the heatmap in the following way. Data from Human Gene 1.0 ST arrays (Affymetrix) was preprocessed as described
GSEA was run using the Java-based desktop application. Probesets were collapsed to gene symbols using median levels. Four combinations of parameters were tried for each run of GSEA: genes were ranked by either signal-to-noise ratio or by t-test, and the p-value was estimated by permuting either sample or gene labels. Only GSEA runs that resulted in significant p-values across all four parameter sets were retained for further interpretation. Therefore, although the enrichment score traces and p-values depicted in the figures correspond specifically to t-test ranking and gene set permutation, these gene sets were significant among all parameter combinations tried.
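The retention rule can be expressed compactly. The sketch below is illustrative only: it does not call the GSEA application, the significance threshold of 0.05 is an assumption (the text does not give the cutoff), and the function name, dictionary layout, and p-values in the usage example are hypothetical.

```python
from itertools import product

# The two ranking metrics and two permutation schemes named in the text.
RANKINGS = ("signal_to_noise", "t_test")
PERMUTATIONS = ("sample", "gene_set")


def robust_gene_sets(pvalues, alpha=0.05):
    """Return gene sets that are significant under all four parameter combinations.

    pvalues maps gene set name -> {(ranking, permutation): p-value}.
    A missing combination is treated as not significant.
    alpha is an assumed threshold, not one stated in the methods.
    """
    combos = list(product(RANKINGS, PERMUTATIONS))
    return sorted(
        name
        for name, by_combo in pvalues.items()
        if all(by_combo.get(combo, 1.0) < alpha for combo in combos)
    )


if __name__ == "__main__":
    # Hypothetical p-values for two made-up gene sets.
    example = {
        "SET_A": {c: 0.01 for c in product(RANKINGS, PERMUTATIONS)},
        "SET_B": {
            ("signal_to_noise", "sample"): 0.01,
            ("signal_to_noise", "gene_set"): 0.02,
            ("t_test", "sample"): 0.20,
            ("t_test", "gene_set"): 0.01,
        },
    }
    print(robust_gene_sets(example))
```

Requiring agreement across all four runs is conservative: a gene set that is significant only under one ranking metric or one permutation scheme is discarded.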
The DREAM and B-MYB/MuvB gene sets were extracted from
This file contains supplementary figures, tables, experimental procedures, and references.
(PDF)
We thank Melissa Duarte and Huiping Zhang for technical assistance, Eric McIntush (Bethyl Laboratories, Inc.) for FAM111A and FAM111B antibodies, Gary Ketner (Johns Hopkins University) for Adenovirus 5, Neil D. Christensen (Pennsylvania State University) for SV40 VP1 antibodies, and Jim Pipas (University of Pittsburgh) for BSC40 cells. We thank Shmuel Rozenblatt for discussions on SV40 LT.