The NF-κB-dependent and -independent transcriptome and chromatin landscapes of human coronavirus 229E-infected cells

Coronavirus replication takes place in the host cell cytoplasm and triggers inflammatory gene expression by poorly characterized mechanisms. To obtain more insight into the signals and molecular events that coordinate global host responses in the nucleus of coronavirus-infected cells, first, transcriptome dynamics was studied in human coronavirus 229E (HCoV-229E)-infected A549 and HuH7 cells, respectively, revealing a core signature of upregulated genes in these cells. Compared to treatment with the prototypical inflammatory cytokine interleukin(IL)-1, HCoV-229E replication was found to attenuate the inducible activity of the transcription factor (TF) NF-κB and to restrict the nuclear concentration of NF-κB subunits by (i) an unusual mechanism involving partial degradation of IKKβ, NEMO and IκBα and (ii) upregulation of TNFAIP3 (A20), although constitutive IKK activity and basal TNFAIP3 expression levels were shown to be required for efficient virus replication. Second, we characterized actively transcribed genomic regions and enhancers in HCoV-229E-infected cells and systematically correlated the genome-wide gene expression changes with the recruitment of Ser5-phosphorylated RNA polymerase II and prototypical histone modifications (H3K9ac, H3K36ac, H4K5ac, H3K27ac, H3K4me1). The data revealed that, in HCoV-infected (but not IL-1-treated) cells, an extensive set of genes was activated without inducible p65 NF-κB being recruited. Furthermore, both HCoV-229E replication and IL-1 were shown to upregulate a small set of genes encoding immunomodulatory factors that bind p65 at promoters and require IKKβ activity and p65 for expression. Also, HCoV-229E and IL-1 activated a common set of 440 p65-bound enhancers that differed from another 992 HCoV-229E-specific enhancer regions by distinct TF-binding motif combinations. Taken together, the study shows that cytoplasmic RNA viruses fine-tune NF-κB signaling at multiple levels and profoundly reprogram the host cellular chromatin landscape, thereby orchestrating the timely coordinated expression of genes involved in multiple signaling, immunoregulatory and metabolic processes.


Plasmids and transient transfections
shRNA constructs for knockdown of p65 were prepared in pSuper-Puro backbone as previously described . The shRNA-encoding plasmid to knock down TNFAIP3 was obtained from Sigma (pLKO.1 shTNFAIP3 (TRCN0000050959; NM_006290). To suppress p65 or TNFAIP3, HeLa cells were transiently transfected by the calcium phosphate method. After 24 h, the medium was changed and transfected cells were selected for 48 h (for shp65) or 72 h (for shTNFAIP3) with 1 μg/ml puromycin.
After selection cells were infected for another 24 h with a MOI of 1 with HCoV-229E before harvesting.

mRNA expression analysis by RT-qPCR
1-2 µg of total RNA was prepared by column purification (Macherey and Nagel or Qiagen) and transcribed into cDNA using Moloney murine leukemia virus reverse transcriptase (Fermentas) in a total volume of 20 µl. For mRNA measurement of cellular genes and HCoV-229E genomic RNA1 (nsp8) and RNA2 (Spike) as shown in Figs. 1E, 5F, 6B, 6F, S1A, S3A 2 µl of this reaction mixture were used to amplify cDNAs by RT-qPCR using Fast SYBR® Green Master Mix (ThermoFisher Scientific, 4385612) and the primers shown below. Melting curve analysis revealed a single PCR product.

Immunofluorescence analysis
Indirect immunofluorescence analysis was performed as described [2]   PCR products were detected by TaqMan® probes as described above. Taqman probes and primer pairs detecting GUSB were used for normalization.

Immunoblotting
Immunoblotting was performed essentially as described (

Microarray experiments
Purified total RNA was amplified and Cy3-labeled using the LIRAK kit (Agilent) following the kit instructions. Per reaction, 200 ng of total RNA was used. The Cy3labeled aRNA (amplified antisense RNA) was hybridized overnight to 8x60K 60mer oligonucleotide spotted microarray slides (Agilent Technologies, design ID 039494).
Hybridization and subsequent washing and drying of the slides were performed following the Agilent hybridization protocol. The dried slides were scanned at 2 µm/pixel resolution using the InnoScan is900 (Innopsys, Carbonne, France). Image analysis was performed with Mapix 6.5.0 software, and calculated values for all spots were saved as GenePix results files. Stored data were evaluated using the R software and the limma package from BioConductor [6][7][8]. Log 2 mean spot signals were taken for further analysis. Data were background corrected using the NormExp procedure on the negative control spots and quantile-normalized before averaging. For filtering of "expressed" genes the distribution of log 2 signal intensities of the negative control spots was determined for each array. A regular spot was considered "present" when its signal was 3 times higher than the interquartile width of the 3rd quartile. A gene was considered "expressed" on an array when at least 50% of the spots for this gene were "present". A gene was considered "expressed" in a group of biological samples, when this gene was judged "expressed" in at least 50% of the arrays of this group. For Fig. 1A, selections of regulated genes from single hybridizations were made based on the noise observed between hybridizations of heat-inactivated controls. A heuristic noise-band function was set up to distinguish a regulation from noise so that the probability of a falsely identified regulated gene is < 0.1%. The noise-band function is ±1.5·8/A, where A is the average log 2 signal. KEGG pathway analyses were done using gene set tests on the ranks of the tvalues (when available) or log 2 fold-changes [6,9]. The significances for the KEGG pathway perturbations were calculated by the function geneSetTest from limma. The statistics were calculated for all three alternatives "mixed", "up", and "down", and the smallest p value was taken and Bonferroni corrected. In this analysis only genes were considered with average signal intensities in at least one group that were considerably higher than the average signal of negative controls on the arrays as described above. The significances for the over-representation of selected gene lists were calculated by Fisher's exact test. As for the gene set tests, only "expressed" genes were considered.

Processing of ChIP-seq reads
ChIP-Seq reads were converted to fastq format and aligned to a precompiled hg19 reference index with BOWTIE with -k option set to 1 [10,11]. Sequencing data were controlled for general quality features using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Unambiguously mapped and unique reads were kept for subsequent generation of binding profiles and calling of peaks using MACS v1.41 [12] and PeakRanger [13] at default settings. All downstream analyses were done in R/BioConductor (http://www.bioconductor.org) [7].

Peak analysis
Peaks identified by MACS or PeakRanger at a Poisson p value < 10 -5 and an FDR < 5% were used for intersection analysis to determine the overlap in pairwise comparisons.
Two peaks were determined to be overlapping in case they had a minimal overlapping interval of 1 bp. In order to identify differentially bound regions we collapsed all binding regions determined for a given factor under different conditions and extracted the read numbers mapping to these collapsed intervals. DESeq [14] was used to normalize between samples and in order to determine changes in occupancy and genes were ranked accordingly. In case we were studying the overlap of multiple binding factors at once we used the multiinter function of the BedTools suite [15] with the -cluster option turned on.
All intervals like enhancers or transcription binding peaks were annotated with respect to human hg19 RefSeq annotations downloaded from Illumina's iGenome repository. In order to assign genes to peaks we linked each peak with the next gene (maximum distance between peak border and transcriptional start site was set to 500 kb). Gene ontology analysis of differentially bound enhancers was done with GREAT using default settings [16].

Visualization of binding profiles
After extension of reads continuous coverage vectors were calculated and normalized per million reads to account for differential library sizes. These data were used to collect data in windows of different sizes spanning features of interest (e.g. transcription factor peaks). The binding data was binned across binding sites in 50 bp windows and the mean was calculated at each position in order to generate cumulative average binding profiles.
For representation in genome browsers, profiles were additionally smoothed using kernel regression estimates. Data was visualized using the Gviz BioConductor package.

Enhancer analysis
The principal set of enhancers in this study was determined as the collapsed overlap of H3K4me1 and H3K27ac peaks as determined in all of the 3 experimental conditions (control, IL-1, HCoV-229E). IL-1-and CoV-dependent enhancers were defined as the subset of these enhancers marked by substantial induction of H3K27ac (> 2-fold) when comparing IL-1 treatment or HCoV-229E infection with the control treatment of the cells.

Motif search
We extracted the sequences in a +/-50bp window around the peak maxima and discrete sub-peak maxima (called by the PeakSplitter application: http://www.ebi.ac.uk/research/bertone/software) from top ranked peaks. These sequences were used to run MEME-ChIP (http://meme.ebi.edu.au/) in order to identify de novo motifs and known motifs enriched within peak regions of respective factors [17]. In order to identify motifs within regulated enhancers with higher fidelity, we pursued an alternative strategy. Within enhancers we found that in many cases p65 binding sites were found at local minima of corresponding histone modification ChIP-seq data. In order to better define potential transcription factor binding motif containing sequences we implemented a tool to identify "valleys" within the coverage profiles for H3K27ac data.
Therefore, we calculated a "valley score" across the analyzed enhancer intervals at 10 bp resolution similar to what has been described in a previous study [18]. Bins assigned as valleys were stitched together to larger regions in case they occured within 50 bp.
Identified valley intervals were validated by optical inspection of profiles in the genome browser and by plotting the average H3K27ac coverage across all identified enhancer valleys. Instead of the complete enhancer sequence we submitted only the sequences corresponding to valley regions to MEME-ChIP, thereby excluding more than 80 % of the original enhancer sequences.