Ehrlichia chaffeensis TRP47 enters the nucleus via a MYND-binding domain-dependent mechanism and predominantly binds enhancers of host genes associated with signal transduction, cytoskeletal organization, and immune response

Ehrlichia chaffeensis is an obligately intracellular bacterium that establishes infection in mononuclear phagocytes through largely undefined reprogramming strategies including modulation of host gene transcription. In this study, we demonstrate that the E. chaffeensis effector TRP47 enters the host cell nucleus and binds regulatory regions of host genes relevant to infection. TRP47 was observed in the nucleus of E. chaffeensis-infected host cells, and nuclear localization was dependent on a variant MYND-binding domain. An electrophoretic mobility shift assay (EMSA) demonstrated that TRP47 directly binds host DNA via its tandem repeat domain. Utilizing chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq) with E. chaffeensis-infected cells, TRP47 was found to bind at multiple sites in the human genome (n = 2,051 at p < 10−30). Ontology analysis identified genes involved in functions such as immune response, cytoskeletal organization, and signal transduction. TRP47-bound genes included RNA-coding genes, many of these linked to cell proliferation or apoptosis. Comparison of TRP47 binding sites with those of previously-identified E. chaffeensis nucleomodulins identified multiple genes and gene functional categories in common including intracellular transport, cell signaling, and transcriptional regulation. Further, motif analysis followed by EMSA with synthetic oligonucleotides containing discovered motifs revealed a conserved TRP47 DNA-binding motif. This study reveals that TRP47 is a nucleomodulin that enters the nucleus via a MYND-binding domain and appears to play a role in host cell reprogramming by regulation of transcription.


Antibodies
Polyclonal rabbit and mouse anti-TRP47 antibodies were generated against a peptide from the TRP47 tandem repeat region (ASVSEGDAVVNAVSQETPA) by a commercial vendor (Bio-Synthesis, Lewisville, TX). Pre-immune serum from the same rabbit was also used in this study as a control. Rabbit anti-Dsb antibody was produced as previously described [13].

Immunofluorescence and confocal laser microscopy
TRP47 localization during infection was examined by immunofluorescence and confocal laser microscopy. Briefly, uninfected and E. chaffeensis-infected THP-1 cells were cytocentrifuged onto glass slides at 24, 48, and 72 h post infection and fixed in 4% paraformaldehyde in phosphatebuffered saline (Alfa Aeasar) at room temperature for 25 min. Cells were then permeabilized using 1% Triton X-100 with 2% bovine serum albumin (BSA; Sigma-Aldrich) in phosphate-buffered saline (PBS; Sigma-Aldrich) for 1 h, followed by incubation with 1:100 mouse anti-TRP47 and 1:100 rabbit anti-Dsb antibodies for 1 h. After incubation with primary antibodies, slides were washed 4 times in PBS and incubated with Alexa Fluor 488-conjugated goat anti-rabbit IgG (H+L) and Alexa Fluor 594-conjugated goat anti-rabbit IgG (H+L) secondary antibodies for 30 min. Finally, glass coverslips (Fisherbrand premium cover glass) were mounted using ProLong Gold antifade reagent with DAPI (4 0 ,6-diamidino-2-phenylindole) (Cell Signaling Technology, Danvers, Mass.) and dried at room temperature overnight. Images were obtained using a Zeiss LSM 880 Airyscan super resolution laser scanning confocal microscope configured with an Axiovert inverted microscope with a c-Apochromat 40x/1.2 numerical aperture water immersion lens. UV laser (405 nm) and VIS laser (488 nm and 561 nm) were used to excite the sample and emissions were captured using band pass filters of 385 to 470nm (for DAPI), 505 to 530 nm (for Alexa Fluor 488 conjugate), and 560 to 615 nm (for Alexa Fluor 594 conjugate), respectively. Images were analyzed using Zenlite blue software, and z-stacks were constructed by imaging optical slices at 0.16 μm intervals. and 65˚C for the appropriate extension time (30 s per 500 product base pairs); and 65˚C for 7 min. Correct size was verified using the FlashGel DNA electrophoresis system (Lonza, Basel, Switzerland), and PCR products were purified with the MinElute PCR Purification Kit (Qiagen). PCR products for pGEX-6P-1 constructs were digested with EcoRI and SalI high-fidelity restriction enzymes (New England Biolabs, Ipswich, Mass.) and ligated into digested vector (Fast-Link DNA Ligase; Epicentre, Madison, Wis.). Constructs were transformed into TOP10 E. coli (Invitrogen, Carlsbad, Calif.), and transformants were selected by growth on LB agar with ampicillin. PCR products for pAcGFP1-C constructs were cloned into pre-linearized vector using the In-Fusion PCR Cloning Kit (Clontech), and transformants were selected by growth on LB agar with kanamycin. The DNA fragment encoding the His-tagged MYND-binding domain (His-MBD) was created by annealing primers flanking the region. A PCR product encoding TRP47 without the MYND-binding domain (No Motif construct) was amplified from a commercially synthesized pUc57 plasmid (GenScript) using the forward and reverse primers for the full-length GFP construct. PCR screening was used to verify correct insert size, and plasmids from positive colonies were isolated (QIAprep Spin Miniprep Kit, Qiagen) and sequenced to verify proper orientation and frame. Lysine-to-arginine mutation of the full-length TRP47 pAcGFP1-C construct was performed using the QuikChange II SiteDirected Mutagenesis Kit (Agilent, Santa Clara, Calif.) according to the manufacturer's instructions. Mutating primers were created using the QuikChange Primer Design program. Plasmids from positive transformants were isolated and sequenced to verify presence of the correct mutation.

Expression and purification of recombinant TRP47
Plasmids encoding full-length and tandem repeat region TRP47 GST-fusion constructs were transformed into TurboCells BL21(DE3) E. coli and plated on LB agar with ampicillin. Colonies were PCR-screened to verify presence of the insert, and positive transformants were regrown overnight in LB broth with ampicillin. Overnight cultures were diluted 1:20 in fresh growth medium and incubated with shaking at 37˚C until an OD 600 of about 0.5 was achieved. Protein expression was induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG; Sigma-Aldrich) for 4 h, after which cultures were pelleted and stored at -80˚C. Cell pellets were resuspended in Tris-HCl buffer and lysed by sonication in the presence of protease inhibitors (cOmplete mini, EDTA-free; Roche, Basel, Switzerland). Lysates were cleared by centrifugation at 12,000xg and 4˚C for 30 min, and the supernatant was collected and incubated with Glutathione Sepharose 4B affinity resin (GE Healthcare Life Sciences) at 4˚C with nutation overnight. After washing with Tris-HCl, bound protein was eluted from the resin with 50 mM glutathione. Dialysis was performed and the concentration measured using the Pierce BCA Protein Assay Kit (Thermo Fisher Scientific). Recombinant protein expression and purification were verified using SDS-PAGE and protein staining (AcquaStain Protein Gel Stain; Bulldog Bio, Portsmouth, N.H.).

Ectopic expression of GFP-tagged TRP47 constructs
Plasmids for transfection (pAcGFP1-C FL, N, Ntrunc, TRC, C, MBD, No Motif, K49R, K71R, GFP control) were isolated from 150 mL overnight E. coli cultures with the PerfectPrep EndoFree Plasmid Maxi Kit (5 PRIME) and quantitated using a spectrophotometer. HeLa cells were seeded at a density of 0.8 x 10 5 cells/mL in 8-well chamber slides 24 h prior to transfection. Transfections were performed in duplicate with Lipofectamine 2000 (Thermo Fisher Scientific) according to the manufacturer's recommendations. Twenty-four hours after transfection, cells were fixed with acetone at -20˚C for 10 min and mounted with ProLong Gold antifade reagent with DAPI. Slides were dried at room temperature overnight, viewed using an Olympus BX61 epifluorescence microscope, and analyzed with Slidebook software (version 5.0; Intelligent Imaging Innovations, Denver, CO).

Chromatin immunoprecipitation and sequencing
E. chaffeensis-infected THP-1 cells harvested at 48 hpi were used for chromatin immunoprecipitation using the EZ Magna ChIP Kit (EMD Millipore). Briefly, cells were cross-linked using a final concentration of 1% formaldehyde for 10 min. Cells were then pelleted and lysed with lysis buffer and a dounce homogenizer. Lysate was sonicated on ice using Sonics Vibra-Cell (20 cycles, 30 s at 5 W output, 30 s rest) to generate chromatin fragments around 300 bp in length. To validate sheared chromatin samples and ChIP technique, ChIP was performed with anti-RNA polymerase II positive control antibody followed by measurement of fold enrichment of GAPDH nucleic acid by qPCR. TRP47 was immunoprecipitated using rabbit anti-TRP47 at optimized concentration and pre-immune serum was used as a control. Washing of IP reactions and reversal of crosslinks were performed according to the Magna ChIP protocol, and nucleic acid was purified and quantified using a Qubit fluorometer. Preparation of indexed DNA sample libraries involving PCR amplification with adapter ligation was performed by the UTMB Next Generation Sequencing Core using the NEBNext Ultra II DNA Library Kit. Library quality and quantity was evaluated using Agilent Bioanalyzer. Samples were sequenced using an Illumina HiSeq 1500 to generate about 40 million single-end 50 base pair sequence reads per sample. Sequence reads were analyzed by base calling and sequence quality filtering scripts with Illumina Pipeline software. Sequences were aligned to the human genome (UCSC hg38) using bowtie2. The output BAM files were processed by removing PCR duplicates, reads mapping to multiple genomic locations, and reads with low mapping scores. MACS [14] was then used to call peaks ranked by significance and BED files were generated. Hypergeometric Optimization of Motif EnRichment (HOMER) software [15] was also used to identify peaks for comparison, but downstream analyses focused on MACS peaks. Peaks were visualized using Integrated Genomics Viewer (IGV).

Gene annotation and ontology
BED files consisting of highly significant TRP47 peaks were submitted to Genomic Regions Enrichment of Annotations Tool (GREAT) version 3.0 [16]. Peaks were associated with nearby genes using the Basal Plus Extension rule, which assigns genes a proximal regulatory region consisting of -5 kb to +1 kb from the transcription start site (TSS) and a distal regulatory region extending in both directions to the proximal regulatory region of any other genes but no more than 1000 kb. GREAT also utilizes a small set of curated regulatory domains from the literature. After the ChIP-seq peaks were assigned to nearby genes, the annotations from those genes were used to calculate enriched categories using region-based binomial and gene-based hypergeometric tests. The enriched ontology terms were filtered to include only those significant (p < 0.05) by both statistical tests. Significant TRP120 and TRP32 peaks were also submitted to GREAT to identify target genes and ontology terms in common between TRPs.
To associate TRP47 peaks with gene regulatory regions, the BEDTools coverage function was used to examine overlap of peaks with promoters from the Eukaryotic Promoter Database EPDnew (each TSS extended -5 kb to +1 kb using BEDTools slop) and enhancers from the GeneHancer database [17]. To determine significance of enrichment in different genomic regions, experimental peaks were randomly permuted throughout the genome (excluding centromere regions which are not well represented in whole-genome sequencing data) using BEDTools shuffle to generate 10 random peak sets [18]. These control peak sets were compared to the experimental peaks using a one sample t-test. To associate peaks with RNA-coding genes, BEDTools closest function was used to find TSSs from the RefSeq curated mRNA and non-coding RNA transcripts database located nearest to the center of each peak.

DNA-binding motif analysis
Likely TRP47 DNA-binding motifs were identified using the MEME-ChIP tool in the Multiple EM Motif Elicitation (MEME) Suite [19,20]. BED files consisting of regions +/-250 base pairs from the center of highly significant TRP47 binding peaks were generated using BEDTools. BED files consisting of only regions located within +/-5 kb of associated gene TSSs according to GREAT were also generated. All BED files were converted to fasta format, randomly subsampled as needed to maximum 1000 sequences for manageable algorithm running time, and submitted to MEME-ChIP. MEME-ChIP executed complementary MEME and DREME algorithms to discover novel sequence motifs and CentriMo algorithm to measure positional enrichment of the de novo motifs and known motifs from JASPAR CORE vertebrates and UniPROBE mouse databases within peak sequences. Motifs that were statistically enriched in these regions were returned as probability matrices ranked by level of significance and grouped by similarity. Significant (E-value < 10 −5 ) motifs were tested in multiple EMSAs.

EMSA
Whole genomic DNA was isolated from healthy THP-1 cells and sheared using the ChIP-IT Express Enzymatic Kit (Active Motif) without crosslinking. Complete shearing of chromatin into fragments of 200 base pairs or less was verified by gel electrophoresis and ethidium bromide staining. Sheared genomic nucleic acid was then purified using the QIAquick PCR Purification Kit (Qiagen) and biotin labeled using the Label IT Nucleic Acid Labeling Kit (Mirus). EMSAs were performed using the Lightshift Chemiluminescent EMSA Kit (ThermoFisher). Briefly, 5 ng of labeled genomic DNA was incubated with up to 8 μg of purified protein in buffer containing 10 mM Tris, 50 mM KCl, 1 mM DTT, 5 mM MgCl 2 , 2.5% glycerol, 0.05% NP-40 and 25 ng/μl Poly (dI•dC). Samples were incubated at 4˚C for 1 hr then separated on a 6% DNA retardation gel (ThermoFisher) at 100V for 90 min. Reactions were subsequently transferred to a Biodyne B nylon membrane (ThermoFisher) at 20V for 1 hr. Transferred DNA was crosslinked to the membrane using a CL-1000 Ultraviolet crosslinker (UVP) at 200 mJ/cm 2 for 5 min. Reactions were imaged using streptavidin-HRP and x-ray film or a Chemi-Doc-It 2 Imager (UVP). EMSA reactions that included rabbit anti-TRP47 antibody at 1:200, 1:100, 1:50, and 1:20 dilutions were performed by adding antibody after 30 min of incubation and then incubating for an additional 30 min. Competition assays were performed using 500x molar increase of unlabeled DNA. Protein and unlabeled DNA were incubated for 30 min at 4˚C followed by adding labeled DNA and incubating an additional 30 min. For EMSAs utilizing a probe, up to 2500 pM of labeled DNA was incubated with up to 10 μg of purified protein, 100-250x molar increase of unlabeled DNA, and 1:20 dilution of anti-TRP47 antibody.
Oligonucleotide probes (IDT) containing discovered TRP47 binding motifs were resuspended in annealing buffer (10 mM Tris, pH 7.5-8.0, 50 mM NaCl, and 1 mM EDTA) and complementary strands were annealed by heating to 95˚C in a heat block for 5 min then slowly cooling to room temperature. Before use in EMSA, probes were diluted in annealing buffer. Sequences for probes P1, P2, P3, and P4 are listed in Table 1, and the discovered motifs are underlined.

TRP47 localizes to the host cell nucleus in a mechanism involving a MYND-binding domain
TRP47 nuclear localization during infection was examined using confocal microscopy (Fig 1). At 24 hpi, TRP47 is observed in and around morulae and around the nucleus of E. chaffeensis- infected THP-1 cells. At later timepoints 48 and 72 hpi, an increase in TRP47 was observed both in the cytoplasm and the nucleus. Analysis of the TRP47 amino acid sequence by the Eukaryotic Linear Motif resource identified a variant MYND-binding motif of PXLE (X: any amino acid) associated with binding of Hsp90 complex co-chaperones to PHD2 [21]. To determine the role of this domain in the subcellular localization of TRP47, a GFP control and TRP47 constructs encoding GFP-tagged fulllength (FL), N-terminal (N), truncated N-terminal (Ntrunc), tandem repeat-C-terminal (TRC), C-terminal (C), His-MYND-binding domain (His-MBD), MBD deletion mutant (No Motif), K49R mutant, and K71R mutant TRP47 (Fig 2A) were ectopically expressed in HeLa cells and their localization observed by fluorescence microscopy (Fig 2B). All ectopically expressed TRP47 truncation constructs that included the MBD (marked by asterisks) except for K49R exhibited nuclear localization while constructs without the MBD exhibited exclusively cytoplasmic localization. These results indicate that the MYND-binding motif plays a role in TRP47 nuclear translocation. The inability of the K49R mutant TRP47 construct to enter or be retained in the nucleus could be due to a number of factors, such as unmasking of a nuclear export signal, loss of an important post-translational modification or protein-protein interaction, or both.

TRP47 binds to host genomic DNA via a tandem repeat DNA-binding domain
To determine if TRP47 binds host genomic DNA, purified recombinant full-length and 5 tandem repeat (5TR) GST-tagged TRP47 constructs were used in an electrophoretic mobility shift assay (EMSA) with biotin-labeled sheared human genomic DNA. Signal detection in this assay is based on biotin-labeled DNA in complex with protein traveling more slowly during electrophoresis creating a shifted band. When an EMSA was performed with 4 μg of TRP47 5TR, formation of a protein-DNA complex was observed and addition of unlabeled competitor gDNA prevented detection of a band (Fig 3A), indicating that TRP47 specifically interacts with DNA. Addition of anti-TRP47 antibody resulted in loss of this band, most likely due to interference with the ability of TRP47 to bind DNA. Band intensity was gradually restored when antibody concentration was decreased, confirming that TRP47 is present in the protein-DNA complex. EMSA using full-length TRP47 showed formation of a protein-DNA complex that was not detected after the addition of unlabeled competitor gDNA ( Fig 3B). These experiments confirm that TRP47 tandem repeat region is the functional domain for DNA binding. The amount of full-length protein required to visualize a band was twice the amount required for the tandem repeat construct, suggesting that full-length TRP47 has lower affinity for DNA than the tandem repeat region.

TRP47 binds within regulatory regions of host genes involved in a variety of cellular processes
In order to identify TRP47 host genomic targets, ChIP-seq was performed. MACS identified 127,421 TRP47 binding peaks with p value < 10 −4 and 2,051 highly significant peaks with p < 10 −30 (Fig 4A). The enriched regions were distributed in all chromosomes (Fig 4B). Submitting highly significant peaks to GREAT revealed that the binding locations occurred at variable distances from the TSS of associated genes (Fig 4C). Compared to random control peaks, a significantly greater proportion of TRP47 peaks were located within +/-50 kb (TRP47: 44.5%, random: 26.4%, p < 0.0001) and +/-5 kb (TRP47: 7.7%, random: 3.5%, p < 0.0001) from TSS. Coverage analysis of binding locations using EPDnew promoter and GeneHancer enhancer region databases indicated that, compared to random control peaks, there was  (Fig 4D).
To assess the quality of the called TRP47 binding peaks, highly significant peaks were visualized with IGV (Fig 5). Panels A-C represent examples of peaks within promoters (A: DACT2, B: LTA and TNF, C: WNT9A), and panels E and F provide examples of peaks in enhancers (E: AFAP1, F: FYN). As expected for good quality ChIP-seq [22], the TRP47 ChIP sample forward and reverse reads are generally asymmetrically distributed around the midpoint of peaks and coverage within peaks is greater compared to serum control ChIP.
Using ontology term enrichment analysis, TRP47 binding regions were found to be associated with many functional categories of genes. The enriched categories within biological process, cellular compartment, molecular function, and pathways ontologies were related to innate immune response, cytoskeleton and vesicle trafficking, epigenetic regulation, cell differentiation, ion transport, RNA editing, and signal transduction including regulation of Wnt signaling (Fig 6A-6D). Multiple TRP47-bound genes were associated with each significantlyenriched category (Fig 6E). Further analysis of TRP47 binding sites revealed that most occur near protein-coding genes, but some are associated with RNA genes (Fig 7A). Of the TRP47-bound RNA genes that have a known or studied function, many have been implicated in various cancers and play a role in cell proliferation and apoptosis (Fig 7B).
(marked by asterisks), while the Ntrunc, C, and No Motif constructs did not. The FL and His-MBD TRP47 constructs exhibited strong nuclear localization and the N, TRC, and K71R TRP47 constructs exhibited nuclear and diffuse cytoplasmic localization. Exclusively cytoplasmic localization was observed with the Ntrunc, C, No Motif, and K49R mutant TRP47 constructs. The GFP negative control construct was also observed only in the cytoplasm. Due to differences in expression levels of the constructs, different exposure times were used to image each construct. https://doi.org/10.1371/journal.pone.0205983.g002

Multiple discovered TRP47 target genes and ontological categories overlap with those of TRP120 and TRP32
To examine potential interplay of ehrlichial nucleomodulins by targeting the same host genes or genes implicated in the same pathway or functional category, target genes and enriched ontology terms identified for TRP47, TRP32, and TRP120 were compared. For these comparisons, we used a different p value cutoff of peaks for each TRP to provide similar numbers of peaks. Significant terms in common between TRP47 and TRP120 (n = 48) corresponded to cellular processes such as intracellular transport, cell differentiation, and signal transduction ( Table 2). Terms identified with more highly significant TRP binding sites (indicated by darker gray color) included regulation of protein kinase C and Notch signaling, endoplasmic reticulum lumen, and genes involved in NCAM1 (also known as CD56) interactions. Out of all genes associated with TRP binding peaks, 391 were in common for all three TRPs and 3,591 were in common for two TRPs. Among 4,703 genes associated with significant TRP47 peaks, 3,536 were in common with at least one other ehrlichial nucleomodulin while 1,167 were unique TRP47 targets. A subset of genes associated with highly significant TRP peaks are shown along with the TRP binding locations with respect to the TSS in Table 3. Categories of function of these genes included cell fate determination, transcriptional regulation, vesicle trafficking, cell signaling, and specifically Wnt signaling. Additionally, out of 1,323 previously identified Ank200 target genes [8], 557 were found to be in common with one or more TRPs. Lists of ontology terms and target genes identified for E. chaffeensis nucleomodulins are provided in S2 and S3 Tables, respectively. To help validate host gene-TRP47 interactions, we checked if significant identified TRP47 target genes were found to be differentially expressed during E. chaffeensis infection in previous studies. Among 571 genes previously found to have significantly altered transcriptional levels for at least one time point post-infection [3], 102 are TRP47 targets and 21 of these are unique targets of TRP47 (S3 Table). Functional categories of these target genes include transcription, immune response, and apoptosis and cell proliferation. In addition, among 46 TRP32 target genes previously found to be differentially expressed during infection [10], 8 are also TRP47 targets, including CAP1, CMC1, HLX, ING1, MTFR2, NAT10, PARP16, and POLDIP3.

TRP47 DNA motif
To determine the DNA motif bound by TRP47, highly significant ChIP peak sequences were entered into motif analysis software. Eight of the most likely TRP47 binding motifs were chosen as those with the lowest E-values returned from different sets of highly significant (p < 10 −20 ) TRP47 peaks at any distance or exclusively within +/-5 kb of gene TSSs (Fig 8A). To determine if the TRP47 ChIP-seq sites were bound directly by TRP47, we assayed the ability of purified TRP47 to bind candidate sequences in vitro using EMSA (Fig 8B). EMSA was performed using biotin-labeled probes (P1, P2, P3 and P4). TRP47 was not observed to bind P1, P2, or P3, but binding was observed for P4 (Fig 8B, left panel). Formation of the protein complex was inhibited by competition with excess unlabeled P4, and the complex was supershifted with addition of anti-TRP47 antibody (Fig 8B, right panel). We concluded that TRP47 specifically interacts with a conserved motif within P4. The specific TRP47 binding motif could be the consensus sequence [5'-CCTCCC-3'], which is only found within P4 and not P1, P2, or P3. Visualization of E. chaffeensis TRP47 binding sites in genome browser. IGV was used to visualize TRP47 binding peaks in gene promoter and enhancer regions. Panels A-F represent genomic locations containing highly significant (p < 10 −30 ) peaks. Serum and TRP47 ChIP tracks show forward (red) and reverse (blue) strand mapped reads and read coverage (gray bars). The tracks below depict peaks (green rectangles) labeled with associated genes and distance from TSS in parentheses; genes as exons (blue rectangles) and introns (blue lines) including 5' to 3' direction (blue arrows) and TSS (bent arrows); promoters (orange rectangles), specifically regions -5 kb to +1 kb from EPDnew primary promoter TSSs; and enhancers (red rectangles) labeled by GeneHancer ID number. https://doi.org/10.1371/journal.pone.0205983.g005

Discussion
Survival and replication of E. chaffeensis and other obligately intracellular bacteria critically depends on subversion of host cell defenses and processes through diverse interactions with host molecules and pathways. An emerging mechanism utilized by intracellular pathogens to reprogram host cells involves secreted nucleomodulins that translocate into the nucleus and influence transcription. Recent studies demonstrated that E. chaffeensis effectors Ank200, TRP120, and TRP32 act as nucleomodulins that target genes involved in processes known or reasonably assumed to have an impact on ehrlichial survival. The results of this study suggest that TRP47 has similar nucleomodulin activity.
Previously, ehrlichial effectors Ank200, TRP120, and TRP32 were all found to localize to the nucleus of host cells during infection [8][9][10]. Nuclear localization for these effectors occurs later in the course of infection (after 48 hpi), although TRP120 was also detected in the nucleus as early as 3 hpi. Similarly, in this study, we observed that TRP47 is present in the nucleus mostly at 48-72 hpi with much less at 24 hpi. This suggests that TRP47 gradually accumulates in the nucleus over the course of infection. Although the TRPs lack a classical NLS, transport into the nucleus could involve post-translational modification and interaction with an NLScontaining partner ("piggy-backing"). For example, TRP32 nuclear localization requires phosphorylation of Y179 located in a PYYY motif, a post-translational modification that could directly or indirectly enable TRP32 to interact with a protein capable of trafficking to the nucleus. We found that the MYND-binding domain in TRP47 plays a role in its nuclear localization and that mutation of TRP47 K49 prevents nuclear localization. MYND domains mediate protein-protein interactions especially between transcription regulatory proteins. For example, the MYND domain in the ETO corepressor binds to the PPPL1 MBD in the SMRT and NCoR corepressors [27,28], and the MYND domain in the BS69 corepressor binds to the PXLXP MBD in the MGA transcription factor and viral transactivators E1A and EBNA2 [29]. The TRP47 MBD in TRP47, PXLE, is present in Hsp90 co-chaperones p23 and FKBP38 and binds the MYND domain of the HIF1-α hydroxylase PHD2 [21]. PHD2 has a non-classical NLS that is required for nuclear translocation [30]. The TRP47 MBD may allow binding to a MYND domain-containing protein such as PHD2 that directly or indirectly mediates TRP47 nuclear entry or retention. Although the precise mechanism by which TRP47 and other When significant TRP47 binding peaks were submitted to BEDTools closest, 23% of peak-gene associations corresponded to RNA-coding genes. Of these, the majority were noncoding RNAs (66%) followed by microRNAs (16%), antisense RNAs (10%), pseudogenes (5%), small nucleolar RNAs (3%), and small nuclear RNAs (<1%). (B) A table of selected characterized noncoding RNA genes bound by TRP47 showing that many are involved cell proliferation or apoptosis and some are also bound by another ehrlichial TRP: NR_027181.1 [23], NR_029833.1 [24], NR_029608.1 [25], NR_126415.1 [26].
The number of TRP47 ChIP-seq binding peaks called by MACS was greater than previously reported for TRP120 and TRP32. For instance, at the highly significant p value of 10 −20 , 16,443 TRP47 peaks were identified while 582 TRP120 and 241 TRP32 peaks were returned. These disparities could be accounted for by differences in the respective antibody specificity and its efficiency in immunoprecipitation or differences in technique such as chromatin fixation and sonication. The latter explanation may be supported by our observation of variable values of fold enrichment of DNA in positive control ChIP for different sheared chromatin preparations. There are also differences in the enrichment of binding peaks for each TRP within gene regulatory regions. While 20% of TRP120 and 43% of TRP32 highly significant peaks occur within +/-5 kb of TSSs, 8% of TRP47 highly significant peaks occur at this distance. One possible explanation for the lower proportion of TRP47 peaks in promoter regions is that TRP47 preferentially binds enhancer regions to regulate host gene expression, which would represent a novel function of bacterial nucleomodulins. This hypothesis is supported by the fact that enrichment of TRP47 peaks is greater in enhancers (42%) compared to promoters (8%), and the degree of increased enrichment compared to random peaks is greater in enhancers (+25.3%) compared to promoters (+6.5%).
Enriched ontological categories and even gene targets identified for TRP47 alone or in addition to TRP32 or TRP120 included many that have been previously reported to be exploited by E. chaffeensis to facilitate infection. For example, TRP32, TRP47, and TRP120 all interact with proteins involved in Wnt signaling, TRP120 interacts with Notch metalloprotease ADAM17, and both pathways are required for ehrlichial survival [31,32]. Correspondingly, among the significant TRP47 target gene processes identified in this study were the Notch signaling pathway, which was also implicated with TRP120 peaks, and regulation of canonical Wnt signaling. TRP47-bound genes involved in these pathways were identified as well, including PSMB1 and IRF2BP2, both of which were also identified as TRP32 and TRP120 target genes. IRF2BP2 also encodes a host protein target of TRP120, suggesting it is regulated in some way by TRPs both at the level of transcription and protein. It is possible that E. chaffeensis utilizes multiple effectors to regulate expression of the same host target genes in order to ensure effective regulation of genes critical for its survival. Many of the identified TRP47 target genes have been found to be differentially expressed during E. chaffeensis infection in previous studies [3,10]. This suggests that TRP47 may play a key role in regulating expression of the identified target genes. However other factors could be at least partially responsible for altered expression of the TRP47-bound genes. For instance, 81 of 102 differentially expressed TRP47-bound genes are also identified targets of other ehrlichial nucleomodulins; TRP47 may or may not be the primary nucleomodulin responsible for altering expression of these genes. Further, other processes related to infection can modulate gene transcription, such as effector-mediated modulation of cell signaling and host cell immune response. Ehrlichia chaffeensis TRP47 nucleomodulin TRP47 was found to bind host DNA via its tandem repeat domains, similar to TRP120 and TRP32 in E. chaffeensis and the TAL family of effectors in Xanthomonas. The same amount of full-length TRP47 produced a much weaker band shift than the tandem repeat construct. This comparatively reduced affinity of full-length TRP47 for DNA could be due to the presence of an inhibitory domain in the N-terminal or C-terminal portion of the protein. In the supershift assay, a gradual inhibition of DNA-binding activity occurred with increasing concentration of antibody. This phenomenon has been previously described for several transcription factors [33,34] and is likely due to competition between the antibody and DNA binding of the TRP47 tandem repeat region since the antibody used was directed against a molecularly defined epitope in the tandem repeat domain. Interestingly, a recent study using nuclear magnetic resonance spectroscopy found that TRP120-1TR construct specifically binds DNA only at low pH and with affinity in the micromolar range weaker than that of zinc finger transcription factors [35]. Further, TRPs may require a post-translational modification or interaction with host or ehrlichial proteins to adopt an ideal conformation for DNA binding, neither of which can occur in the in vitro EMSA assay. These features of ehrlichial TRP-DNA binding may account for difficulties with acquiring strong shifted bands in TRP47 EMSAs with gDNA and synthesized probes.  Table 1) are shown in WebLogo format. (B) The predicted motifs were tested by EMSA. Incubation of biotin-labeled probes P1, P2, P3, and P4 with full-length TRP47 resulted in a shifted band complex C TRP47 for P4 (lane 8 in left panel and lane 2 in right panel, marked by arrows). The relative density of the shifted band was diminished with addition of 100 and 250-fold molar excess of unlabeled competitor DNA (right panel, lane 3 and 4 respectively). This band was also reduced with addition of anti-TRP47 antibody, but accompanied by a supershifted band complex SC TRP47 (lane 5, marked by arrow). These results suggest that P4 contains a specific TRP47 DNA-binding motif.
https://doi.org/10.1371/journal.pone.0205983.g008 MEME-ChIP analysis using TRP47 peaks returned many statistically significant candidate TRP47 binding motifs that were rich in guanine and cytosine nucleobases, similar to TRP120 and TRP32 motifs. Compared to the shifted bands seen in EMSA with TRP47 and gDNA as well as the other TRPs and their respective positive probes, the shifted bands we observed in multiple EMSAs using probe P4 with TRP47 were weaker in intensity. One possible reason for the relatively low binding affinity for this probe is that only a single iteration of the TRP47 motif is present and a repeat is required for strong protein-DNA interaction. This is supported by the fact that both the TRP120 and TRP32 probes that demonstrated strongest interactions in EMSA had two 6-nucleotide inverted repeats in close proximation, a feature that has also been reported in DNA-binding sites of other transcription factors such as EBF [36]. The positive EMSA result for probes containing the discovered TRP47 motifs helps validate the ChIPseq peaks and affirm that they represent true TRP47 binding sites.
In this study, we identified an additional ehrlichial effector that is likely involved in direct transcriptional regulation of host genes relevant to infection. We demonstrated that TRP47 translocation to the nucleus involves a MYND domain, where it binds host DNA, and host targets include genes involved in signal transduction, vesicle trafficking, and immune response. Further research will evaluate direct modulation of expression of target genes by TRP47 and elucidate the mechanisms and pathological significance of regulation of host genes by ehrlichial nucleomodulins.
Supporting information S1 Table. Oligonucleotide primers used to create E. chaffeensis TRP47 expression constructs. (DOC) S2 Table. Ontology terms identified for TRP47, TRP32, and TRP120. Significantly enriched terms within the GO Biological Process, GO Cellular Component, GO Molecular Function, MSigDB Pathway, and PANTHER Pathway ontologies identified by GREAT for each TRP are presented. In the TRP columns, the lowest p value peak set which returned the term association is shown. The p value cutoff used varied by TRP to provide similar numbers of peaks: p < 10 −25 for TRP47, p < 10 −4 for TRP32, and p < 10 −10 for TRP120. Enriched gene ontology terms for Ank200 were reported previously [8]. (XLSX) S3 Table. Target genes identified for TRP47, TRP32, TRP120, and Ank200. Genes that were identified as significant targets for at least one E. chaffeensis nucleomodulin are listed. In each TRP column, the lowest p value peak nearest to the associated gene TSS and the distance to TSS is shown. In the Ank200 column, the FDR value of the previously identified gene interaction [8] is shown. In the Induced or Repressed column, results from a previous study of differential gene expression during E. chaffeensis infection [3] are shown. (XLSX)