Effects of Argonaute on Gene Expression in Thermus thermophilus

Background
Eukaryotic Argonaute proteins mediate RNA-guided RNA interference, allowing both regulation of host gene expression and defense against invading mobile genetic elements. Recently, it has become evident that prokaryotic Argonaute homologs mediate DNA-guided DNA interference, and play a role in host defense. Argonaute of the bacterium Thermus thermophilus (TtAgo) targets invading plasmid DNA during and after transformation. Using small interfering DNA guides, TtAgo can cleave single and double stranded DNAs. Although TtAgo additionally has been demonstrated to cleave RNA targets complementary to its DNA guide in vitro, RNA targeting by TtAgo has not been demonstrated in vivo.

Methods
To investigate if TtAgo also has the potential to control RNA levels, we analyzed RNA-seq data derived from cultures of four T. thermophilus strain HB27 variants: wild type, TtAgo knockout (Δago), and either strain transformed with a plasmid. Additionally we determined the effect of TtAgo on expression of plasmid-encoded RNA and plasmid DNA levels.

Results
In the absence of exogenous DNA (plasmid), TtAgo presence or absence had no effect on gene expression levels. When plasmid DNA is present, TtAgo reduces plasmid DNA levels 4-fold, and a corresponding reduction of plasmid gene transcript levels was observed. We therefore conclude that TtAgo interferes with plasmid DNA, but not with plasmid-encoded RNA. Interestingly, TtAgo presence stimulates expression of specific endogenous genes, but only when exogenous plasmid DNA was present. Specifically, the presence of TtAgo directly or indirectly stimulates expression of CRISPR loci and associated genes, some of which are involved in CRISPR adaptation. This suggests that TtAgo-mediated interference with plasmid DNA stimulates CRISPR adaptation.


Introduction
Argonaute proteins (Agos) have long been known as key players in eukaryotic RNA interference (RNAi) pathways, in which eukaryotic Ago (eAgo) uses a small single-stranded (ss)RNA guide to target ssRNA molecules (reviewed in [1][2][3]). While many RNAi pathways regulate host gene expression by targeting mRNAs, some RNAi pathways are involved in host defense (reviewed in [4][5][6]). In these pathways, Agos interfere with RNA transcripts from viruses or transposons, or with RNA viruses directly.
Prokaryotes also encode Agos (pAgos), but none of the additional proteins involved in canonical RNAi pathways [7][8][9][10]. Recently, it has become clear that pAgos are involved in mediating host defense, but in contrast to eAgos, they target DNA rather than RNA [11,12]. One of the best studied pAgos is that of Thermus thermophilus (TtAgo), which has been characterized structurally and biochemically [12][13][14][15][16]. T. thermophilus is a gram-negative thermophilic bacterium that is used as a model organism for genetic transformation, biotechnological applications and structural biology. T. thermophilus strain HB27 has a 1.9 Mb chromosome encoding 1988 genes (with TTC# tag) and harbors a 232 kb mega-plasmid designated pTT27, encoding 230 genes (with TT_P# tag).
In contrast to RNA-guided eAgos, TtAgo has been demonstrated to utilize DNA guides in order to cleave single stranded RNA (ssRNA), single stranded DNA (ssDNA) and double stranded DNA (dsDNA) targets in vitro [12][13][14][15][16]. This allows TtAgo to directly interfere with invading DNAs, lowering plasmid transformation efficiencies and intracellular plasmid content [12,17]. As TtAgo preferentially acquires guides from plasmid DNA [12], and it is able to cleave RNA targets in vitro, it was predicted that TtAgo also interferes with plasmid transcripts [12]. This would suggest a dual-function of TtAgo, both in defense and in gene regulation, which is akin to eAgos [1][2][3] and prokaryotic CRISPR-Cas [18]. However, gene expression of T. thermophilus has not yet been investigated in strains in which invading DNA in the form of a plasmid was present.
Here, we describe the analysis of a new RNA-seq dataset derived from T. thermophilus strains HB27 and HB27Δago harboring plasmid pMKPnqosGFP [19]. Although the presence of TtAgo or plasmid DNA itself does not strongly affect gene expression, the presence of both results in decreased quantities of plasmid-encoded RNA transcripts and increased expression of specific genomic genes.

Results and Discussion
We included previously obtained RNA-seq data from HB27 and HB27Δago [12] in our analysis in order to compare them with the new data from HB27 + plasmid (HB27+P) and HB27Δago + plasmid (HB27Δago+P). The latter two strains were grown in medium containing kanamycin, selecting for plasmid maintenance. RNA was purified from triplicate log phase (OD 600 nm of 0.5) cultures of HB27, HB27Δago, HB27+P and HB27Δago+P (Fig 1A). For each condition, RNA from biological triplicates was purified, sequenced and mapped, and for each gene the abundance was calculated as Fragments Per Kilobase of exon per Million fragments mapped (FPKM). RNA levels are considered to be changed significantly when the FPKM value of a set of biological triplicates differed from the FPKM value of another set of biological triplicates with P<0.05. We considered changes in RNA levels biologically relevant if FPKM averages of biological triplicates differed >4-fold from FPKM averages of another set of biological triplicates, while smaller changes were considered stochastic. Using Prodigal 2.6 [20], 35 new open reading frames were identified, of which 22 were located on the HB27 chromosome (tagged TTCX01-TTCX22) and 13 on the mega-plasmid pTT27 (TTPX01-TTPX13). Of these new genes, 15 encode proteins of which the function can be predicted based on (partial) homology to other proteins. Furthermore, 16 encode proteins that show (partial) similarity to hypothetical proteins, whereas four encode proteins that share no significant similarity to other proteins in the current NCBI database. The open reading frames and predicted functions of the proteins they encode are listed in S1 Table.

The absence of TtAgo results in small stochastic changes in T. thermophilus gene expression
The ago knockout in T. thermophilus strain HB27 has previously been demonstrated to result in small pleiotropic changes in gene expression (<4-fold change for most genes) [12], and this was confirmed in our new analyses of the same dataset (S2 Table). Stochastic changes in gene expression include 59 genes which are >2-fold up-regulated and 35 genes which are >2-fold down-regulated in HB27Δago compared to HB27 (S2 Table). Besides these small differences, >4-fold change in expression was observed for specific genes (S2 Table). As expected, we observe no expression of the gene encoding TtAgo (TT_P0026) in Δago strains, and low levels of TtAgo expression in wild type strains (FPKM<150; Fig 1B). In agreement with this observation, evidence for expression of (strep(II)-tagged) TtAgo protein encoded by the knock-in gene at the same genomic location has previously been demonstrated [12]. The ago knockout resulted in 3 to 6-fold lower RNA levels mapped against genes located near and on the same strand as ago (Fig 1B). These changes are most likely polar effects caused by ago deletion. RNA mapped against two other genes is lowered 3-fold and 5-fold in HB27Δago: TTC1213 (1-pyrroline-5-carboxylate dehydrogenase) and TTC1241 (predicted acyl-amino acid-releasing enzyme). These genes, as well as the genes located near ago on the genome, are also downregulated in HB27Δago+P compared to HB27+P. In addition, a predicted operon encoding a branched-chain amino acid transport system (TTC0333-TTC0343) appears up-regulated (3 to 5-fold increase in RNA levels) in HB27Δago compared to HB27. This operon encodes a system homologous to the Liv ABC transporter system, which transports the amino acids leucine, isoleucine, valine, threonine and alanine in an ATP dependent manner. The same set of genes is only moderately up-regulated (most genes <2-fold change) in HB27+P compared to HB27Δago+P. A functional link between these genes and the ago knockout is not obvious. The levels of these RNAs are affected in both HB27Δago and HB27Δago+P, suggesting that TtAgo affects these RNA levels directly or indirectly.

Comparison of T. thermophilus HB8Δago and HB27Δago
A recent publication describes the differences in RNA expression between T. thermophilus strains HB8 (Fig 1C) and HB8Δago [21]. The chromosomes of T. thermophilus strains HB8 and HB27 are highly conserved, while their mega-plasmids pTT8 and pTT27, which encode ago and most CRISPR-Cas related genes, show a higher degree of divergence [22]. RNA was purified from log-phase cultures in both studies, but the growth medium used for HB8 cultivation [21] is slightly different from the medium we used for HB27 cultivation (S3 Table). We compared the genes from HB27Δago and HB8Δago of which corresponding RNA levels changed >2-fold compared to the corresponding wild type strains. We found no clear correlation between the affected genes in both strains (Fig 1D and S3 Table). None of the genes of which expression changed >4-fold in HB8Δago were found to be differentially expressed in HB27Δago (S3 Table).
Given that Ago proteins interact with guides to bind specific complementary targets [10,23], it would be expected that TtAgo strongly affects levels of specific RNAs. As the chromosomes of HB8 and HB27 are very similar, while the specific RNA levels changed in HB8Δago and HB27Δago vary greatly, it seems unlikely that TtAgo targets specific RNAs. Instead, our analysis suggests that observed differences in RNA levels are stochastic, and thus unlikely to be caused by guided TtAgo activity. The observation that TtAgo does not influence the transcription of genes involved in competence or host defense suggests that TtAgo only interferes with the invading DNA directly. In a recent study the competence of HB27 and HB27Δago has been compared during natural transformation experiments and during cell-to-cell conjugation experiments with genomic T. thermophilus DNA. It was found that TtAgo does interfere with natural transformation, but not with cell-to-cell conjugation [17]. As the genes required for natural transformation are also essential for cell-to-cell conjugation, this excludes a possible indirect effect of TtAgo via regulation of competence gene expression.

Presence of plasmid DNA results in changes in gene expression only if TtAgo is present
To investigate the effect of the presence of plasmid DNA on gene expression, we compared RNA isolated from HB27Δago to that from HB27Δago+P. No significant (P<0.05) >4-fold changes in RNA levels were observed. Furthermore, the presence of plasmid DNA did not result in significant (P<0.05) >2-fold change in expression of host-defense genes (S2 Table). Combined, these data suggest that presence of invading nucleic acids in the form of plasmid DNA does not result in differential gene expression in HB27Δago. This contrasts with the presence of another invader, lytic phage ϕYS40 [24] in HB8, which results in up-regulation of a plethora of host defense genes. These genes encode (amongst others) TtAgo, the T. thermophilus Type I-E (not encoded in HB27), Type III-A and Type III-B CRISPR-Cas systems, as well as multiple other Cas genes scattered over the HB8 genome [24]. In summary, although phage infection triggers host defense response pathways in T. thermophilus, the presence and replication of plasmid DNA does not trigger host defense responses. This is presumably because defense pathways are costly to induce [25], and are most beneficial in the context of parasitic infections [26], such as by lytic phages. In contrast, plasmids are far less detrimental to the host and often confer a fitness benefit [27], making it unnecessary to induce these pathways during plasmid invasion.
In contrast, when comparing WT (Ago-encoding) strains with and without plasmid DNA, we observed significant (P<0.02) >4-fold increase of RNA levels mapped to specific genes (HB27+P compared to HB27; Table 1, S2 Table). Corresponding genes, difference in RNA levels, motifs and predicted functions of the proteins they encode are listed in Table 1 and Fig 2. There seems to be no clear link between the functions of the up-regulated genes. Interestingly however, many of these genes, especially genes up-regulated >5-fold, are located directly downstream of and in the same orientation as various CRISPR loci (Fig 2A and Table 1). Predicted gene TTPX09 is located in a CRISPR locus, and is unlikely to encode a functional protein. Furthermore, two putative transposases (TT_P0099 and TTC1169), of which expression appears up-regulated, are located near CRISPR loci on the genome, but in reverse orientation. TT_P0211 and TTPX12 are located directly downstream of each other in a predicted operon. These and three other genes located on the chromosome (TTC0310, TTC0311 and TTC0399) appear to have no link with CRISPR loci. As we observe only elevated RNA levels under these conditions, it is highly unlikely that TtAgo interferes with RNA, as this would lower RNA levels. Nevertheless, the fact that these genes are up-regulated only under conditions where both TtAgo and plasmid DNA are present suggests that TtAgo directly or indirectly influences expression of these genes.

Combined presence of plasmid DNA and TtAgo results in up-regulation of crRNA expression
As many genes that are up-regulated in HB27+P are located on the genome near CRISPR loci, we further investigated expression of cas genes and CRISPR loci. Mega-plasmid pTT27 encodes complete Type I-C, III-A and III-B CRISPR-Cas systems (Fig 2A), and multiple scattered cas genes (two cas1, two cas2, one cas4, one cas6 and a cas3 gene with an internal frameshift (TTPX10 and TTPX11); Fig 2A and S1 Table). pTT27 additionally encodes eight CRISPR arrays (Fig 2). The HB27 chromosome encodes two cas1 genes and a cas6 gene, as well as two CRISPR loci (Fig 2). Besides TT_P0101 (encoding Cas2), no cas genes appear differentially expressed (S2 Table). This is striking, as the up-regulated cas2 is located directly upstream of the predicted operon encoding the Type III-A CRISPR-Cas system (Fig 2A).
To investigate expression of CRISPR RNA (crRNA) from CRISPR loci, we used a dataset containing only reads that are partially complementary to CRISPR repeats (S4 Table). For most CRISPR loci, expression of crRNA is highest at the leader-proximal end of the CRISPR locus, and gradually lowered towards the leader-distal end of the CRISPR locus (S4 Table). This observation agrees with the leader harboring the promoter for crRNA expression [30,31]. When comparing crRNA expression in the different strains, expression of crRNAs encoded by eight CRISPR loci is strongly up-regulated in HB27+P compared to HB27 (Fig 2B and S4 Table). As some of the genes mentioned in Table 1 are located directly downstream of CRISPR loci, it appears that the expression of these genes and the presence of a CRISPR locus are linked. This suggests that either these genes are expressed from the same promoter (read-through), or alternatively that they, and possibly other up-regulated genes that are not located directly downstream of CRISPR loci, are under control of the same transcriptional regulator as the up-regulated CRISPR loci. As Cas2 and CRISPR leader sequences play essential roles in the acquisition of CRISPR-Cas-mediated immunity (reviewed in [30,31]), increased expression of Cas2 and crRNAs could imply that CRISPR adaptation is activated. To investigate if TtAgo enhances CRISPR adaptation, we analyzed CRISPR loci for integration of new spacers. We used a PCR-based method that previously has been demonstrated to identify spacer integration in E. coli cultures if at least 0.4% of the culture integrated a spacer in the amplified CRISPR locus [32]. However, no new spacers were detected, even when cultures were grown in the absence of antibiotics (S1 Fig). This suggests either that under the tested conditions CRISPR adaptation is not stimulated, or alternatively that CRISPR adaptation does not confer a benefit to the host (i.e. clones with novel spacers do not increase in frequency and therefore remain undetectable), which is supported by theoretical predictions that costly acquired immunity is not likely to evolve against parasites with low virulence [26].

TtAgo interferes with invader DNA but not with invader-encoded RNA
Besides the effect on genome-encoded gene expression, TtAgo has a clear effect on plasmid DNA and plasmid-encoded gene expression (Fig 3 and S5 Table). It has previously been shown that TtAgo interferes with intracellular plasmids, resulting in 3 to 5-fold higher plasmid contents in HB27Δago compared to HB27, even when the cultures were grown under conditions selecting for plasmid maintenance [12]. We determined plasmid pMKPnqosGFP content at the time at which the RNA was isolated (OD 600 nm of 0.5) for strains HB27+P and HB27Δago+P (Fig 3A and S5 Table). These cultures were grown in presence of kanamycin, selecting for pMKPnqosGFP maintenance. In line with previous observations [12], intracellular plasmid content was significantly (P<0.05) lowered ~4-fold in wild type HB27 compared to HB27Δago (Fig 3B), confirming that TtAgo interferes with intracellular plasmid DNA. Furthermore, we observed 2.4 to 3.8-fold lower levels of plasmid-encoded RNA in the HB27+P strain compared to the HB27Δago+P strain (Fig 3C and S5 Table). Thus, in contrast to genome-encoded RNAs, plasmid-encoded RNAs are lowered in the presence of TtAgo. The ~4-fold lower plasmid content itself can explain the 2.4 to 3.8-fold decrease of plasmid-encoded RNA, as there are fewer plasmid copies available for RNA expression. Thus, although DNA-guided TtAgo has been shown to cleave both ssDNA and ssRNA targets in vitro [12][13][14][15], these data suggest that in vivo TtAgo solely interferes with plasmid DNA and not with plasmid-encoded RNA.
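The reasoning that lower plasmid copy number alone can account for the lower transcript levels can be made explicit with a small calculation. The following is a minimal sketch, assuming that transcript abundance scales linearly with plasmid copy number; the function name and this linearity assumption are illustrative and not part of the original analysis. It uses only the fold changes reported above (~4-fold for plasmid DNA, 2.4 to 3.8-fold for plasmid-encoded RNA).

```python
def per_copy_rna_ratio(dna_fold_reduction, rna_fold_reduction):
    """Transcript level per plasmid copy in wild type relative to the knockout.

    Both arguments are fold reductions (knockout / wild type). A value near or
    above 1 means the drop in RNA is no larger than the drop in plasmid DNA,
    i.e. there is no sign of additional RNA-level interference.
    Assumes (hypothetically) that RNA output is proportional to copy number.
    """
    if dna_fold_reduction <= 0 or rna_fold_reduction <= 0:
        raise ValueError("fold reductions must be positive")
    return dna_fold_reduction / rna_fold_reduction


plasmid_fold = 4.0  # approximate plasmid DNA reduction reported in the text
for rna_fold in (2.4, 3.8):  # range of plasmid-encoded RNA reductions
    ratio = per_copy_rna_ratio(plasmid_fold, rna_fold)
    print(f"RNA lowered {rna_fold}-fold: per-copy RNA ratio {ratio:.2f}")
```

Under these assumptions the per-copy ratios stay at or above 1 across the reported range, which is what would be expected if TtAgo acted on plasmid DNA only; a ratio clearly below 1 would instead point to an additional effect on RNA.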
Bacterial Ago from Rhodobacter sphaeroides (RsAgo) associates with small RNAs and DNAs derived from extracellular sources such as plasmids, transposons and phages [11]. We therefore analyzed the effect of TtAgo on RNAs encoded by transposases (S6 Table). We observed no higher expression of transposase genes in HB27Δago strains. In contrast, compared to HB27, in HB27+P levels of RNA mapped against predicted transposases TTC1169, TT_P0099 and TT_P0211 are significantly (P<0.02) increased (12.6-fold increase, 4.3-fold increase, and 6.6-fold increase, respectively). The fact that these differences were only observed if both TtAgo and the plasmid are present, suggests that transposase expression is induced under these specific conditions only. As mentioned above, TTC1169 and TT_P0099 are located near CRISPR loci on the genome, suggesting their increased expression is a result of up-regulation of these CRISPR loci. As we only analyzed RNA levels, and because TtAgo appears not to be interfering with RNA directly (see above), we cannot rule out that TtAgo interferes with transposons at the DNA level (for example during the extrachromosomal step of their life-cycle).

Conclusions
Only small stochastic changes in gene expression are observed when comparing wild type Thermus thermophilus and the derived Δago mutant. This implies that TtAgo, in contrast to eAgos, is not involved in regulation of gene expression. In agreement with previous observations [12,17], TtAgo lowers intracellular plasmid DNA levels, even under selective conditions for plasmid propagation. This results in decreased plasmid DNA levels that still allow for survival in the presence of kanamycin while lowering the metabolic burden of high copy number plasmids. Earlier work showed that TtAgo preferentially acquires DNA guides complementary to plasmid DNA and/or plasmid-encoded RNA [12], and demonstrated that TtAgo can cleave both DNA and RNA in vitro [12,15]. To investigate the effect of TtAgo on plasmid DNA and on plasmid-encoded RNA, we analyzed new RNA-seq data derived from T. thermophilus strains harboring a plasmid. Compared to the Δago strain, we observed lowered plasmid DNA levels and accordingly lowered levels of plasmid-encoded RNA in the wild type strain. Strikingly, we observed no further reduction of plasmid-encoded RNA. This suggests that TtAgo does not directly target RNA in vivo, making it a strict DNA-guided DNA-interfering host defense system. Furthermore, the presence of plasmid DNA itself does not result in up-regulation of host defense genes. This suggests that, unlike phage infection, the presence of plasmid DNA is (at least under the used conditions) not registered as a threat. T. thermophilus requires a host defense system that is able to distinguish invader DNA from genomic DNA. While CRISPR-Cas systems require incorporation of spacers before being able to target invaders, TtAgo specifically interferes with plasmid DNA without being dependent on genomic-encoded information about the invader.
The observation that the combined presence of TtAgo and plasmid DNA correlates with up-regulation of various CRISPR loci and at least part of the CRISPR adaptation machinery suggests that TtAgo-mediated plasmid interference stimulates CRISPR adaptation. Although pAgos and CRISPR-Cas systems sometimes co-occur, often only one of these defense systems is encoded by a genome [9]. This suggests that these systems function independently. Nevertheless, there are rare examples where the gene encoding pAgo co-localizes with Cas1 and Cas2 (for example in Methanopyrus kandleri) or Cas4 (multiple pAgos) [9,10]. Cas1 and Cas2 are known to be essential for CRISPR adaptation [30,32,33]. Cas4 has also been predicted to be involved in CRISPR adaptation, as it forms complexes with Cas1 and Cas2 [34] and is additionally fused to Cas1 in several Type I CRISPR-Cas systems [29]. As cas genes and pAgo do not strictly co-occur, we hypothesize that pAgo itself is not directly involved in spacer adaptation, but that pAgo-mediated plasmid interference indirectly stimulates CRISPR adaptation. For example, pAgos might generate plasmid DNA degradation products that somehow stimulate expression of genes involved in CRISPR adaptation. Acquisition of new spacers, stimulated by TtAgo, would make future generations resistant against the invader by CRISPR-Cas-mediated defense. An additive effect of two host defense systems (a restriction-modification system and a CRISPR-Cas system) on total resistance levels has recently been reported [35]. Combined with the observation that TtAgo lowers plasmid concentrations even under conditions selecting for plasmid maintenance, this makes TtAgo a valuable addition to the current arsenal of host defense systems.

Strains
T. thermophilus HB27 (ATCC BAA-163, DSM7039 and NBRC101085), which is referred to in this manuscript as HB27 or wild type, and the TtAgo-encoding gene knockout strain HB27Δago [12] were used for the studies described in this manuscript (Fig 1).

Transformations
T. thermophilus strains were transformed with plasmid pMKPnqosGFP [19] as described previously [12]. Colonies were selected and cultivated overnight at 65°C in 20 mL TTH medium [12] in a shaker incubator. 1 mL aliquots were prepared from the overnight cultures in 1.5 mL Eppendorf tubes which were centrifuged in a table top centrifuge at 6,000 rpm for 10 min. Supernatant was removed and cell pellets were stored at -20°C.

RNA sequencing
T. thermophilus strains with and without plasmid pMKPnqosGFP were cultivated in triplicate as described previously [12]. Growth medium was supplemented with 30 μg/mL kanamycin for cultures harboring pMKPnqosGFP. When cultures reached an OD 600 nm of 0.5, RNA was purified using the mirVana RNA isolation kit (Ambion) as described previously [12]. Purified RNA from these biological triplicates was sequenced by BaseClear BV using Illumina sequencing.

RNA-seq analysis
The T. thermophilus genome was re-annotated using the in-house annotation platform SAPP (Koehorst et al., submitted). Reads of all experiments were mapped against the T. thermophilus genome (consisting of the HB27 chromosome and pTT27 mega-plasmid) and plasmid pMKPnqosGFP. For the identification of noise, reads of all experiments, including those from strains lacking the pMKPnqosGFP plasmid, were mapped against the entire T. thermophilus genome and corresponding plasmids. Differential expression analysis was performed using the Trinity package in combination with RSEM [36].
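For reference, the FPKM measure used throughout the Results can be written out explicitly. The following is a minimal sketch of the textbook definition only; it is not the RSEM implementation used in this study, which estimates abundances with its own statistical model. The function and argument names are illustrative assumptions.

```python
def fpkm(fragments_on_gene, gene_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of exon per Million fragments mapped.

    fragments_on_gene      : fragments assigned to the gene
    gene_length_bp         : exonic length of the gene in base pairs
    total_mapped_fragments : all mapped fragments in the sample
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and total fragments must be positive")
    # Dividing by (length / 1e3) and (total / 1e6) equals multiplying by 1e9.
    return fragments_on_gene * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical numbers, chosen only to illustrate the scaling:
# 500 fragments on a 1,000 bp gene in a sample of 5 million mapped fragments.
example = fpkm(500, 1_000, 5_000_000)
print(f"FPKM = {example:.1f}")
```

Normalizing by both gene length and sequencing depth is what allows FPKM values to be compared between genes of different size and between samples sequenced to different depths.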

crRNA analysis
For the analysis of the crRNAs the CRISPR cassettes were predicted using the CRT prediction module in SAPP (Koehorst et al., submitted). The corresponding regions of the CRISPR cassettes were extracted and analyzed in combination with the gene sequences using the trinity package. To improve mapping, repeat regions were trimmed.

Analysis of CRISPR loci
Triplicate HB27 and HB27Δago cultures with or without plasmid pMKPnqosGFP were cultivated in medium with and without antibiotics to an OD 600 nm of 0.5, after which genomic DNA was purified using the JGI 'bacterial genomic DNA isolation using CTAB' protocol [37]. Short stretches of each CRISPR locus, encompassing at least a part of the leader sequence and the first spacer-repeat unit, were PCR amplified (for primers see S7 Table), and resolved on 2% agarose gels. Gels were stained with SYBR Safe Nucleic Acid Stain (Invitrogen) and nucleic acids were visualized using a G:BOX Chemi imager. A comparable method has previously been demonstrated to detect CRISPR adaptation if at least 0.4% of the culture obtained new spacers [32].

Plasmid content analysis
For purification of complete DNA (containing both genomic and plasmid DNA), T. thermophilus HB27 and HB27Δago transformed with pMKPnqosGFP were cultivated in triplicate to an OD 600 nm of 0.5. One OD 600 nm unit was harvested and complete DNA was isolated using the JGI 'bacterial genomic DNA isolation using CTAB' protocol [37]. 1 mg DNA of each purification was resolved on 0.8% agarose gels, stained with SYBR Safe Nucleic Acid Stain (Invitrogen), visualized using a G:BOX Chemi imager and analyzed using GeneTools analysis software (Syngene).

Statistical analysis
For the calculation of P values of differences in expression levels of specific genes, FPKM values of the biological triplicates of each strain were used as input. P values stated in this manuscript were calculated using a two-tailed, two-sample t-test assuming equal variances.
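The test described above, combined with the >4-fold criterion used in the Results, can be sketched as follows. This is a minimal illustration in plain Python, not the software used for the published analysis; the function names, the numerical integration of the t-distribution and the example FPKM values are all assumptions made for the sake of a self-contained example.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value, integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def equal_variance_t_test(a, b):
    """Two-sample t-test with pooled variance; returns (t statistic, two-tailed P)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    if se == 0:
        raise ValueError("zero variance in both groups; t statistic undefined")
    t = (mean(a) - mean(b)) / se
    return t, two_tailed_p(t, df)


def is_relevant_change(a, b, alpha=0.05, min_fold=4.0):
    """Apply both criteria: P < alpha and a > min_fold change of mean FPKM."""
    fold = mean(b) / mean(a)
    _, p = equal_variance_t_test(a, b)
    return p < alpha and (fold > min_fold or fold < 1 / min_fold)


# Hypothetical FPKM triplicates for one gene in two strains.
reference = [10.0, 12.0, 11.0]
treated = [50.0, 55.0, 52.0]
print(equal_variance_t_test(reference, treated))
print(is_relevant_change(reference, treated))
```

With triplicates in both groups the test has four degrees of freedom, so a single outlying replicate has a large influence on the pooled variance; this is one reason the fold-change threshold is applied in addition to the P value.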