m6A RNA methylation facilitates pre-mRNA 3’-end formation and is essential for viability of Toxoplasma gondii

Toxoplasma gondii is an obligate intracellular parasite that can cause serious opportunistic disease in the immunocompromised or through congenital infection. To progress through its life cycle, Toxoplasma relies on multiple layers of gene regulation that includes an array of transcription and epigenetic factors. Over the last decade, the modification of mRNA has emerged as another important layer of gene regulation called epitranscriptomics. Here, we report that epitranscriptomics machinery exists in Toxoplasma, namely the methylation of adenosines (m6A) in mRNA transcripts. We identified novel components of the m6A methyltransferase complex and determined the distribution of m6A marks within the parasite transcriptome. m6A mapping revealed the modification to be preferentially located near the 3’-boundary of mRNAs. Knockdown of the m6A writer components METTL3 and WTAP resulted in diminished m6A marks and a complete arrest of parasite replication. Furthermore, we examined the two proteins in Toxoplasma that possess YTH domains, which bind m6A marks, and showed them to be integral members of the cleavage and polyadenylation machinery that catalyzes the 3’-end processing of pre-mRNAs. Loss of METTL3, WTAP, or YTH1 led to a defect in transcript 3’-end formation. Together, these findings establish that the m6A epitranscriptome is essential for parasite viability by contributing to the processing of mRNA 3’-ends.

secondary structures by lessening the strength between A:U base pairs in mRNAs [14,15], the majority of m6A-directed activities stem from the direct recognition of the methylated nucleotide by RNA-binding proteins, termed m6A readers. The best characterized readers are YTH domain-containing proteins, though additional proteins have also been demonstrated to recognize the m6A mark [7][8][9]12,13]. Selective removal of the m6A mark occurs through the erasers, Fe(II)/α-ketoglutarate-dependent dioxygenases in the protein ALKBH family [7,9,12].
Phylogenetic evidence indicates that the m6A epitranscriptome system is present in each of the eukaryotic kingdoms, including many animals, plants, fungi, and the Alveolate protozoa, including Plasmodium spp. and Toxoplasma [8]. Here, we systematically investigate each identifiable member of the m6A machinery in Toxoplasma and survey the transcriptome-wide prevalence of m6A in tachyzoites and in parasites undergoing bradyzoite formation. We determine that writer functionality is essential for parasite viability, identify a parasite-specific member of the writer complex, discover that the m6A mark is localized in close proximity to the 3'-end of transcripts, and demonstrate that the YTH-family reader proteins constitute part of the machinery involved in processing the cleavage and polyadenylation of transcripts at their 3'-termini. Induced knockdown of either writer or reader components led to a defect in the normal 3'-end formation of mRNAs. Together, these findings establish that m6A is a critical mRNA modification that is widespread across multiple stages of the Toxoplasma life cycle and relies on parasite-specific components that may serve as attractive drug targets.

The m6A mark is present throughout asexual Toxoplasma stages
To determine whether m6A is present in Toxoplasma, we first performed immunofluorescent assays using an anti-m6A antibody. Examination of extracellular tachyzoites (ME49 strain) revealed an enriched signal in the cytoplasm that appeared in a slightly punctate pattern ( Fig  1A, panel i). A corresponding no antibody control, which was performed at the same time, indicated that the staining was specific (Fig 1A, panel ii). We also examined intracellular tachyzoites and bradyzoites generated by treating parasites with alkaline media for 6 days. The m6A mark was primarily localized to the cytoplasm in both of these life stages (Fig 1A, panels iii and iv). An anti-m6A dot blot performed on total RNA and genomic DNA extracted from purified tachyzoites revealed an enrichment of m6A in parasite RNA ( Fig 1B). The quality of the parasite purification was assessed by the lack of a detectable host-specific 28S rRNA band in the RNA sample, indicating that very little to no contaminating host nucleic acids were present in the sample. Together, these data demonstrate that m6A is a feature of Toxoplasma RNA.

Purification of the Toxoplasma m6A writer complex reveals unique features
The components of the m6A writer complex, which consists of the catalytic METTL3, the catalytically inactive METTL14, and the associated protein WTAP are conserved in Toxoplasma based on BLASTP homology searches (Table 1). It should be noted that WTAP is currently annotated as two separate genes on ToxoDB, but it is likely to be a single transcript (see next section). Of importance, each of three of the Toxoplasma m6A writer orthologues are critical for parasite fitness according to a genome-wide CRISPR screen (Table 1) [16], suggesting that adenosine methylation of mRNA is important and required for tachyzoite growth and replication.
We tagged each of identified m6A writer proteins at their endogenous C-termini with an HA epitope in RH strain parasites lacking KU80 and HXGPRT genes (RHΔΔ) using single Toxoplasma RNA is modified with the m6A mark. A) Immunofluorescence assays show m6A-specific marks enriched in the cytoplasm of i) extracellular tachyzoites compared to ii) a control without primary antibody. The m6A mark is also present in the cytoplasm of iii) intracellular tachyzoites and iv) stress-induced bradyzoites (cyst wall is stained with DBA). Scale bars represent 5 μm. B) Toxoplasma genomic DNA and differing amounts of RNA, as indicated, were immunoblotted for the m6A mark. SYBR Gold staining serves as a loading control. https://doi.org/10.1371/journal.ppat.1009335.g001

PLOS PATHOGENS
crossover recombination [17,18]. Consistent with observations from other eukaryotes and with their proposed role in mediating co-transcriptional mRNA methylation [9], each of the writer proteins were predominantly localized in the parasite nuclei (Fig 2A). Whereas METTL3 HA and WTAP HA were easily detected, the METTL14 HA signal was comparatively faint and required more sensitive exposure settings for adequate visualization. A western blot of each tagged line revealed a single band of the expected size; the data also support that WTAP is misannotated on ToxoDB and should be a single gene (Fig 2B). We also note that while WTAP and METTL3 proteins are expressed in similar amounts, METTL14 is significantly expressed at substoichiometric levels, consistent with the diminished detection in the immunofluorescence microscopy (Figs 2B and 3C). Whether this result is indicative of a potential regulatory function for METTL14 as a limiting factor for the writer complex remains to be determined. METTL3, METTL14, and WTAP are widely conserved amongst eukaryotes as a "core" complex that delivers the m6A mark, but accessory proteins involved in this writer complex can differ among species. There are no apparent orthologues of any accessory proteins from other organisms in Toxoplasma based on sequence homology. To validate the associations between Toxoplasma METTL3 HA , METTL14 HA , and WTAP HA , and discover possible accessory writer proteins, we immunoprecipitated each of the HA-tagged proteins from nuclear extracts of freshly lysed tachyzoites and identified co-purifying proteins by mass spectrometry. Both METTL3 HA and METTL14 HA were robustly pulled down together in every experiment (Table 2). Interestingly, although WTAP was not enriched in either METTL3 HA or METTL14 HA pulldowns, a few peptides were present in one experiment of each (Tables 2 and S1).
In reciprocal pull down experiments, neither METTL3 nor METTL14 were found in WTA-P HA immunoprecipitates. Rather, WTAP strongly associated with two hypothetical proteins, TGGT1_226660 and TGGT1_275990. Although neither protein possesses any annotated domains, homology searches conducted on the HHPRED webserver [19] revealed a region in the N-terminus of TGGT1_226660 that displayed homology with the PF15912 domain corresponding to the N-terminal domain of VIRMA, a well-documented component of the m6A writer complex that helps guide the writer complex to the stop codon and 3'-UTR in higher eukaryotes [6,20]. Interestingly, when performing orthology searches of TGGT1_275990, we discovered that it is restricted to coccidian parasites, suggesting that this m6A writer-associated protein may be a unique feature present in select apicomplexan parasites.

The m6A epitranscriptome in tachyzoites and early bradyzoites
In higher eukaryotes, deposition of m6A has been shown to be dynamic and required for the efficient progression of developmental and biological processes [6,8,12]. We therefore characterized the m6A epitranscriptome as Toxoplasma converts from tachyzoites to early bradyzoites. We performed m6A-enriched RNA sequencing using the MeRIPseq method [21] in parallel with RNAseq in ME49 tachyzoites grown under normal conditions and under alkaline pH for 24 hours, a stress commonly used to induce bradyzoite conversion in vitro [22]. RNAseq measurements from parasites subjected to this induction trigger showed an upregulation of 295 gene transcripts, including those encoding proteins associated with bradyzoites, such as the cyst wall antigens CST1 and MAG1, the stage-specific enzymes ENO1 and LDH2, and AP2 transcription factors associated with early bradyzoite development. Additionally, among the 19 downregulated genes were the canonical tachyzoite surface antigen SAG1 [23,24] (Fig 3A  and S2 Table). While it is important to underscore that these parasites are not mature bradyzoites, our results are consistent with the gene expression pattern expected of tachyzoites that have begun the conversion into bradyzoites.
To analyze our datasets for m6A-containing transcripts, we utilized the MeTPeak software package that has been tailored for detecting m6A-enriched regions in MeRIPseq experiments [25]. We detected 1,896 peaks spread across 866 genes in our tachyzoite dataset and 1,841 peaks amongst 687 genes during early bradyzoite formation (S3 Table). The majority of the m6A-containing transcripts are shared between the two conditions ( Fig 3B). We queried the MeRIPseq datasets for gene ontology enrichment of biological processes and molecular functions, however no terms were significantly enriched. We then used MeTDiff [26], which takes the MeRIPseq and RNAseq data from all conditions into account to call m6A-containing peaks, to assess differential m6A methylation between the two conditions. This process detected a total of 6,777 peaks spread across 3,272 genes, however none were differentially methylated after this 24 hour stress-induction period (S3 Table). These results show that m6A is an abundant modification on mRNAs encoding factors of diverse functions in both tachyzoites and parasites undergoing early bradyzoite formation.

m6A marks are enriched near the 3'-boundary of mRNAs
To determine whether the modified mRNAs contain a consensus sequence targeted for m6A methylation, we utilized the DREME tool [27] which revealed a highly enriched motif: YGCAUGCR, in which Y represents a pYrimidine and R represents a puRine. The YGCAUGCR motif was identified in both tachyzoites and early differentiating bradyzoites, as well as in a combination of all the peaks identified by MeTDiff (Figs 3C and S1). Variations on a second AU-rich motif were also reliably identified, though with less significance (discussed below, S1 Fig).
In other organisms, the distribution of m6A marks within the transcriptome are known to be enriched in specific transcript regions. For example, m6A marks tend to localize near the stop codon in mammals [8], within the CDS in Plasmodium falciparum [28], and within the polyA tail in Trypanosoma brucei [29]. In our experiment, the MeRIPseq protocol resulted in an enrichment of m6A-containing fragments localized towards the 3' end of transcripts (S2 Fig). A metagene plot of m6A peaks demonstrates that they are highly enriched in Toxoplasma 3'-UTRs (Fig 3D). A specific example of a methylated transcript, the hypothetical protein TGME49_202650, shows increased MeRIPseq signal within its 3'-UTR and has m6A peaks (denoted as black bars) in both tachyzoite and early bradyzoite samples ( Fig 3E). When we examined the annotated peaks of this gene for the presence of the m6A-associated YGCAUGCR motif, we found it located just outside the boundary of the called peaks near the 3'-end of the transcript (Fig 3E, white arrowhead). We also noted that the YGCAUGCR motif was located only 22 nt upstream from a sharp decline in the mapping of the RNAseq data ( Fig  3E, black arrowhead) suggestive of the 3'-boundary of the transcript.
We suspected that this m6A-associated motif may be systemically enriched near the 3'boundary of mRNAs but that m6A-enriched regions may be difficult to detect due to the sharp decline in sequencing reads near the polyadenylation site. To assess this, we used the DREME algorithm to unbiasedly identify motifs that were enriched within the 3'-UTRs. We limited our analysis to the 6,109 protein coding genes with 3'-UTRs that were least 9 nt long since the Components of Toxoplasma m6A writer complex. A) Immunofluorescence assay of endogenously HA-tagged WTAP, METTL3, and METTL14 subunits of the writer complex. Each protein is localized to the parasites' nuclei. The outlines of parasites are shown in the merge panel for reference. B) Western blot of endogenously HA-tagged writer complex members. C) Relative quantification of writer complex member expression as assessed by Western blot densitometry. Data are presented as average ± standard deviation normalized to WTAP HA signal. � represents p � 0.05 as assessed by Student's t-test assuming unequal variances. D) Schematic summarizing the Toxoplasma writer complex membership as identified by co-immunoprecipitation experiments outlined in Table 2.
https://doi.org/10.1371/journal.ppat.1009335.g002 Significance (e value) is displayed in a bar chart for sequencing data from tachyzoites (Tz), bradyzoite-induced sample (Bz) and a combination of all m6A-enriched sequences (All). The same motif was independently identified from all annotated 3'-UTRs in ToxoDB. D) Metagene plot showing average distribution of m6A density along a normalized transcript from a single tachyzoite and bradyzoite-induced replicate. The m6A mark is most abundant in annotated 3'-UTRs. All three replicates from each condition displayed similar distributions. E) Distribution of m6A-enriched MeRIPseq (red) and input RNAseq (blue) reads from tachyzoite (Tz) and alkaline stressed (Bz) samples. m6A-containing peaks are displayed as black bars for both conditions. The m6A motif is indicated by the white arrowhead and the putative 3'-boundary is shown by the black arrowhead. All data are shown to scale. F) m6A-like motifs are enriched near mRNA 3'-boundaries YGCAUGCGR motif is 8 nt long. The top identified motif was a variation on the AU-rich motif previously identified in our MeRIPseq dataset, whereas the second was the aforementioned YGCAUGCR motif (Figs 3C and S1). We used the CentriMo algorithm [30] to determine whether these motifs were enriched near the annotated 3'-boundaries from the same gene set (S1 Fig). Whereas the AU-rich motifs were located downstream, the YGCAUGCR motif was significantly enriched in a region closely associated with the annotated 3'-boundary of mRNAs ( Fig 3F). This is strongly indicative that at least a subset of Toxoplasma transcripts is modified with the m6A mark near their 3'-boundary.

Depletion of writer components METTL3 and WTAP reduces m6A levels and arrests parasite replication
We endogenously tagged two members of the writer complex, METTL3 and WTAP, at their C-terminus with an HA epitope and auxin-inducible degron (AID) tag in RH strain parasites engineered to contain TIR1; this allows rapid degradation of the AID-fusion protein upon the addition of indole-3-acetic acid (IAA) [31]. Western blot analysis revealed near complete loss of the METTL3 HA-AID and WTAP HA-AID proteins in as little as 30 minutes following addition of 500 μM IAA in the culture medium ( Fig 4A). We also noted a decrease in SAG1 expression after extended periods of IAA incubation, possibly due to decreased parasite viability following extended loss of the writer proteins (see below). Consistent with their role as the members of the m6A methyltransferase, depletion of METTL3 and WTAP for 4 h diminished m6A levels in the parasite ( Fig 4B).
We next examined the effect of METTL3 and WTAP depletion on parasite replication using a standard parasite counting assay. Infected HFF cells were incubated for 16 h in IAA or DMSO vehicle; parasites depleted of either writer component exhibited arrested replication (Fig 4C). In order to determine whether parasites can recover from a transient loss of METTL3 expression, we conducted a plaque assay that used three different IAA treatment regimens ( Fig 4D). After an initial 24 h of growth, parasites were treated with IAA or vehicle for 24 hours, at which point the media was replaced with fresh IAA (or vehicle). In contrast to parental parasites, there were few to no plaques formed by METTL3 HA-AID or WTAP HA-AID parasites treated with a single pulse of IAA for 24 h or a sustained dose of IAA (Fig 4D and  4E). These results indicate that parasites could not recover after depletion of m6A for 24 h. It is (3'-B) in Toxoplasma transcripts. Each motif variant detected in this study is displayed. An enrichment of AU-rich elements was also seen downstream of the 3'-boundary. Statistical significance of the localized enrichment (Bonferroni-corrected one-tailed binomial test) is shown on the bottom right of the panel for each individual motif.

PLOS PATHOGENS
Toxoplasma m6A epitranscriptome noted that IAA did not inhibit parasite plaquing of the parental line; however, the tagging of METTL3 moderately compromises plaquing efficiency compared to the parental line ( Fig 4E). Together, these findings underscore the importance of m6A writer components METTL3 and WTAP for Toxoplasma viability.

YTH m6A reader proteins are core constituents of the cleavage and polyadenylation machinery in Toxoplasma
The m6A mark has been linked to a variety of biological processes, and the functional consequence of the modification is carried out by m6A reader proteins, most notably those containing the YTH RNA-binding domain [7][8][9]. The Toxoplasma genome encodes two such proteins, TGGT1_201200 and TGGT1_204070, which we termed YTH1 and YTH2, respectively (Table 1). To investigate their role in Toxoplasma, we endogenously tagged each at their C-termini with an HA epitope in RHΔΔ parasites [17,18]. Western blot analysis of YTH1 HA and YTH2 HA with anti-HA revealed primary bands of predicted sizes at~70 kDa and 60 kDa, respectively (Fig 5A and 5B). Multiple lower molecular weight bands were consistently detected for YTH1 HA and may either indicate that this protein is proteolytically processed or particularly heat labile. Additionally, a minor higher molecular weight band was consistently detected for YTH2 HA , suggestive of post-translational modification(s). As seen for the m6A writer complex members, both YTH1 HA and YTH2 HA localize to the parasite nucleus ( Fig 5C).
To gain insight into their cellular function, we immunoprecipitated each YTH protein from intracellular tachyzoites. Numerous proteins that co-purified with YTH1 HA are annotated in ToxoDB as core constituents of the 3' cleavage and polyadenylation machinery (Tables 3 and  S4), which coordinate the 3'-end formation of mRNAs. Additional interacting proteins include other putative RNA-binding proteins, many of which share clear sequence homology with other components of the 3'-end formation machinery (Table 3). Notably, the Toxoplasma YTH1 protein, which encodes N-terminal CCCH zinc finger (ZnF) domains and a C-terminal YTH domain, shares the same protein domain architecture and high homology with CPSF30-L (Cleavage and Polyadenylation Specificity Factor 30, long form) found in plants [32,33].
The YTH2 protein, which encodes an N-terminal YTH domain, also shares homology with CPSF30, though this is restricted to the YTH domain alone (Table 3). While fewer proteins were identified as interacting partners with YTH2 HA , one included a core member of the 3'end cleavage factor I complex and a second shares strong homology with the N-terminal region of mammalian Rb-binding protein 6 (RBBP6), which has been shown to play a role in 3'-end formation in yeast and mammals [34,35]. An additional interacting protein was an RRM-containing protein, which conceivably could play a role in binding mRNA during 3'end formation.
Collectively, these findings indicate that both YTH m6A reader proteins in Toxoplasma are components of the 3'-end mRNA cleavage and polyadenylation machinery, and that m6A deposition within this region serves to facilitate pre-mRNA 3'-end formation.

Deposition and recognition of m6A is required for normal mRNA 3'-end formation
Association of the YTH proteins with the cleavage and polyadenylation machinery, and the enrichment of the putative m6A motif near the 3'-boundary of transcripts, suggests that m6A plays a role in marking the 3'-end for processing. To test this hypothesis, we additionally generated endogenously tagged YTH1 HA-AID parasites in the RHΔΔ background and confirmed

PLOS PATHOGENS
that IAA depleted YTH1 protein (Fig 6A). We then assessed the consequences of m6A writer and reader depletion on the parasite transcriptome at 4 h and 16 h, the time points at which we observed decreased m6A levels and inhibition of parasite replication (Fig 4B and 4C). We treated METTL3 HA-AID , WTAP HA-AID , and YTH1 HA-AID parasites with 500 μM IAA for 4 h and 16 h and performed an RNAseq analysis (Figs 6B-6D and S3B-S3D and S5 Table). The tagging of these proteins had little effect on the transcriptome under basal conditions since non-treated samples clustered together in a principle component plot (S3A Fig). The addition of IAA led to an increase of 2,054 and 1,810 genes after 4 h and 16 h treatment, respectively, in the METTL3 HA-AID line whereas 1,166 and 1,375 genes were downregulated under these conditions (Figs 6B and S3B and S5 Table). Similarly, a greater number of genes were upregulated than downregulated in all strains and under both IAA treatment times. The majority of upregulated genes were shared between each strain after 4 h treatment (1,671) and a strong plurality of genes (1,449) were upregulated after 16 h IAA exposure (Figs 6E and S3E). The number of shared downregulated genes across all three parasite lines were 617 (4 h IAA) and 762 (16 h IAA) (Figs 6F and S3F). Top enriched GO terms for the shared upregulated genes included cilium and cell adhesion whereas shared downregulated genes were enriched for the term cytoplasm (S6 Table). A strong overlap between differentially expressed genes after 4 h and 16 h IAA treatments (S3G Fig) suggests that the parasites remained largely viable throughout the experiment, a concern since we saw a severe arrest of parasite replication at the later time point (Fig 4C). If m6A marks in the 3'-UTR function in 3'-end processing events that signal polyadenylation site selection and transcription termination, their loss would be expected to produce chimeric run-on transcripts. Our transcriptomics data show that more than a third of shared upregulated genes are arranged in a head-to-tail orientation whereas over 40% of shared downregulated genes are in a tail-to-tail orientation (S3H Fig); both observations are consistent with run-on transcripts being generated (Discussion). A de novo transcriptome assembly for each condition further supports this idea (Fig 6G and 6H). The gene TGGT1_255900 (putative Bax inhibitor-1) serves as a representative example of failed transcriptional termination following loss of either METTL3, WTAP, or YTH1 (Fig 6G). According to our MeRIPseq experiment, this gene is predicted to harbor m6A in its 3'-UTR (black bars), with the m6Aassociated YGCAUGCGR motif in close proximity to the gene's 3'-end boundary (arrowhead). Addition of IAA for either 4 h or 16 h in every knockdown strain led to increased read coverage of the intergenic region between TGGT1_255900 and the gene immediately downstream, TGGT1_255910 (indicated by grayscale heat maps), thereby predicting a chimeric transcript bridging the two genes ( Fig 6G). A systemic analysis of predicted transcript length demonstrated that this is a widespread phenomenon upon m6A writer or reader depletion (Fig 6H). Together, these results support a model whereby m6A marks the 3'-end of transcripts for processing, and impairment of either m6A deposition or recognition results in the formation of hybrid transcripts that negatively affects parasite fitness.

The Toxoplasma m6A writer complex possesses distinct features
In this study, we demonstrated that the Toxoplasma m6A writer complex consists of METTL3, METTL14, WTAP, a divergent VIRMA, and a writer-associated protein (WAP1) that remains to be characterized. Recently, affinity purification of the P. falciparum METTL3 homologue revealed that the writer complexes share a similar composition in both organisms, since METTL3, METTL14, WTAP, and the unannotated VIRMA (PF3D7_1366300) were co-purified [28]. The Toxoplasma m6A writer complex also shares some features of that in higher eukaryotes, such as a labile interaction between the METTL3-METTL14 and the WTAP portions of the complex. Studies in Drosophila melanogaster have determined that the interaction between the METTL3-METTL14 heterodimer and the WTAP portion of the writer complex are sensitive to high salt concentrations such as those that we used during the nuclear extraction process [10]. Alternatively, it has also been proposed that the METTL3-METT14 and WTAP portions of the complex may have distinct and non-overlapping functions [10].
Several reports have indicated that the catalytically active complex requires both METTL3 and METTL14 in a 1:1 stoichiometry [36,37]; it is therefore tempting to speculate that the lower amount of METTL14 in relation to METTL3 is biologically significant since it has been observed both here and in P. falciparum at the transcript level (Fig 2) [28]. For example, METTL14 limitation may help regulate writer complex functionality in Toxoplasma. Whether METTL14 expression is regulated by cell cycle, environmental stimuli, or parasite differentiation are important areas of future investigation. Additionally, understanding the role of the novel WAP1 subunit in modulating m6A writer activity remains to be explored. Given that each member of the writer complex is a fitness-conferring gene for tachyzoite growth [16], depletion of METTL3 halts parasite growth (Fig 4), and that the complex contains distinct members (Fig 2D), it seems reasonable to conclude that deposition of the m6A mark is crucial to parasite viability and may be a good target for generating future anti-Toxoplasma therapies.

m6A contributes to marking the 3'-end of newly synthesized transcripts
Since m6A deposition did not show stage-specific regulation (Fig 3), and the parasite does not appear to encode m6A eraser proteins (Table 1), we propose that the m6A mark may be nonreversible and less dynamic than what has been proposed in higher eukaryotes [38,39]. Like Toxoplasma, P. falciparum also lacks identifiable m6A eraser proteins [40]. Taken along with the essential requirement of the m6A mark, these characteristics speak to its involvement in coordinating essential house-keeping functions in apicomplexans.
The m6A writer complex is known to modify mRNA co-transcriptionally [9]. In other systems, this occurs on specific consensus sequences, supposedly due to recognition of these motifs by components of the writer complex. While animals, plants, and brewer's yeast share similar motifs of DRACH, RRACH, and RGAC, respectively [8,41,42], these motifs are varied in parasites such as P. falciparum (GGACA) [28] and Trypanosoma brucei (CAU) [29].
Our MeRIPseq experiment identified a strongly enriched motif (YGCAUGCR) near the annotated 3'-boundaries of mRNAs, raising the prospect that m6A marks the 3'-end of pre-mRNAs for processing (Fig 3C). In other species, pre-mRNAs are processed at their 3'-termini when a collection of sequence motifs are recognized by the cleavage and polyadenylation machinery. In animals, these are categorized as the upstream sequence element, the polyadenylation signal (PAS), the cleavage site, and a GU-rich downstream sequence element [43]. The PAS consists of an AAUAAA motif located within 10-35nt upstream of the cleavage site, but this is only conserved in~70% of transcripts [43]. A similar organization exists in plants: the far upstream element (FUE) precedes an A-rich near upstream element (NUE) located 10-40nt upstream of the U-rich cleavage element [44][45]. Strict sequence conservation of the NUE is less important than what is seen in animals [45].
A recent attempt to identify the PAS of Toxoplasma and other related apicomplexan parasites was unable to define specific motifs [46] conservation is not required. However, a distinct enrichment of adenosine residues was noted to be present~20 nt upstream of the cleavage site [46] which could be supportive of a role for m6A in coordinating the process. It is tempting to speculate that the YGCAUGCR motif, which is located in close proximity upstream of annotated 3'-boundary (Fig 3E and 3F), operates analogously to the plant-like FUE in at least a subset of transcripts, as has been demonstrated in Arabidopsis [47]. We also identified an AU-rich motif that was enriched to within 10-40 nt downstream of the m6A element, putting it in a range that is consistent with it operating as an analogue to the NUE of plants [47]. However, it is important to note limitations of the MeRIPseq methodology that we employed here, namely the lack of nucleotide resolution in m6A detection, potential sequence biases that have been described with antibody-based m6A identification approaches, and a high false positive rate of detection [48,49]. Application of antibody-independent methodologies that provide nucleotide resolution will improve the identification of m6A marks throughout the Toxoplasma transcriptome (see Concluding Remarks below).

Plant-like m6A reader proteins coordinate mRNA 3'-end processing in Toxoplasma
The near complete purification of the cleavage and polyadenylation machinery along with the only two YTH domain-containing proteins strongly positions the m6A system as a foundational piece driving 3'-end formation. The functions and individual constituents of the various cleavage and polyadenylation machinery are reviewed elsewhere [32,43,45]; however, they can be broken down into cleavage and polyadenylation specificity factors (CPSF), cleavage stimulation factors (CSTF), and cleavage factors I and II (CFs).
CPSF30 plays a central role in coordinating the machinery associated with processing the 3'-end of transcripts. Interestingly, Arabidopsis thaliana produces two CPSF30 isoforms that arise from an alternative polyadenylation event [32,33]. The N-terminus of CPSF30 contains CCCH-type ZnF motifs that are responsible for binding the NUE [47] and is present in both CPSF30 isoforms whereas the longer version, CPSF30-L, is extended at its C-terminus and encodes an YTH domain which binds the FUE, consistent with the protein architecture of apicomplexan YTH1 proteins [47,50]. Plant CPSF30-L plays a role in nitrate signaling, the floral transition, and the ABA response by mediating alternative polyadenylation events [47,51], and mediates the polyadenylation of tandemly duplicated genes in an m6A-dependent manner [52], suggesting functional conservation of the m6A-polyadenylation axis from apicomplexan parasites to plants.
Although both YTH1 and YTH2 share homology with CPSF30 in plants, it is likely that only YTH1 plays the role of plant CPSF30-L in Toxoplasma since it shares the same domain architecture; in contrast, YTH2 homology is restricted to the YTH domain and lacks the ZnFs characteristic of CPSF30s in all other organisms. In addition, unlike YTH2, YTH1 co-purified with all the other CPSF and CSTF factors that are identifiable in the Toxoplasma genome (Table 3), which is consistent with the role of CPSF30 as a central component of this machinery.
The exact role YTH2 plays in the 3' end formation of transcripts remains to be resolved; however, it did associate with clearly identifiable members of the cleavage machinery (Table 3). Furthermore, a recent study has determined that the P. falciparum YTH2 homologue binds mRNAs near their 3'-ends [40], providing additional support for it playing a similar role in apicomplexan parasites. However, it has also recently been implicated in modulating translation in P. falciparum [53]. Future studies are required to determine the exact role YTH2 plays in 3'-end formation.

The m6A system directs essential process of mRNA 3'-end formation in Toxoplasma
By depleting parasites of m6A writer components METTL3 and WTAP, and the m6A reader YTH1, we demonstrated that the m6A system is critical for 3'end processing of mRNA transcripts. Knockdown of any of the aforementioned components of the m6A system produces run-on chimeric transcripts and a widespread upregulation of transcript levels (Figs 6 and S3). We propose that these observations are due to impaired transcription termination, whose signaling is intimately linked with 3'-end formation (reviewed in [54]). Supportive of this model, downregulated genes were generally arranged in tail-to-tail gene orientations (S3H Fig). Deficiency in m6A deposition or recognition resulting in transcriptional read through rather than termination may enable the production of antisense transcripts spanning these loci; antisense RNA has been shown to downregulate cognate targets in Toxoplasma [55][56][57]. Considered together, this widespread dysregulation and disruption of transcriptional control is the likely mechanism leading to the irreversible demise of m6A knockdown parasites.

Concluding remarks
Herein, we explored each identifiable component of the m6A system in Toxoplasma. In so doing, we uncovered the role that the m6A mark plays in coordinating the 3'-end formation of transcripts. Due to the similarity of the machinery and the organization of the motifs surrounding the cleavage site of transcripts with plants, we propose that the m6A system plays an evolutionarily conserved role in this process from plants to apicomplexan parasites. This raises the tantalizing possibility of leveraging the m6A system for future anti-Toxoplasma therapies, especially since 3'-end processing has previously been suggested to be an attractive drug target in Toxoplasma and P. falciparum [58][59].
The discovery that the m6A system is involved in transcript maturation raises several possible avenues for future investigations. For instance, we have yet to understand whether m6A plays a role in regulating the alternative polyadenylation of transcripts, a wide-spread phenomenon that has been detected in Toxoplasma and other apicomplexans [46,60,61]. Additionally, since we only identified reader proteins in the nucleus and many additional proteins outside of the YTH family have been shown to recognize the m6A mark in other organisms [8], whether the m6A mark directs other post-transcriptional regulatory events and the existence of cytoplasmic-localized readers remain open questions.
While our manuscript was under review, a preprint entitled "A plant-like mechanisms coupling m6A reading to polyadenylation safeguards transcriptome integrity and developmental genes partitioning in Toxoplasma" by Farhat et al. was posted to the bioRxiv server [62]. As implied by the title, this study reached many of the same conclusions presented here, including the m6A writer and YTH1 complex constituents and the role of m6A in 3'-end formation. Of interest, Farhat and colleagues used Nanopore direct RNA sequencing, which identified RRAC as the m6A consensus sequence, which is more in line with previously identified m6A motifs. This finding highlights the superior resolution of direct RNA sequencing over antibody-based methods like MeRIPseq. We propose that the most likely scenario is that the adenosine in RRAC is the substrate of m6A; the contribution of the motif identified in our dataset (YGCAUGCR) on the m6A system-if any-remains to be resolved.

Cell strains and culture conditions
All parasites were cultured in human foreskin fibroblasts (ATCC: SCRC-1041) using standard procedures. Fibroblasts were initially grown in DMEM supplemented with 10% heat inactivated FBS until they reached confluency. The type II Toxoplasma gondii strain ME49 (ATCC: 50611) was used to assess m6A localization and for MeRIPseq. RH strain parasites in which the HXGPRT and Ku80 genes (RHΔΔ) have been removed [17,18] were used for endogenous tagging of genes of writer and reader proteins. RHΔΔ engineered to express a copy of the TIR1 [31], gifted by Dr. David Sibley (Washington University), was used to generate the METTL3 HA-AID WTAP HA-AID , and YTH1 HA-AID lines. All tachyzoite cultures were maintained in DMEM supplemented with 1% heat inactivated FBS and 5% CO 2 . Bradyzoite differentiation was induced by incubating cultures in RPMI media supplemented with 5% heat inactivated FBS and buffered with 50 mM HEPES (pH 8.3) in an ambient CO 2 incubator. Media for bradyzoite cultures was changed daily to maintain alkaline pH.

Endogenous tagging of genes
All writer proteins as well as YTH1 tagged lines were made by transfecting an appropriate linearized pLIC vector [18] into RHΔΔ parasites to generate C-terminal tagging by single homologous crossover. Briefly, for each gene, a sequence corresponding to approximately 1kb upstream of the annotated coding sequence stop codon was amplified from genomic DNA and cloned into the PacI site of the pLIC_HA vector. All primers are listed in S7 Table. The plasmids were linearized using a unique restriction site within the gene-specific sequence and 5 μg were transfected into the parasites using a Nucleofector (Lonza Biosciences). Clones were selected for the DHFR cassette with 2 μM pyrimethamine and the correct integration was validated by PCR of the genomic DNA.
The YTH2 HA , METTL3 HA-AID , WTAP HA-AID , and YTH1 HA-AID lines were made by cotransfecting pCas9-GFP [63] and a PCR amplicon encoding the selected tag and selectable marker into parasites to generate C-terminal tagging by double homologous crossover. Briefly, the guide RNA cassette from pCas9-GFP was mutagenized by PCR to target cleavage near the stop codon of the gene of interest. A PCR amplicon was generated from a pLIC series vector to contain the HA (or HA-AID) epitope and the DHFR cassette along with an approximate 40nt overlap of upstream and downstream of the cut site. All primers are listed in S7 Table. Once generated, 1 μg pCas9-GFP plasmid and 2 μg donor PCR amplicon were transfected into parasites using a Nucleofector (Lonza Biosciences). Clones were selected for the DHFR cassette with 2 μM pyrimethamine and the correct integration was validated by PCR of the genomic DNA.

Affinity purification of HA-tagged proteins
All affinity purifications were performed using intracellular tachyzoites freshly isolated from HFF cells. Two independent experiments were conducted separately for the writer proteins, whereas two biological samples were processed at the same time to generate data for the reader proteins. Each experiment also included parental strain parasites to control for non-specific interactions. Parasites were collected from infected fibroblasts by scraping the monolayers and passing them through a 23-gauge syringe and 5 μm pore filter. The parasites were washed twice with in PBS and then processed for subcellular fractionation. Affinity purification of the writer components were conducted from nuclear extracts whereas those conducted for the reader proteins were performed on whole cell lysate. Nuclear extracts were obtained by first incubating parasites for 5 minutes on ice in low salt lysis buffer (50 mM HEPES-NaOH pH 7.5, 10 mM NaCl, 0.1% NP-40, 20% glycerol), centrifugation to pellet nuclei and resuspending the pellet in high salt lysis buffer (50 mM HEPES-NaOH pH 7.5, 420 mM NaCl, 0.4% NP-40, 20% glycerol). Whole cell lysate was generated by incubating in lysis buffer (50 mM Tris pH 7.4, 150 mM NaCl, 1 mM MgCl 2 , 0.5% NP-40, 10% glycerol) and then passaging through a 23-gauge syringe. After clarifying the lysate in a refrigerated centrifuge at 21,000 xg in a tabletop centrifuge, the lysate was precleared on mouse IgG magnetic beads (Cell Signaling, 5873S) for 1 h while rotating at 4˚C. The supernatant was transferred onto α-HA magnetic beads (Thermo Fisher, 88836) that were prewashed with lysis buffer and incubated overnight while rotating at 4˚C. The beads were washed 3x with lysis buffer (with 0.05% NP-40) and 3x with PBS then submitted for protein discovery with the Proteomics Core Facility at the Indiana University School of Medicine. Data was filtered to show proteins that were not detected in the parental strains and had at least 2 peptides in each replicate experiment.

Parasite viability assays
Parasite replication assays were performed by inoculating confluent HFF monolayers. The parasites were allowed to invade for 4 h, then the media was changed to include either 500 μM IAA (Sigma, I2886) or an equivalent volume of DMSO vehicle. The cultures were grown for 16 h before fixation. After staining, 100 randomly chosen vacuoles were counted for each condition in triplicate. Two independent experiments were performed.
For plaque assays, 5,000 parasites/well were inoculated into 12-well plates. After 24 h, the media was aspirated and wells were treated either with 500 μM IAA or an equivalent amount of DMSO vehicle. After an additional 24 h, the media was aspirated and wells were treated with either IAA or DMSO according to the designated treatment regimen. After an additional 4 days of growth, the plates were fixed in ice cold methanol and stained with crystal violet. Plaques were counted for each well which included 4 replicates of each condition. At least two independent experiments were conducted with similar results. Significance was assessed by Student's t-test assuming unequal variance.

Western blots
To collect protein lysate, infected monolayers were rinsed twice with PBS and lysed in RIPA buffer (25 mM Tris pH 7.4, 150 mM NaCl, 0.1% SDS, 1% NP-40, 0.5% sodium deoxycholate) supplemented with a protease and phosphatase inhibitor cocktail (Sigma, PPC1010). Lysates were sonicated on ice and clarified by centrifugation at 21,000 xg for 10 min at 4˚C. Samples were run on NuPAGE gels and transferred to nitrocellulose membranes. Blots were blocked for 30 min in a 5% non-fat milk TBST solution. Primary antibodies against the HA epitope (Roche, 11867423001, 1:1,000) and the SAG1 loading control (Thermo Fisher, MA5-18268, 1:1,000) were incubated overnight with the blot while rocking overnight at 4˚C. Bands were detected by chemiluminescence using appropriate HRP-conjugated secondary antibodies (GE Healthcare, NA935V, NA931V, 1:5,000).

Dot blot
Total RNA was extracted from extracellular parasites with TRIzol (Ambion) as per the manufacturer. An aliquot of the RNA was run on an agarose gel to check for RNA integrity and for the absence of the 28S rRNA band characteristic of host cell contamination. Genomic DNA was isolated from the same aliquot of parasites with DNeasy Blood and Tissue kit (Qiagen) as per the manufacturer. Indicated amounts of nucleic acids were spotted onto Hybond-N + membrane (GE Healthcare) with a dot blot apparatus (BioRad). The nucleic acid was fixed to the membrane with a Stratagene crosslinker. After the membrane had dried, total nucleic acid stained with SYBR Gold (Invitrogen) for 5 min. Next, the membrane was blocked for 30 min in a 5% fat-free milk solution for 30 minutes followed by a 1 h incubation with α-m6A antibody (Abcam, ab151230, 1:1,000). The m6A mark was detected by chemiluminescence using an HRP-conjugated secondary antibody (GE Healthcare, NA934V, 1:5,000).

Quantitation of m6A levels
Total RNA was isolated from purified parasites that were treated with vehicle or 500 μM IAA as outlined above. The amount of m6A modification was assessed from 300 ng total RNA using the colorimetric EpiQuik m6A methylation quantification kit as per manufacturer's instructions (EpiGentek, P-9005-96). The m6A levels were determined from three or four biological replicates and significance was assessed using Student's t-test assuming unequal variances.

MeRIPseq and data analysis
For each sample, three T-175 flasks containing confluent HFFs were inoculated with ME49 parasites at an MOI of 10 and grown under tachyzoite conditions for 2 days. The media was aspirated and replaced to allow for tachyzoite or bradyzoite growth conditions for a further 24 h. Parasites were harvested by rinsing monolayers in PBS, scraping, syringe passage, and filtration as described for affinity purification experiments. Total RNA was isolated from the parasites and quality was assessed by gel electrophoresis. mRNA was enriched using Arraystar Seq-Star poly(A) isolation kit as per manufacturer's instructions and subsequently fragmented by heating at 94˚C in fragmentation buffer (10 mM Tris pH 7.0, 10 mM Zn 2+ ). An aliquot of mRNA was saved for RNA sequencing as an input control for m6A RNA immunoprecipitation sequencing (MeRIPseq). Fragmented RNA was incubated with a polyclonal α-m6A antibody (Synaptic Systems, 202003) for 2 h at 4˚C followed by an additional 2 h incubation with Dynabeads. Three washes were conducted with IP buffer (10 mM Tris pH 7.4, 150 mM NaCl, 0.1% NP-40) followed by two additional washes in wash buffer (10 mM Tris pH 7.4, 50 mM NaCl, 0.1% NP-40). The m6A-enriched mRNA was eluted in elution buffer (10 mM Tris pH 7.4, 1 mM EDTA, 0.05% SDS, 40U proteinase K) at 50˚C for 30 minutes and subsequently phenol-chloroform extracted and ethanol precipitated. The enrichment of m6A was assessed by Arraystar using a proprietary method (S2B Fig). RNAseq libraries were constructed with KAPA Stranded mRNA-seq kit as per the manufacturer and sequencing was performed on an Illumina HiSeq 4000 system.
The annotated Toxoplasma ME49 strain genomic information was downloaded from Tox-oDB v45 [64]. Sequencing reads were depleted of tRNA and rRNA reads in silico using Bowtie [65]. The remaining reads were aligned to the ME49 genome with HISAT2 [66]. Differential expression was determined with DESEQ2 [67]. Deduplicated sequencing reads were used to call m6A peaks using MeTPeak [25] and differential m6A marks were assessed using MeTDiff [26]. Gene coverage was assessed with the RSeQC package [68]. Determination of the m6A motif was conducted with DREME [27] using-rna-norc-mink 4 and-maxk 8 options and its positional enrichment relative to the 3'-end of the transcript was determined with CentriMO [30]. Default settings were used for all other bioinformatic analyses. All datasets from this work have been deposited in the NCBI GEO database under accession number GSE165067.

RNAseq of knockdown lines and data analysis
A T-75 flask containing confluent HFFs was inoculated with either METTL3 HA-AID , WTA-P HA-AID , or YTH1 HA-AID parasites at an MOI of 10 and grown under tachyzoite conditions 36 h prior to harvest. 500 μM IAA was added as indicated 4 or 16 h prior to harvest. Parasites were harvested and RNA was collected in TRIzol as previously described above. RNAseq libraries were constructed using standard Illumina protocols and sequenced 2x150bp on an Illumina HiSeq system. All datasets from this work have been deposited in the NCBI GEO database accession number GSE178355 Differential expression of IAA-treated lines was determined following the same pipeline as described above without in silico depletion of rRNA and tRNA. De novo transcript assembly was performed with StringTie using default settings [62]. Transcript length and gene orientation were determined using tools available on the public Galaxy webserver and ToxoDB.org [64,69].