A mutagenesis screen for essential plastid biogenesis genes in human malaria parasites

Endosymbiosis has driven major molecular and cellular innovations. Plasmodium spp. parasites that cause malaria contain an essential, non-photosynthetic plastid—the apicoplast—which originated from a secondary (eukaryote–eukaryote) endosymbiosis. To discover organellar pathways with evolutionary and biomedical significance, we performed a mutagenesis screen for essential genes required for apicoplast biogenesis in Plasmodium falciparum. Apicoplast(−) mutants were isolated using a chemical rescue that permits conditional disruption of the apicoplast and a new fluorescent reporter for organelle loss. Five candidate genes were validated (out of 12 identified), including a triosephosphate isomerase (TIM)-barrel protein that likely derived from a core metabolic enzyme but evolved a new activity. Our results demonstrate, to our knowledge, the first forward genetic screen to assign essential cellular functions to unannotated P. falciparum genes. A putative TIM-barrel enzyme and other newly identified apicoplast biogenesis proteins open opportunities to discover new mechanisms of organelle biogenesis, molecular evolution underlying eukaryotic diversity, and drug targets against multiple parasitic diseases.


Author summary
Plasmodium parasites, which cause malaria, and related apicomplexan parasites evolved from photosynthetic algae that acquired their chloroplast through two successive endosymbioses. Although no longer photosynthetic, the apicomplexan plastid, or apicoplast, was retained in these pathogens and provides critical metabolites during host cell infection. The apicoplast is of major interest for its unique biology and potential to yield new antimalarial drug targets. Here, we focused on the critical genes required to grow, divide, and inherit new apicoplasts during parasite replication. Given the apicoplast's divergent evolution, most of these cannot be recognized by their homology to genes with known functions. Instead, we overcame significant technical challenges in the Plasmodium

Introduction
Plasmodium spp., which cause malaria, and related apicomplexan parasites are important human and veterinary pathogens. These disease-causing protozoa are highly divergent from well-studied model organisms that are the textbook examples of eukaryotic biology, such that parasite biology often reveals striking eukaryotic innovations. The apicoplast, a nonphotosynthetic plastid found in apicomplexa, is one such "invention" [1,2]. These intracellular parasites evolved from photosynthetic algae that acquired their plastids through secondary endosymbiosis [3]. During secondary endosymbiosis, a chloroplast-containing alga, itself the product of primary endosymbiosis, was taken up by another eukaryote to form a secondary plastid [4].
Despite the loss of photosynthesis in apicomplexans, the apicoplast contains several prokaryotic metabolic pathways, is essential for parasite replication during human infection, and is a target of antiparasitic drugs [5][6][7][8].
There are many traces of the apicoplast's quirky evolution in its present-day cell biology, in particular the pathways for its biogenesis. Like other endosymbiotic organelles, the single apicoplast cannot be formed de novo and must be inherited by its growth, division, and segregation into daughter cells. The few molecular details we have about apicoplast biogenesis hint at the major innovations that have occurred in the process of adopting and retaining this secondary plastid. The apicoplast is bound by four membranes acquired through successive endosymbioses, such that apicoplast proteins transit through the endoplasmic reticulum (ER) and use a symbiont-derived ER-associated degradation (ERAD)-like machinery (SELMA) to cross the two new outer membranes [9]. Curiously, autophagy-related protein 8 (Atg8), a highly conserved eukaryotic protein and key marker of autophagosomes, localizes to the apicoplast and is required for its inheritance in Plasmodium and the related apicomplexan parasite Toxoplasma gondii [10][11][12]. While SELMA and PfAtg8 are clear examples of molecular evolution in action, other novel or repurposed proteins required for apicoplast biogenesis remain undiscovered.
So far, new apicoplast biogenesis proteins have primarily been discovered through candidate approaches. SELMA was first identified as homologs of the ERAD machinery encoded in the nucleomorph, the remnant nucleus of the eukaryotic symbiont found in some algal secondary plastids [13]. Nuclear-encoded versions of SELMA containing apicoplast-targeting sequences were then detected in the genomes of apicomplexan parasites (which lack a nucleomorph) [14]. Because the apicoplast proteome is enriched in proteins likely to perform biogenesis functions such as protein import or genome replication, several apicoplast-targeted proteins of unknown function have also been shown to be required for its biogenesis [15]. ATG8's novel apicoplast function was discovered serendipitously by its unexpected localization on the apicoplast instead of autophagosomes [10]. Though candidate approaches have yielded new molecular insight [16][17][18], in general they are indirect and may bias against novel pathways.
In blood-stage Plasmodium falciparum, a method to chemically rescue parasites that have lost the apicoplast has paved the way for functional screens [19][20][21]. Addition of isopentenyl diphosphate (IPP) to the growth media is sufficient to reverse growth inhibition caused by apicoplast loss because it is the only essential metabolic product of the apicoplast in the blood stage. Taking advantage of the apicoplast chemical rescue, we recently took the first unbiased approach to discover a new apicoplast biogenesis protein [22]. We first screened for small-molecule inhibitors that specifically disrupt apicoplast biogenesis in P. falciparum. Subsequent target identification led us to a membrane metalloprotease, the P. falciparum ATP-dependent zinc metalloprotease 1 (PfFtsH1), with an unexpected role in apicoplast biogenesis. This chemical genetic screen has the advantage of unbiased sampling of druggable targets in apicoplast biogenesis pathways. Unfortunately, it lacks throughput given the painstaking process of mapping inhibitors to their molecular targets.
Forward genetic screens are widely performed to uncover novel cellular pathways, such as those required for apicoplast biogenesis. Recently, genome-scale deletion screens performed in several apicomplexan parasites have uncovered a plethora of essential genes of unknown function [23][24][25]. Several challenges impede large-scale functional analysis of these essential genes. First, targeted gene modifications are still slow and labor intensive in P. falciparum, the most deadly of the human malaria species, despite an available in vitro blood culture system [26,27]. Second, efficient methods for generating conditional mutants, such as RNA interference (RNAi) or Clustered Regularly Interspaced Short Palindromic Repeats interference (CRISPRi) systems, are lacking in all apicomplexan organisms [28]. Finally, high-throughput, single-cell phenotyping for important functions needs to be developed [29]. Overcoming these significant limitations, we designed a forward genetic screen using chemical mutagenesis, apicoplast chemical rescue, and a fluorescent reporter for apicoplast loss to identify essential apicoplast biogenesis genes in blood-stage P. falciparum. The screen identified known and novel genes required for apicoplast biogenesis and is, to our knowledge, the first forward genetic screen to assign essential cellular functions in P. falciparum.

A conditional fluorescent reporter enables single-cell phenotyping of apicoplast loss in P. falciparum
The apicoplast must be propagated during parasite replication, such that biogenesis defects result in newly replicated daughter parasites that do not contain an apicoplast. Because no clearance mechanism is known to eliminate the apicoplast in asexual blood-stage parasites, defective organelle biogenesis resulting in loss of inheritance is likely the major cause of apicoplast loss. To isolate rare P. falciparum mutants that have lost their apicoplast due to biogenesis defects, we set out to design a live-cell reporter for apicoplast loss. The apicoplast contains a prokaryotic caseinolytic protease (Clp) system composed of a Clp chaperone (ClpC) that recognizes and unfolds substrates and a Clp protease (ClpP) that degrades the recognized substrates [30][31][32]. We hypothesized that (1) in the presence of a functional apicoplast, ClpCP could be co-opted to degrade and turn "off" a fluorescent reporter, whereas (2) loss of the apicoplast would result in loss of ClpCP activity and therefore turn "on" the reporter (Fig 1A).
Clp substrates are typically recognized by unstructured degron sequences, the best studied of which is a transfer-messenger RNA (ssrA) that appends a short peptide to the C-terminus of translationally stalled proteins [33,34]. However, the substrate specificity of the apicoplast ClpCP system has yet to be defined. Reasoning that the PfClpCP homolog might recognize similar degrons as bacterial or algal Clp systems, we tested two degrons recognized by Escherichia coli ClpXP (the E. coli ssrA peptide [EcssrA] and X7) and the predicted ssrA peptide in the red alga Cyanidium caldarium, CcssrA (S1 Table) [35][36][37]. To assess their functionality in P. falciparum, the C-terminus of an apicoplast-targeted green fluorescent protein (acyl-carrier protein leader peptide [ACPL-GFP]) was modified with each of the degrons (Fig 1B). A cytosolic mCherry marker was also expressed on the same promoter as ACPL-GFP via a T2A "skip" peptide to normalize apicoplast GFP levels to stage-specific promoter expression [38]. Each construct was then integrated into an ectopic attB locus in Dd2 attB parasites to generate the reporter strains [39].
For each reporter strain, the ratio of GFP:mCherry fluorescence (as detected by flow cytometry) was assessed in untreated parasites containing an intact apicoplast, designated apicoplast(+), and in parasites treated with actinonin (which causes apicoplast loss via inhibition of FtsH1) and rescued with IPP, rendering them apicoplast(−). In the absence of a degron, the

Fig 1. (B) Reporter construct for expression of apicoplast-targeted GFP (ACPL-GFP) and cytoplasmic mCherry used for integration into P. falciparum Dd2 attB parasites. (C) Ratio of GFP:mCherry fluorescence in untreated versus actinonin/IPP-treated parasites expressing ACPL-GFP-degron. Data are shown as mean ± SEM (n = 3). ****P < 0.0001, unpaired two-tailed t test. See also S1 Fig. Tabulated data are shown in S1 Data. (D) GFP protein levels in untreated versus actinonin/IPP-treated parasites expressing GFP-EcssrA. Higher molecular weight species in GFP blot from unprocessed ACPL is indicated with black arrowhead. (E) Flow cytometry plots showing mCherry and GFP fluorescence in untreated versus actinonin/IPP-treated parasites expressing GFP-EcssrA. Uninfected RBCs are gated away (lower left quadrants), and the percentage of mCherry+, GFP+ parasites in the gated populations relative to the total number of cells quantified is indicated. (F) Representative live-cell fluorescent images of untreated and actinonin/IPP-treated parasites expressing mCherry and GFP-EcssrA. Hoechst stains for parasite nuclei. Scale bar 5 μm. ACP, acyl-carrier protein; ACPL, ACP leader peptide; a.f.u., arbitrary fluorescence units; EcssrA, E. coli ssrA peptide; GFP, green fluorescent protein; IPP, isopentenyl diphosphate; RBC, red blood cell.
We further characterized ACPL-GFP-EcssrA because the greatest recovery of GFP fluorescence following apicoplast loss was observed with this reporter (Fig 1C). Consistent with degradation of GFP in the apicoplast being dependent on the EcssrA peptide, ACPL-GFP-EcssrA protein was detected at a significant level only in apicoplast(−) parasites, while unmodified ACPL-GFP was detected in both apicoplast(+) and (−) parasites (Fig 1D). Of note, cleavage of the apicoplast-targeting ACPL sequence does not occur in apicoplast(−) parasites, resulting in an ACPL-GFP protein of higher molecular weight compared to apicoplast(+) parasites [19]. Flow cytometry and live fluorescence microscopy confirmed that apicoplast(+) parasites displayed only cytosolic mCherry fluorescence, whereas apicoplast(−) parasites displayed both cytosolic mCherry and dispersed, punctate GFP fluorescence (Fig 1E and 1F). As expected, addition of IPP alone did not result in significant formation of apicoplast(−) parasites or recovery of ACPL-GFP-EcssrA (S1 Fig and S1 Data). Taken together, these results demonstrate that ACPL-GFP-EcssrA serves as a specific reporter for apicoplast loss in P. falciparum.

A phenotypic screen isolates a collection of apicoplast(−) mutant clones
Next, we used the ACPL-GFP-EcssrA reporter strain to perform a phenotypic screen for apicoplast(−) mutants (Fig 2A). Ring-stage parasites were mutagenized with the alkylating agents ethyl methanesulfonate (EMS) or N-ethyl-N-nitrosourea (ENU) with the expectation that this would generate a more diverse population of parasites, some of which harbor mutations in apicoplast biogenesis genes rendering them apicoplast(−) [21,22]. To rescue the lethality of apicoplast loss, mutagenized parasites were supplemented with 200 μM IPP in the growth media. A control group of non-mutagenized parasites was also cultured with IPP to assess whether apicoplast(−) mutants naturally occurred over time in the absence of selective pressure to maintain the organelle.
After two replication cycles to allow for initial apicoplast loss, mCherry-expressing parasites displaying GFP fluorescence above approximately the 70th percentile were isolated by fluorescence-activated cell sorting (FACS). Selected mCherry(+) and GFP(+) parasites were then allowed to propagate to a detectable parasitemia before being subjected to another round of FACS. In one exemplary ENU-mutagenized population, a distinct population of GFP-positive parasites began to emerge after just two rounds of FACS (Fig 2B). Other populations showed enrichment beginning after three rounds of FACS. No significant enrichment of GFP signal was observed when parasites were (1) grown with IPP over time without FACS or (2) grown without IPP and subjected to FACS, suggesting that we specifically enriched for apicoplast(−) mutants. Parasites from the final enriched apicoplast(−) populations were individually sorted to generate apicoplast(−) clones derived from single parasites. Each clonal population was then re-checked for IPP-dependent growth and GFP fluorescence.
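The sort gate described above can be sketched in a few lines. This is only an illustration, not the authors' FACS configuration: the function names, the `mcherry_min` threshold, and the choice to compute the GFP percentile over mCherry-positive events are all assumptions made for the example.

```python
from typing import List, Tuple

Event = Tuple[float, float]  # (mCherry intensity, GFP intensity), arbitrary units


def percentile(values: List[float], pct: float) -> float:
    """Return the pct-th percentile of values using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (pct / 100.0) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


def sort_gate(events: List[Event], mcherry_min: float, gfp_pct: float = 70.0) -> List[Event]:
    """Keep mCherry-positive events whose GFP exceeds the percentile cutoff.

    Assumption: the percentile is taken over mCherry-positive (infected)
    events only, so uninfected cells do not shift the cutoff.
    """
    infected = [e for e in events if e[0] >= mcherry_min]
    if not infected:
        return []
    cutoff = percentile([gfp for _, gfp in infected], gfp_pct)
    return [e for e in infected if e[1] > cutoff]
```

Repeated rounds of sorting would amount to calling `sort_gate` on each regrown population, which is why a rare GFP-positive subpopulation becomes progressively enriched.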

Whole-genome sequencing identifies 12 candidate genes required for apicoplast biogenesis
To determine the genetic basis of apicoplast loss, we sequenced the genomes of 51 apicoplast(−) clones (mutagenized = 40; non-mutagenized = 11) and three parent populations. Sequenced reads were mapped to the genome of P. falciparum 3D7 (version 35) with an average read depth of 26 for all samples. Notably, while the apicoplast genome was sequenced at an average depth of 16 reads in the parent populations, it was only detected with an average of 0.02 reads in apicoplast(−) clones (Fig 2C and S2 Data). Because the organellar genome is a marker for the apicoplast, the absence of the apicoplast genome confirmed the loss of the apicoplast in the sequenced clones [19,22].
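The depth comparison above reduces to simple arithmetic, sketched here. The function names and the `max_ratio` cutoff are hypothetical, and using the sample-wide average depth of 26 as the denominator is an assumption made for illustration rather than the authors' stated procedure.

```python
def mean_depth(total_mapped_bases: int, contig_length: int) -> float:
    """Average read depth over a contig: mapped bases divided by contig length."""
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    return total_mapped_bases / contig_length


def apicoplast_genome_lost(apicoplast_depth: float, genome_depth: float,
                           max_ratio: float = 0.05) -> bool:
    """Flag a sample as apicoplast(-) when organellar depth is negligible.

    max_ratio is an illustrative cutoff, not a value taken from the study.
    """
    if genome_depth <= 0:
        raise ValueError("genome_depth must be positive")
    return (apicoplast_depth / genome_depth) < max_ratio
```

With the reported averages, a parent population (depth 16 against 26) would be called apicoplast(+), whereas a clone (depth 0.02 against 26) would be called apicoplast(−).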
We next performed single nucleotide variant (SNV) analysis. A raw variant list was generated for each sample by comparison to the reference 3D7 genome and included SNVs found in parent populations at ≥5% allele frequency (minimum one read), and in apicoplast(−) clones at ≥90% (minimum five reads). Any SNV identified in apicoplast(−) clones that was also identified in any of the three parent populations was removed. We also filtered out SNVs detected in noncoding regions or resulting in synonymous amino acid changes in coding regions. Finally, SNVs identified in hypervariable regions of the genome (including the rifin, stevor, and EMP gene families) and/or previously annotated in the PlasmoDB single nucleotide polymorphism (SNP) database were excluded. After these filtering steps, 23 apicoplast(−) clones had at least one but no more than three SNVs that differed from the parent populations (Fig 2D; S2 Table).
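The filtering steps above can be written as a single pass over each clone's variant calls. The following is a minimal sketch under stated assumptions: the `SNV` record and its field names are hypothetical, the annotation flags (coding, synonymous, hypervariable, known SNP) are assumed to have been computed upstream, and the thresholds are those given in the text.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

Key = Tuple[str, int, str]  # (chromosome, position, alternate base)


@dataclass(frozen=True)
class SNV:
    chrom: str
    pos: int
    alt: str
    allele_freq: float           # fraction of reads supporting the variant
    alt_reads: int               # number of reads supporting the variant
    coding: bool = True          # falls within a coding region
    synonymous: bool = False     # leaves the amino acid unchanged
    hypervariable: bool = False  # e.g., rifin, stevor, or EMP gene families
    known_snp: bool = False      # already annotated in a SNP database

    def key(self) -> Key:
        return (self.chrom, self.pos, self.alt)


def called(snv: SNV, min_freq: float, min_reads: int) -> bool:
    """Apply the allele-frequency and read-count thresholds."""
    return snv.allele_freq >= min_freq and snv.alt_reads >= min_reads


def parent_variant_keys(parents: Iterable[Iterable[SNV]]) -> Set[Key]:
    """Variants present in any parent population (>=5% frequency, >=1 read)."""
    return {s.key() for calls in parents for s in calls if called(s, 0.05, 1)}


def candidate_snvs(clone_calls: Iterable[SNV],
                   parents: Iterable[Iterable[SNV]]) -> List[SNV]:
    """Clone variants (>=90% frequency, >=5 reads) that survive every filter."""
    inherited = parent_variant_keys(parents)
    return [
        s for s in clone_calls
        if called(s, 0.90, 5)
        and s.key() not in inherited
        and s.coding and not s.synonymous
        and not s.hypervariable and not s.known_snp
    ]
```

The lenient parent threshold and the strict clone threshold work in the same direction: a variant must be nearly fixed in the clone and essentially absent from every parent population to be retained.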
Because genes required for apicoplast biogenesis ought to be essential, we used essentiality data from literature or whole-genome deletion screens performed in blood-stage P. falciparum and P. berghei to prioritize gene candidates [24,25]. Of 18 unique SNVs identified, 12 were in genes categorized as "essential" in blood-stage P. falciparum and/or P. berghei (Table 1 and S2 Table). Although PfFtsH1 (Pf3D7_1239700) is categorized as "dispensable" in the P. falciparum deletion screen, it has been shown experimentally to be essential [25]. Overall, a mutation in one of these 12 essential gene candidates was identified in each of the 23 apicoplast(−) clones, consistent with the mutation causing apicoplast loss. Potentially disruptive mutations included an I437S variant in the known apicoplast biogenesis protein (PfFtsH1), truncation of Atg7 (PfAtg7) likely required for a cytoplasmic pathway for apicoplast biogenesis, and truncations of three proteins of unknown function (Pf3D7_0518100, Pf3D7_1305100, and Pf3D7_1363700) (Table 1). The remaining candidates contained point mutations and had no known prior function in apicoplast biogenesis or localization to the organelle.

The Ile437Ser variant causes loss of PfFtsH1 activity in vitro
PfFtsH1 is an apicoplast membrane metalloprotease that was previously identified in a chemical genetic screen as the target of actinonin, an inhibitor that disrupts apicoplast biogenesis. Subsequent knockdown of PfFtsH1 demonstrated that it is required for apicoplast biogenesis [22]. We hypothesized that the I437S variant identified in our screen disrupted PfFtsH1 function, leading to apicoplast loss. PfFtsH1 contains both an ATPase and a peptidase domain. To test the effect of I437S on enzyme activity, we compared the activity of I437S with that of wild-type (WT) enzyme, an ATPase-inactive E249Q variant, and a peptidase-inactive D493A variant (Fig 3A). All enzymes were purified without the transmembrane domain as previously described (S3 Fig) [

PfAtg7, a member of the Atg8 conjugation pathway, is a cytoplasmic protein required for apicoplast biogenesis
Though non-apicoplast proteins are expected to play a role in apicoplast biogenesis, all apicoplast biogenesis proteins validated so far have localized to the apicoplast because this criterion is most often used to select candidates. A significant advantage of our forward genetic screen is that it can uncover non-apicoplast pathways required for apicoplast biogenesis, which are biased against by other approaches. A cytoplasmic protein strongly identified in our screen was PfAtg7 (Pf3D7_1126100), which contained a nonsense mutation causing a protein truncation at position Q719. The premature stop codon was upstream of the E1-like activating domain, consistent with PfAtg7 loss of function (Fig 4A). In yeast and mammalian cells, Atg7 is required for conjugation of Atg8 to the autophagosome membrane [41]. PfAtg7 has not specifically been implicated in apicoplast biogenesis; however, PfAtg8 has been shown to localize to the apicoplast membrane and be required for apicoplast biogenesis. In analogy with its role in autophagy, PfAtg7 is likely required for conjugating PfAtg8 to the apicoplast membrane. Therefore, we suspected the loss-of-function mutant we identified caused apicoplast loss via loss of membrane-conjugated PfAtg8 [12].

Table 1 footnote: Candidates were prioritized based on previous literature suggesting an apicoplast function (first group) and presence of a nonsense mutation (second group). The remaining candidates were unranked. Note that the first two digits of the Pf gene ID indicate the chromosome on which the gene is located.
To confirm PfAtg7's role in apicoplast biogenesis, we generated a P. falciparum strain in which it is conditionally expressed. The endogenous PfAtg7 locus in an NF54 strain harboring a CRISPR cassette was modified with a C-terminal triple hemagglutinin (HA) tag and a 3′ untranslated region (UTR) RNA aptamer sequence that binds a tetracycline repressor (TetR) and development of zygote inhibited (DOZI) fusion protein to generate PfAtg7-TetR/DOZI [42,43]. In the presence of anhydrotetracycline (ATc), the 3′ UTR aptamer is unbound, and PfAtg7-3×HA protein was detectable, albeit at low levels consistent with its low expression in published transcriptome data (PlasmoDB.org) ( [40]. We tested whether the requirement for PfAtg7 was due to its role in apicoplast biogenesis. Growth inhibition caused by PfAtg7 knockdown was partially rescued by addition of IPP (Fig 4B and S4 Data). Furthermore, in −ATc/+IPP parasites, the apicoplast marker acyl-carrier protein (ACP) mislocalized from a discrete organellar localization to multiple cytoplasmic puncta, a hallmark of apicoplast loss (Fig 4C) [19]. Loss of transit peptide cleavage of the apicoplast protein ClpP also confirmed apicoplast loss, because mislocalized apicoplast proteins do not undergo removal of their targeting sequences (Fig 4D) [15,44]. IPP rescue of PfAtg7 knockdown parasites was incomplete, raising the possibility that PfAtg7 is also required for a non-apicoplast function. Alternatively, PfAtg7 knockdown may cause stalling of apicoplast morphologic development leading to general cellular toxicity that cannot be fully rescued with IPP until apicoplast loss is complete. To test these models, instead of PfAtg7 knockdown followed by apicoplast loss, we first induced apicoplast loss with actinonin and then assessed the effects of PfAtg7 knockdown.
PfAtg7 knockdown in apicoplast(−) parasites did not cause any additional growth defect and was fully rescued by IPP, suggesting that the partial rescue observed in apicoplast(+) parasites was due to the order of disruption (S5 Fig and S4 Data). These results confirmed that PfAtg7 is required for apicoplast biogenesis and that this is likely its only essential function.
Finally, to determine whether PfAtg7's role in apicoplast biogenesis was via PfAtg8 membrane conjugation, we transfected PfAtg7-TetR/DOZI with a transgene encoding GFP-PfAtg8. In this strain, GFP-PfAtg8 primarily localized to a branched structure in schizonts, consistent with its previously described apicoplast localization (Fig 4E) [10,45]. Upon PfAtg7 knockdown, GFP-PfAtg8 localization to this discrete structure was lost, and accumulation in the cytoplasm was observed within a single replication cycle prior to apicoplast loss (Fig 4E). This result suggests that, like yeast and mammalian Atg7 homologs, PfAtg7 has a conserved function in conjugating Atg8 to lipids. Altogether, PfAtg7 stood out as a cytoplasmic protein required for apicoplast biogenesis identified in our screen.

Forward genetics identifies three "conserved Plasmodium proteins of unknown function" required for apicoplast biogenesis
The real power of forward genetics is the ability to uncover novel pathways without any a priori knowledge. Therefore, we next turned our attention to the nearly 50% of the Plasmodium genome annotated as "conserved Plasmodium protein, unknown function." Three candidate genes (Pf3D7_0518100, Pf3D7_1305100, and Pf3D7_1363700) encoding proteins of unknown function were identified by nonsense mutations that caused protein truncation. The position of the premature stop codon near the 5′ end (Pf3D7_0518100, Pf3D7_1305100) or in the middle (Pf3D7_1363700) of the genes suggested that these were loss-of-function mutations (Fig 5A). Incidentally, all were also identified in a recently published proteomic dataset of apicoplast proteins, and immunofluorescence colocalization with the apicoplast marker ACP confirmed that Pf3D7_0518100 and Pf3D7_1305100 are localized to the apicoplast (S4 Fig) [15].
Therefore, we assessed whether knockdown of these genes disrupted apicoplast biogenesis. Similar to the experiments performed to validate PfAtg7, ATc-regulated knockdown strains for each of the candidate genes were generated. Upon ATc removal, protein levels for each gene decreased within 24 hours as detected by western blot (S2 Fig). Significant growth inhibition was also observed for all candidate genes tested, with varying kinetics of growth inhibition observed for each candidate (Fig 5B, 5E and 5H and S4 Data). Of note, the gene essentiality demonstrated here for Pf3D7_1305100 and Pf3D7_1363700 confirmed whole-genome essentiality data reported in P. berghei and/or P. falciparum. However, the essentiality of Pf3D7_0518100 did not agree with its "dispensable" annotation in the P. falciparum dataset. IPP supplementation reversed growth inhibition for all the candidates, demonstrating that their essentiality was due to an apicoplast-specific function (Fig 5B, 5E and 5H and S4 Data). Finally, mislocalization of ACP and loss of transit peptide cleavage of ClpP confirmed apicoplast loss for all candidates (Fig 5C, 5D, 5F, 5G, 5I and 5J). Because these genes have so far lacked any functional annotation and given their shared knockdown phenotype, we designated them "apicoplast-minus, IPP-rescued" (amr) genes: amr1 (Pf3D7_1363700), amr2 (Pf3D7_0518100), and amr3 (Pf3D7_1305100). Taken together, we successfully identified three novel proteins of unknown function required for apicoplast biogenesis, prioritizing these amr genes for functional studies.

Sequence analysis of AMR1 homologs suggests gene duplication followed by evolution of a new function in apicoplast biogenesis
To set up future functional studies, we noted that PfAMR1 contained a TIM-barrel domain with closest homology to indole-3-glycerol phosphate synthase (IGPS), a highly conserved enzyme in the tryptophan (trp) biosynthesis pathway [46][47][48]. This was surprising because Plasmodium and the related apicomplexan parasite Toxoplasma are trp auxotrophs [49][50][51]. Indeed, analysis of >30 apicomplexan genomes did not detect any of the other six trp biosynthetic enzymes, except the terminal enzyme tryptophan synthase (TS)-β, which was horizontally transferred into Cryptosporidium spp. [52]. Therefore, we suspected that PfAMR1 may have a function unrelated to trp biosynthesis.
To test the conservation of active-site residues, we aligned the sequences of several known IGPSs with IGPS homologs identified from P. falciparum, T. gondii, and Vitrella brassicaformis (S6 Fig) [53-55]. V. brassicaformis, a member of the Chromerids, is the closest free-living, photosynthetic relative to apicomplexan parasites. It contains a secondary plastid with the same origin as the apicoplast but, as a free-living alga [56], is also expected to have intact trp biosynthesis. Known catalytic and substrate binding residues based on enzyme structure-function studies performed in bacteria were first identified [57]. For known IGPSs and one of the V. brassicaformis IGPS homologs (Vbra_4894), the key catalytic and substrate binding residues were all conserved, despite the vast evolutionary distance between bacteria, metazoans, and Chromerids/apicomplexans (Fig 6A). However, in PfAMR1 and the other two V. brassicaformis homologs, key functional residues were not conserved. Based on the conservation of functional residues, we separated these sequences into two groups: "canonical IGPS proteins" (which have been shown, or are likely, to encode for IGPS activity) and "IGPS-like proteins" (e.g., PfAMR1), which we suggest have functionally diverged.
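The grouping into canonical IGPS and IGPS-like proteins rests on checking known functional residues column by column in a pairwise or multiple alignment. The sketch below illustrates that check under stated assumptions: the function names are hypothetical, sites are given as 1-based positions in the ungapped reference sequence, and requiring every site to be identical for a "canonical" call is a simplification of the authors' manual assessment.

```python
from typing import Dict, Iterable, List

GAP = "-"


def columns_for_sites(ref_aligned: str, sites: Iterable[int]) -> Dict[int, int]:
    """Map 1-based ungapped reference positions to alignment column indices."""
    wanted = set(sites)
    columns: Dict[int, int] = {}
    position = 0
    for column, residue in enumerate(ref_aligned):
        if residue == GAP:
            continue  # gaps do not advance the reference coordinate
        position += 1
        if position in wanted:
            columns[position] = column
    missing = wanted - set(columns)
    if missing:
        raise ValueError(f"sites beyond reference length: {sorted(missing)}")
    return columns


def conserved_sites(ref_aligned: str, query_aligned: str,
                    sites: Iterable[int]) -> List[int]:
    """Return the reference positions where the query carries the same residue."""
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    columns = columns_for_sites(ref_aligned, sites)
    return sorted(pos for pos, col in columns.items()
                  if query_aligned[col] == ref_aligned[col])


def classify(ref_aligned: str, query_aligned: str, sites: Iterable[int]) -> str:
    """Label a homolog 'canonical' only if every functional site is conserved."""
    site_set = set(sites)
    kept = conserved_sites(ref_aligned, query_aligned, site_set)
    return "canonical" if len(kept) == len(site_set) else "IGPS-like"
```

In practice the reference would be a structurally characterized bacterial IGPS and the sites its catalytic and substrate-binding residues; any toy sequences used to exercise the functions are placeholders, not real proteins.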
We next looked at the pattern of canonical IGPS versus IGPS-like proteins through two biological transitions: loss of trp biosynthesis (Vitrella versus Plasmodium spp.) and loss of the apicoplast (Plasmodium versus Cryptosporidium spp.) (Fig 6B). As expected for a role in trp biosynthesis, canonical IGPS proteins were retained until the emergence of parasitism. In addition, genes encoding the remaining set of enzymes for trp biosynthesis were identified in the V. brassicaformis genome [58]. Unlike canonical IGPSs, however, IGPS-like proteins were retained in parasites that have lost trp biosynthesis. Instead, loss of IGPS-like proteins is associated only with loss of the apicoplast in Cryptosporidium spp. This pattern of acquisition and loss of IGPS-like proteins suggests an apicoplast-specific function separate from trp biosynthesis.
Finally, we performed functional complementation to test the biochemical activity of canonical IGPS and IGPS-like genes from V. brassicaformis and P. falciparum. An E. coli strain (trpC9800) containing an inactivating mutation in trpC, the E. coli IGPS homolog, was grown on minimal agar (M9) [59]. As expected, trpC9800 was dependent on trp supplementation for growth (Fig 6C). Complementation with the Vbra_4894 homolog restored trpC9800 growth on M9, comparable to that of an E. coli strain with intact trp biosynthesis, suggesting that Vbra_4894 is an IGPS protein (Fig 6D). In contrast, neither PfAMR1 nor any of the IGPS-like genes were able to functionally replace trpC, supporting an alternative biochemical function. Because glutathione S-transferase (GST)-tagged complemented protein could not be detected in any strain, we cannot rule out that the lack of functional complementation was due to differential protein expression or solubility below the detection limit of western blot. However, isopropyl β-D-1-thiogalactopyranoside (IPTG) induction of protein expression was toxic for all complemented strains, suggesting all constructs supported protein expression. Overall, we propose that AMR1 has evolved a new biochemical function required for apicoplast biogenesis.

Discussion
Apicoplast biogenesis provides a fascinating window into molecular evolution, including examples of proteins that have been reused (e.g., translocon on the inner chloroplast membrane/translocon on the outer chloroplast membrane [TIC/TOC] complexes) [17,60,61], repurposed (e.g., Atg8, symbiont-derived ERAD-like machinery [SELMA]) [12,14], or newly invented in the process of serial endosymbioses. Overcoming significant technical challenges in the Plasmodium experimental system, we designed a forward genetic screen to identify essential apicoplast biogenesis pathways. This singular screen opens up opportunities to discover evolutionary innovations obscured by candidate-based approaches, including cytoplasmic pathways and genes lacking any functional annotations. In addition to confirming the role of PfFtsH1 in apicoplast biogenesis and identification of PfAtg8 conjugation machinery, we identified several proteins of unknown function required for apicoplast biogenesis that have so far gone undetected. Because our reporter specifically looked for apicoplast loss, we cannot rule out the possibility that some identified genes may be involved in maintenance of already formed apicoplasts. However, because no clearance mechanism is known for defective apicoplasts, we are not aware of any pathway by which defective apicoplasts would lead to organelle loss independent of parasite replication.
One surprising gene we identified was PfAMR1, which encodes a TIM-barrel domain found in diverse enzymes catalyzing small-molecule reactions [46]. PfAMR1 may have evolved from gene duplication of IGPS, an enzyme in the trp biosynthetic pathway [47]. However, the evolutionary pattern of retention in apicomplexan parasites lacking tryptophan biosynthesis and loss in Cryptosporidium spp., concomitant with plastid loss, supports a critical function of PfAMR1 in the apicoplast independent of tryptophan biosynthesis. We hypothesize that PfAMR1 may be involved in biosynthesis of a specialized lipid or signaling molecule required specifically for building new plastids in this lineage. Multiple new amr genes identified in this study provide striking examples of the unexpected findings enabled by unbiased screens.
Uncovering novel apicoplast biogenesis pathways also has important biomedical applications. While targeting the metabolic function of the apicoplast has been a major strategy for antimalarial drug discovery [62], it has become apparent that apicoplast biogenesis is at least as valuable, and likely more valuable, as a therapeutic target [22]. These distinct pathogen pathways are nonetheless required in every proliferative stage of the Plasmodium life cycle and conserved among apicomplexan parasites. Targeting apicoplast biogenesis has the benefit of efficacy against multiple Plasmodium life stages and multiple pathogens. Consistent with this broad utility, antibiotics that inhibit translation in the apicoplast and disrupt its biogenesis are used clinically for malaria prophylaxis, acute malaria treatment, and treatment of babesiosis and toxoplasmosis [7,8,63].
Until now, a forward genetic screen for essential pathways in blood-stage Plasmodium has not been achieved. Previous screens in murine P. berghei and the human malaria parasite P. falciparum identified nonessential genes required for gametocyte formation [64,65], the developmental stage required for mosquito transmission. Functional screens for essential pathways have been impeded by several technical challenges, including the low transfection efficiency of P. falciparum, in vivo growth requirement of P. berghei, and absence of efficient methods for generating conditional mutants in both organisms [28]. Nonetheless, genome-scale deletion screens in P. berghei and P. falciparum using a homologous recombination-targeted deletion library or saturating transposon-based mutagenesis, respectively, have revealed a plethora of essential genes [24,25]. Functional assignment of these essential genes is a priority. In this context, the apicoplast biogenesis screen presented here is a major milestone towards unbiased functional identification of novel, essential genes.
A top priority for "version 2.0" is to expand the screen to genome scale, maximizing our ability to uncover novel pathways. Apicoplast biogenesis is a complex process encompassing a multitude of functions and is expected to require hundreds of gene products. The identification of 12 candidate genes and our sparse sampling of known genes suggest that the current screen is far from saturating. The most significant bottleneck is the dependence of this screen on chemical mutagenesis. Mapping mutations by whole-genome sequencing limits the number of mutants that can be analyzed. Even for sequenced clones, less than half had a detectable point mutation in a coding region. The remaining apicoplast(−) clones may have contained mutations in noncoding regions or other types of mutations that are more difficult to detect (insertions, deletions, or copy number variations). Particularly for apicoplast(−) clones selected from non-mutagenized conditions, we considered the possibility that some parasites might spontaneously fail to inherit the apicoplast due to mechanical defects during parasite replication; these daughter cells resulting from erroneous apicoplast division and segregation usually would not survive but are rescued by IPP. Finally, specific mutations identified in candidate genes also need to be validated one by one. In this study, four nonsense mutations were validated by conditional knockdown, while a missense mutation in PfFtsH1 was validated using an available activity assay. Although we were able to demonstrate loss of function as a result of the PfFtsH1 I437S variant, other missense mutations identified in genes of unknown function will be challenging to follow up with available genetic tools.
Fig 6 abbreviations: IGPS, indole-3-glycerol phosphate synthase; IPP, isopentenyl diphosphate; PRT, phosphoribosyltransferase; trp, tryptophan; TS, tryptophan synthase; WT, wild-type. https://doi.org/10.1371/journal.pbio.3000136.g006
Given these limitations, an alternative mutagenesis method will increase the screen throughput. Options include adaptation of the piggyBac transposon developed for the P. falciparum deletion screen [25] or development of large-scale targeted mutagenesis. Switching to more genetically tractable apicomplexan organisms, such as P. berghei or Toxoplasma, would provide ready options for large-scale targeted gene disruptions [23,24]. However, these would have to be performed under conditional regulation because chemical rescue of the apicoplast is not feasible in these organisms. We anticipate that continued advances in genetic methods in apicomplexan organisms will open up opportunities to expand this screen in the future.

Ethics statement
Human erythrocytes were purchased from the Stanford Blood Center (Stanford, California) to support in vitro P. falciparum cultures. Because erythrocytes were collected from anonymized donors with no access to identifying information, IRB approval was not required. All consent to participate in research was collected by the Stanford Blood Center.
For transfections, 50 μg of plasmid DNA was added to 200 μL packed red blood cells (RBCs), adjusted to 50% hematocrit in RPMI 1640, and electroporated as previously described [66]. On day 4 post transfection, parasites were selected with 2.5 mg/L blasticidin S (RPI Research Products International). TetR/DOZI strains were cultured with 500 nM ATc for the entire duration of transfection. For TetR/DOZI strains expressing ACPL-GFP or GFP-PfAtg8, parasites were additionally selected with 500 μg/mL G418 sulfate (Corning) during transfection.

Cloning
Oligonucleotides were purchased from the Stanford Protein and Nucleic Acid facility or IDT. gBlocks were ordered from IDT. Molecular cloning was performed using In-Fusion Cloning (Clontech) or Gibson Assembly (NEB). Primer and gBlock sequences for all cloning are available in S3 Table. To generate the plasmid pRL2-mCherry-T2A-ACPL-GFP, T2A-ACPL-GFP was first amplified from the pRL2-ACPL-GFP vector. mCherry was amplified from pTKO2-mCherry vector (kind gift from J. Boothroyd) and inserted in front of T2A-ACPL-GFP in the pRL2 backbone using the In-Fusion Cloning kit (Takara). To generate the pL2-mCherry-T2A-ACPL-GFP-degron plasmids, T2A-ACPL-GFP-degron was amplified from pRL2-mCherry-T2A-ACPL-GFP.
For CRISPR-Cas9-based editing of endogenous Pf3D7_0518100, Pf3D7_1126100, Pf3D7_1305100, and Pf3D7_1363700 loci, sgRNAs were designed using the eukaryotic CRISPR guide RNA/DNA design tool (http://chopchop.cbu.uib.no/). To generate a linear plasmid for CRISPR-Cas9-based editing, left and right homology regions were first amplified for each gene. For each gene, a gBlock containing the recoded coding sequence C-terminal of the CRISPR cut site and a triple HA tag was synthesized with appropriate overhangs for Gibson Assembly. This fragment along with the left homology region was simultaneously cloned into the FseI/ApaI sites of the linear plasmid pSN054-V5. Next, the appropriate right homology region and a gBlock containing the sgRNA expression cassette were simultaneously cloned into the AscI/I-SceI sites of the resultant vector to generate the plasmids.
To generate plasmid for expression of GFP-PfAtg8, GFP with a GlyAlaGlyAla linker was amplified from pRL2-ACP L -GFP. PfAtg8 was amplified from P. falciparum gDNA. Both fragments were inserted into pfYC110 vector [38] using the In-Fusion Cloning kit.
V. brassicaformis RNA from strain CCMP3346 was purchased from the National Center for Marine Algae and Microbiota and was subsequently used to generate cDNA using the Superscript III cDNA Kit (Life Technologies). For cloning of Plasmodium PF3D7_1363700, the construct was assembled from codon-optimized gBlocks. Constructs were cloned into the pGEXT vector using the In-Fusion Cloning kit.

Degron screening
Ring-stage mCherry-T2A-ACPL-GFP-degron parasites were treated with 10 μM actinonin (Sigma) and 200 μM IPP (Isoprenoids, LLC) to disrupt the apicoplast. Both treated and nontreated parasites were analyzed two cycles post treatment at the schizont stage on a BD Accuri C6 flow cytometer. For each condition, 100,000 to 500,000 events were recorded. Uninfected RBCs were first removed from the population by setting a gate for mCherry fluorescence. For each strain, the average GFP and mCherry fluorescence intensities were then calculated for the infected cell population using FlowJo. For example, if 10,000 infected cells were counted, then the GFP and mCherry fluorescence for each cell was measured by the flow cytometer, and the average fluorescence was determined for the whole population. To calculate the GFP:mCherry ratios for comparative analysis of degron efficiency, the GFP to mCherry fluorescence ratio for each individual infected cell was first obtained. The fluorescence ratios of all infected cells were then averaged to determine the overall population GFP:mCherry ratio.
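The two summary statistics described above (population mean fluorescence, and the mean of per-cell GFP:mCherry ratios) can be illustrated with a minimal Python sketch. It is not the FlowJo workflow used in the study: the function name, the tuple layout of the events, and the single mCherry threshold standing in for the infected-cell gate are all assumptions made for illustration.

```python
from statistics import mean

def gfp_mcherry_summary(events, mcherry_gate):
    """Summarize GFP and mCherry signal for infected cells.

    events       -- iterable of (gfp, mcherry) intensities, one pair per event
    mcherry_gate -- hypothetical threshold; events at or below it are treated
                    as uninfected RBCs and discarded
    Returns (mean GFP, mean mCherry, mean of per-cell GFP:mCherry ratios).
    """
    if mcherry_gate < 0:
        raise ValueError("mcherry_gate must be non-negative")
    # Gate on mCherry first, so every retained cell has mCherry > 0.
    infected = [(g, m) for g, m in events if m > mcherry_gate]
    if not infected:
        raise ValueError("no events passed the mCherry gate")
    mean_gfp = mean(g for g, _ in infected)
    mean_mcherry = mean(m for _, m in infected)
    # The ratio is taken per cell and then averaged over the population,
    # which is not the same as dividing mean GFP by mean mCherry.
    ratio = mean(g / m for g, m in infected)
    return mean_gfp, mean_mcherry, ratio
```

Averaging per-cell ratios weights every infected cell equally regardless of its overall expression level, whereas dividing the two population means would let brightly expressing cells dominate the result.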

Mutant screening
Ring-stage mCherry-T2A-ACPL-GFP-EcssrA (EcssrA) parasites were seeded onto a 96-well plate at a volume of 200 μL, 2% hematocrit, and 0.5%-1% parasitemia. Parasites were either untreated or treated with 1 mM EMS or 100 μM ENU for 2 hours, and then washed three times afterwards to remove the mutagen from the growth media. Parasites were cultured in growth media + 200 μM IPP for the duration of the screen.
At 120 hours post treatment, mutants were isolated on a Sony SH800S Cell Sorter. Uninfected RBCs were first analyzed to set the gate for overall fluorescence. Untreated EcssrA parasites were analyzed to gate for positive and negative mCherry and GFP expression, respectively. Actinonin-treated EcssrA parasites were analyzed to gate for positive GFP expression. Non-mutagenized and mutagenized parasites displaying both positive mCherry and GFP expression were sorted by FACS into a new 96-well plate. Enriched parasites were allowed to propagate to a detectable parasitemia before being subjected to subsequent FACS rounds.
Mutants were enriched until mCherry and GFP fluorescence approached actinonin-treated levels. In the final enrichment, up to 52 mutants were single-cell cloned. Mutants that survived single-cell sorting were split into growth media containing either 200 μM IPP or no IPP. Mutants displaying growth only in IPP were expanded to a 10 mL culture at approximately 10% parasitemia and ring-stage synchronized; 5 mL of culture was harvested for DNA extraction, and 5 mL culture was frozen at −80˚C.

DNA isolation and whole-genome sequencing
Ring-stage parasites were isolated from RBCs in 0.1% saponin and washed three times in PBS. gDNA from mutants and the parental strain was isolated using the Quick-DNA Universal Kit (Zymo Research). Paired-end gDNA libraries were generated and barcoded for each mutant and the parental strain using the Nextera Library Prep Kit, modified for 8 PCR cycles (Stanford Functional Genomics Facility). Up to 26 pooled libraries were analyzed per lane of an Illumina HiSeq 4000 using 2 × 75 bp, paired-end sequencing (Stanford Functional Genomics Facility).

SNP analysis
Fastq files were checked for overall quality using FastQC. Ten and 15 bp were trimmed from the 5′ and 3′ ends of all 75 bp sequence reads, respectively, to remove low-quality bases; 20 and 30 bp were trimmed from the 5′ and 3′ ends, respectively, of 150 bp sequence reads from an additional parental Dd2 strain sequenced by the DeRisi lab (https://www.ncbi.nlm.nih.gov/sra/SRX326518). The resulting paired-end sequencing reads were mapped using Bowtie2 against the P. falciparum 3D7 (version 35) reference genome. One mismatch per read was allowed, and only unique reads were aligned (reads were removed if they aligned to more than one region of the genome). PCR duplicates were removed using Samtools rmdup, and raw SNVs were called using Samtools mpileup. Indels were not analyzed.
Bcftools was used to generate the raw variant list for parental (allele frequency ≥ 0.05, depth ≥ 1) and mutant (allele frequency ≥ 0.9, depth ≥ 5) strains. Variants found in the parental strain were excluded from the mutant variant list. Variants were filtered to only include protein-coding mutations (missense and nonsense). Mutations found in hypervariable gene families were excluded. Remaining variants that were previously annotated in PlasmoDB were excluded to generate the final variant list. The reported variants were confirmed to meet the filtering requirements using Samtools tview. Mutants containing nonsense mutations were Sanger sequenced to confirm the presence of the mutations prior to genetic validation. The custom script and parameters used for analysis are available at https://github.com/yehlabstanford/biogenesis_screen.
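The order of the filtering steps can be sketched in Python as follows. This is an illustration only and not the custom script in the repository cited above: the `Variant` record, its field names, and the function signature are hypothetical, and the allele-frequency and depth thresholds are assumed to be lower bounds (≥).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    # Hypothetical record layout; the real pipeline works on Bcftools output.
    chrom: str
    pos: int
    alt: str
    gene: str
    effect: str        # e.g. "missense", "nonsense", "synonymous"
    allele_freq: float
    depth: int

def site(v):
    """Key used to decide whether two calls describe the same variant."""
    return (v.chrom, v.pos, v.alt)

def final_variant_list(parental, mutant, hypervariable_genes, known_sites):
    """Apply the filters in the order described in the text."""
    # Permissive parental thresholds (assumed >=) so that even weakly
    # supported parental variants are subtracted from the mutant calls.
    parental_sites = {site(v) for v in parental
                      if v.allele_freq >= 0.05 and v.depth >= 1}
    kept = []
    for v in mutant:
        if v.allele_freq < 0.9 or v.depth < 5:
            continue  # fails the stricter mutant thresholds (assumed >=)
        if site(v) in parental_sites:
            continue  # already present in the parental strain
        if v.effect not in {"missense", "nonsense"}:
            continue  # keep protein-coding changes only
        if v.gene in hypervariable_genes:
            continue  # hypervariable gene families are excluded
        if site(v) in known_sites:
            continue  # previously annotated variant (e.g. from PlasmoDB)
        kept.append(v)
    return kept
```

Under this scheme the parental filter is deliberately loose and the mutant filter deliberately strict, so a variant must be well supported in the clone and have no trace in the parent to reach the final list.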

Fractionation assay
Parasites expressing GFP-PfAtg8 were grown in the presence or absence of ATc for 24 hours; 10 mL cultures were lysed with 0.1% saponin and washed three times with PBS. Parasite pellets were resuspended in ice-cold lysis buffer (1× PBS, 1% Triton X-114 [Thermo Scientific 28332], 2 mM EDTA, 1× protease inhibitors [Pierce A32955]) and incubated on ice for 30 minutes. Cell debris was removed by 10-minute centrifugation at 16,000 × g, 4˚C. Supernatant was transferred to a fresh Eppendorf tube, incubated 2 minutes at 37˚C to allow phase separation, and centrifuged 5 minutes at 16,000 × g at room temperature. The top (aqueous) layer was transferred to another tube. The interphase was removed to avoid cross-contamination between the layers. The bottom (detergent) layer was resuspended in 1× PBS, 0.2 mM EDTA to equalize the volumes of the two fractions. Both fractions were subjected to methanol-chloroform precipitation, resuspended in PBS containing 2× NuPAGE LDS sample buffer, boiled for 5 minutes at 95˚C, and analyzed by western blot as described above.

Microscopy
For live imaging, parasites were settled onto glass-bottom microwell dishes Lab-Tek II chambered coverglass (Thermo Fisher 155409) in PBS containing 0.4% glucose and 2 μg/mL Hoechst 33342 stain (Thermo Fisher H3570). Cells were imaged with a 100×, 1.4 NA objective on an Olympus IX70 microscope with a DeltaVision system (Applied Precision) controlled with SoftWorx version 4.1.0 and equipped with a CoolSnap-HQ CCD camera (Photometrics). Images were captured as a series of z-stacks separated by 0.2 μm intervals, deconvolved (except for mCherry images), and displayed as maximum intensity projections. Brightness and contrast were adjusted equally in SoftWorx or Fiji (ImageJ) for display purposes.
For fixed-cell immunofluorescence imaging, parasites were first fixed with 4% paraformaldehyde (Electron Microscopy Sciences 15710) and 0.0075% glutaraldehyde in PBS (Electron Microscopy Sciences 16019) for 20 minutes. Cells were washed once in PBS and allowed to settle onto poly-L-lysine-coated coverslips (Corning) for 60 minutes. Coverslips were then washed once with PBS, permeabilized in 0.1% Triton X-100/PBS for 10 minutes, and washed twice more in PBS. Cells were treated with 0.1 mg/mL NaBH4/PBS for 10 minutes, washed once in PBS, and blocked in 5% BSA/PBS. Primary antibodies were diluted in 5% BSA/PBS at the following concentrations: 1:500 rabbit-α-PfACP (kind gift from S. Prigge) and 1:100 rat-α-HA 3F10 (Sigma 11867423001). Coverslips were washed three times in PBS, incubated with secondary antibodies goat-α-rat 488 (Thermo Fisher A-11006) and donkey-α-rabbit (Thermo Fisher A10042) at 1:3,000 dilution, and washed three times in PBS prior to mounting in ProLong Gold antifade reagent with DAPI (Thermo Fisher).

Knockdown assays
Ring-stage TetR/DOZI strain parasites were washed two times in growth media to remove ATc. Parasites were divided into three cultures supplemented with 500 nM ATc, no ATc, or no ATc + 200 μM IPP. Samples were collected at the schizont stage in each growth cycle for flow cytometry analysis and western blot. Parasites in each condition were diluted equally every growth cycle for up to six growth cycles.
For parasitemia measurements, parasite-infected or uninfected RBCs were incubated with the live-cell DNA stain dihydroethidium (Thermo Fisher D23107) for 30 minutes at a dilution of 1:300 (5 mM stock solution). Parasites were analyzed on a BD Accuri C6 flow cytometer, and up to 100,000 events were recorded.

Protein expression and purification
The parent His6-SUMO-PfFtsH1(91-612)-GST plasmid as well as E249Q and D493A mutants were obtained from laboratory stocks. An I437S mutant was constructed by site-directed mutagenesis. Recombinant proteins were expressed from these plasmids and purified as described [22].

Alignment of IGPS and IGPS-like proteins
IGPS and IGPS-like proteins from V. brassicaformis were identified by BLAST using CryptoDB. First, secondary structure was predicted using PSI-PRED in the XtalPred suite [53,55]. Only the TIM-barrel region of each sequence was used for alignment because there are large N- and C-terminal extensions in the noncanonical proteins. PROMALS3D was subsequently used to perform a multiple sequence alignment based on secondary structure and homology to proteins with determined 3D structures [54].

Bacterial complementation assay
The W3110trpC9800 E. coli strain was purchased from the Yale University Coli Genetic Stock Center, and cells were made chemically competent using calcium chloride. BL21 Star (DE3) competent cells (Thermo Fisher) were used for the WT condition. The competent W3110trpC9800 cells were transformed with the pGEXT vectors containing the different Vitrella or Plasmodium IGPS and IGPS-like genes and were plated on LB agar plates containing carbenicillin. For each construct, a colony was picked and washed in M9 minimal media (M9) (22 mM potassium phosphate monobasic, 22 mM sodium phosphate dibasic, 85 mM sodium chloride, 18.7 mM ammonium chloride, 2 mM magnesium sulfate, 0.1 mM calcium chloride, and 0.4% glycerol), resuspended in M9, and streaked onto M9/agar plates containing either carbenicillin (100 μg/mL) or carbenicillin and 1 mM L-tryptophan (Sigma). Plates were incubated at 37˚C and allowed to grow for two days, after which images of the plates were taken.

parasites. Data are shown as mean ± SD (n = 2). *P < 0.05, **P < 0.01, ***P < 0.001 compared to untreated control (−ATc black asterisks, −ATc/+IPP red asterisks), one-sample t test. Tabulated data are shown in S4 Data. (B) Apicoplast loss precedes PfAtg7 knockdown and IPP in apicoplast(−) parasites. Apicoplast(−) parasites were generated via actinonin/IPP treatment. Data are shown as mean ± SD (n = 2). **P < 0.01, ***P < 0.001 compared to untreated control (−ATc black asterisks), one-sample t test. Tabulated data are shown in S4 Data. (TIF)
S6 Fig. Protein sequence alignment of IGPS and IGPS-like protein sequences from various organisms using PROMALS3D. Residues involved in substrate binding and catalysis (based on the E. coli sequence) are marked with an asterisk and are highlighted in yellow, respectively. Blue and red residues represent predicted β-sheets and α-helices, respectively. All other residues have no predicted secondary structure. Highly conserved residues are represented as bold uppercase letters in the consensus line. Other consensus symbols are as follows: b: bulky; c: charged; h: hydrophobic; p: polar; s: small; t: tiny; l: aliphatic; "+": positive; "-": negative; "@": aromatic. (TIF)
S1 (XLSX)
S1 Data. Spreadsheet containing tabulated data for Figs 1C, S1D and S1F.