Evolutionary diversification of the HAP2 membrane insertion motifs to drive gamete fusion across eukaryotes

HAPLESS2 (HAP2) is a broadly conserved, gamete-expressed transmembrane protein that was shown recently to be structurally homologous to viral class II fusion proteins, which initiate fusion with host cells via insertion of fusion loops into the host membrane. However, the functional conformation of the HAP2 fusion loops has remained unknown, as the reported X-ray structure of Chlamydomonas reinhardtii HAP2 lacked this critical region. Here, we report a structure-guided alignment that reveals diversification of the proposed HAP2 fusion loops. Representative crystal structures show that in flowering plants, HAP2 has a single prominent fusion loop projecting an amphipathic helix at its apex, while in trypanosomes, three small nonpolar loops of HAP2 are poised to interact with the target membrane. A detailed structure-function analysis of the Arabidopsis HAP2 amphipathic fusion helix defines key residues that are essential for membrane insertion and for gamete fusion. Our study suggests that HAP2 may have evolved multiple modes of membrane insertion to accommodate the diversity of membrane environments it has encountered during eukaryotic evolution.


Introduction
The fusion of gamete plasma membranes to form a zygote is central to sexual reproduction, yet a molecular mechanism for this fundamental process has only very recently been proposed. The crystal structure of the C. reinhardtii HAPLESS 2 (HAP2) ectodomain (CrHAP2e) [1] revealed that this broadly conserved, gamete-expressed transmembrane protein [2-4] has the same three-dimensional fold as class II viral fusion proteins. It was proposed that, like its viral counterparts, HAP2 initiates gamete fusion by insertion of fusion loops into the opposing gamete plasma membrane [5]. Consistent with this idea, mutations in the proposed fusion loops of CrHAP2e disrupted its ability to insert into artificial membranes in vitro and mediate gamete fusion in vivo [1]. Because the reported CrHAP2e structure did not include the critical region of the fusion loops, the question of how HAP2 inserts into the opposing gamete plasma membrane was left unanswered.
Viral fusion proteins have been divided into several structural classes that use either fusion peptides (class I) or fusion loops (classes II and III) to insert into host cell membranes [6]. HAP2 is homologous to the viral class II fusion proteins, which fold into an articulated rod made of three β-sheet-rich domains, termed I, II, and III, with the central domain I flanked by domains II and III in the prefusion form [7-9]. Domain III connects to the C-terminal transmembrane domain via a flexible linker, while domain II bears the fusion loops at the opposite end.
Stable insertion into the host membrane is required to withstand pulling forces that occur as the fusion protein bridges the intermembrane space via a transient extended intermediate state that later collapses into a trimeric "hairpin" conformation, which drives the merger of the viral envelope and host membrane [10]. The interaction with membranes was best studied for the class I proteins, in which the fusion peptide is an N-terminal extension that folds independently of the rest of the fusion protein upon its insertion into a lipid bilayer, exposing a relatively extensive nonpolar platform to the outer leaflet [11]. In the case of the influenza virus hemagglutinin, the prototypic class I fusion protein, it was shown that the fusion peptide forms an α-helical hairpin [12], exposing nine bulky nonpolar side chains to the external lipid layer of the host membrane. In contrast, class II proteins from arthropod-borne viruses (arboviruses) were proposed to insert internal fusion loops located at the domain II "tip," i.e., at the end of the protein opposite the C-terminal transmembrane domain integral to the viral envelope. Unlike the class I fusion peptides, these fusion loops do not change conformation upon interaction with lipids, and the fusion protein of the Rift Valley fever virus (RVFV) was shown to feature an internal pocket that specifically accommodates the head group of glycerophospholipids from the outer leaflet of the target membrane [13]. The resulting polar interaction network allows stable insertion of the protein with only one or two bulky aromatic side chains of the fusion loops exposed to the aliphatic moiety of the membrane. Understanding how HAP2 achieves stable insertion into the target membrane is key for understanding the molecular mechanism driving gamete fusion. The diversity of reproductive systems in which HAP2 has been implicated across eukaryotic organisms likely represents a large variety of fusion environments (e.g., lipid compositions of gamete membranes), and it remains unknown whether HAP2 has evolved different membrane interaction surfaces to secure the initial interaction with the target membrane.
Funding: Recurrent funding from the CNRS received by FAR. Recurrent funding from the Institut Pasteur received by FAR. Allocation ministerielle pour l'Ecole Polytechnique AMX received by JFe. National Science Foundation (NSF) (grant number IOS-1353798) received by MJ. National Science Foundation (NSF) (grant number IOS-1540019) received by MJ. National Institutes of Health Training Grant (grant number T32-GM007601) received by JFo. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: I have read the journal's policy and the authors of this manuscript have the following competing interests: FAR is a consultant for Flagship Pioneering.
Here, we carried out a comparative structure/function study of HAP2 from distantly related eukaryotes and found striking sequence variability at the domain II tip, which features multiple insertions and deletions, contrasting with the relative conservation of the rest of the protein. We then obtained structural data for HAP2 from two organisms that displayed the most contrasting diversity in this region: the flowering plant Arabidopsis thaliana (AtHAP2) and the protozoan Trypanosoma cruzi (TcHAP2). The X-ray structures confirmed that the membrane interaction surfaces were entirely different: while in AtHAP2 there is a single fusion loop that projects an amphipathic helix (termed αF) toward the membrane (an unprecedented feature in viral class II fusion proteins), in TcHAP2, the membrane interaction surface is composed of three short loops. Biochemical and genetic experiments focusing on αF confirmed that the nonpolar residues are required for membrane insertion in vitro and for gamete fusion in vivo. Bioinformatic analyses show that αF is likely to be conserved across flowering plants, and functional studies revealed that it is interchangeable between rice and Arabidopsis, which are among the most distantly related flowering plants.

Sequences encoding the proposed HAP2 fusion loops are highly divergent
The domain II tip is the region of class II fusion proteins that must firmly insert into target membranes to initiate the fusion process. To probe the potential structural diversity of HAP2 in this region, we used the structure of CrHAP2e to guide the alignment of 38 HAP2 amino acid (aa) sequences representing organisms from four eukaryotic kingdoms (Fig 1, S1 Table). We focused on the region corresponding to the b, c, and d β-strands and their interstrand connections at the domain II tip (Fig 1), which contains the fusion loops [1] (Fig 1B). Nine invariant residues anchored the alignment: seven cysteines (in green background) participating in highly conserved disulfide bonds; a conserved glutamate in β-strand b; and a conserved arginine residue in the cd connection, making a salt bridge with the conserved glutamate, both residues shown in red background in Fig 1B (see Materials and methods). The resulting alignment showed, as expected, that sequences from relatively closely related groups (e.g., flowering plants) were similar to each other. But the comparison between different phylogenetic groups revealed three regions (vertical blue frames, Fig 1B) with high diversity in length and primary sequence right at the tip of domain II. Regions 2 and 3 correspond to the two CrHAP2 predicted fusion loops [1] in the connection between strands c and d. Variable region 1 is at the beginning of the long bc connection, immediately downstream of strand b and preceding disulfide 2 at the membrane-proximal end of the bdc β-sheet (Fig 1C). Residues within this region of the bc connection were indeed shown to be part of the membrane-interacting region of other class II fusion proteins, e.g., bunyavirus Gc [14,15]. Although variable region 1 is very short and retracted in CrHAP2 [1], and residues from this region are not predicted to reach the target membrane, the sequence alignment indicates that this is not the case in HAP2 from other organisms. For example, several orthologs have an insertion (left-most vertical blue box, variable region 1, Fig 1B), the most prominent in the apicomplexan parasite Toxoplasma gondii, potentially making a loop projecting apically to make contact with the target membrane.
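The idea of anchoring an alignment on invariant residues can be illustrated with a minimal sketch that scans a multiple sequence alignment for columns in which every sequence carries the same residue and no gap. The function name `invariant_columns` and the toy sequences are hypothetical and are not HAP2 data; the study's alignment was produced with MUSCLE guided by the CrHAP2e structure, not with this code.

```python
def invariant_columns(alignment, gap="-"):
    """Return (column index, residue) for every strictly conserved, gap-free column."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    anchors = []
    for i in range(length):
        # Collect the distinct characters found in column i.
        column = {seq[i] for seq in alignment}
        if len(column) == 1 and gap not in column:
            anchors.append((i, column.pop()))
    return anchors


# Toy alignment (hypothetical sequences for illustration only).
toy = [
    "C-AEGKCR",
    "CTAE--CR",
    "CSGEDKCR",
]
print(invariant_columns(toy))
```

In this toy case the conserved cysteine, glutamate, and arginine columns are reported, mimicking the kind of invariant positions (cysteines, a glutamate, and an arginine) that the text describes as alignment anchors.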
The HAP2 segment predicted to form fusion loop 1 in Chlamydomonas is highly variable (blue frame 2 in Fig 1B). This region is absent in flowering plants, displays an additional and unique pair of cysteine residues expected to make an extra disulfide bond in kinetoplastids, and is much larger in insects (66-residue insertion in Tribolium castaneum). The alignment also shows that most orthologs have significant deletions in the variable region 3 (third blue frame, Fig 1B), as observed in protozoan and animal sequences from the Porifera, Cnidaria, and Annelida. The short segment corresponding to the CrHAP2 α0 helix varies in length from four to seven residues and features α-helix-breaking residues (glycine or proline) in several orthologs (Fig 1B, in between the second and third blue boxes), suggesting α0 may not be a conserved structural feature of HAP2. The alignment of the predicted HAP2 membrane interaction domain from species representing the eukaryotic diversity therefore suggests that HAP2 has evolved multiple structural motifs for insertion into the target membrane.
Fig 1. (A) Linear diagram of the HAP2 primary structure indicating the positions of the signal peptide ("S"), the polypeptide segments corresponding to domains I, II, and III ("DI-III"), highlighting the tip of domain II (the bdc β-sheet, aligned in B), the pfam HAP2-GCS1 domain (pf10699), the linker ("L") between DI and DIII, the stem that connects the ectodomain to the transmembrane domain ("tm"), and the cytosolic tail. (B) HAP2 aa sequence alignment obtained with MUSCLE [16] using representative members from four out of the five eukaryotic kingdoms [17] named on the left column. A. thaliana and T. cruzi, the two orthologs studied further in this work, are highlighted in purple background, as well as C. reinhardtii, which served as reference. Only the bdc β-sheet region is shown, with black arrows marking the β-strands in the linear diagram drawn above, and the intervening connections in yellow (bc and cd) [1] with the putative CrHAP2 fusion loops 1 and 2 in the cd connection labeled. In the alignment, conserved and semiconserved residues are highlighted on red and beige backgrounds, respectively. A green background highlights cysteine residues, most of which are strictly conserved, involved in disulfide bonds numbered above the alignment according to the CrHAP2 structure. For clarity, insertions with respect to CrHAP2 are omitted, but their length is indicated on a dark blue background; the three major insertion/deletion hotspots in regions potentially interacting with the target membrane are framed in blue and numbered above the alignment in blue circles. The CrHAP2 α0 helix separating loops 1 and 2 is indicated with a helical symbol immediately above the sequence. Potential helix-breaking residues (glycine or proline) in the α0 region are indicated in bold characters. A random coil above the A. thaliana sequence highlights the absence of the α0 helix, and a helical symbol on a cyan background indicates the position of the A. thaliana fusion helix αF described in the text (Figs 2 and 3).

The HAP2 core structure is conserved across eukaryotes
To understand the organization of the divergent structural motifs used by HAP2 for membrane insertion, we selected representative sequences for further study from the flowering plants and the kinetoplastids, which were among those with the most contrasting features in variable regions 2 and 3 (Fig 1B). We expressed the recombinant ectodomains of A. thaliana HAP2 (AtHAP2e; aa 25-494) and T. cruzi HAP2 (TcHAP2e; aa 26-516) in Drosophila Schneider 2 (S2) cells for structural studies (see Materials and methods). Size exclusion chromatography (SEC) and multiangle static light scattering (MALS) revealed that both proteins behaved as monomers in solution (S1A-S1D Fig). AtHAP2e crystallized in the P6₃ hexagonal space group and diffracted anisotropically to a mean nominal resolution of 2.75 Å. TcHAP2e only crystallized upon limited proteolysis using subtilisin (see Materials and methods).
[...] [18]), RVFV Gc (PDB 6EGU; [13]), RV E1 (PDB 4ADI; [19]) and TBEV E (PDB 1URZ; [20]). The structures show the same organization of the three domains (yellow, red, and blue), which also share the same fold. Note that as expected, the viral proteins have evolved further from each other, as the same secondary structure elements are present, but their relative orientations have changed. The HAP2 structures can be considered as resembling a "frozen" ancestor of the class II fusion proteins observed on present-day viruses, given the huge difference in generation time and in the rate of mutations of RNA virus replication when compared with eukaryotic organisms. The statistics obtained with the DALI server [21] from the comparison of domain II from the various structures are provided in S3 Table. The HAP2 orthologs can be aligned with respect to the membrane using the highly conserved α2-helix. In contrast, the same helix does not display a conserved orientation and length in viral class II fusion proteins and therefore does not allow a similar alignment.

An amphipathic α-helix is the membrane insertion element in HAP2 of flowering plants
The AtHAP2 fusion loop features an amphipathic helix (αF, Figs 2, 3 and 4) at its apex, positioned to interact parallel to the target membrane by insertion of its nonpolar surface (I171, F172, M175, I176) (Fig 4). Heliquest [22] predicted amphipathic helices in this segment for all plant and also for algal HAP2 sequences, including [...] raising the possibility that the local conformation may be different in the aqueous solution used for crystallization than when bound to a lipid bilayer. A similar situation was observed for the fusion loops of the rubella virus fusion protein E1 [19].
The structural results, when combined with the sequence alignment, support the notion that HAP2 has evolved multiple modes for membrane insertion, with the algae having two

The HAP2 ectodomains insert into liposomes as trimers via the domain II tip
To test whether AtHAP2 can indeed interact with membranes as predicted by the structures, we analyzed the behavior of AtHAP2e by mixing the monomeric protein (S1B Fig) with liposomes of varying composition (see Materials and methods), followed by gradient ultracentrifugation. We designed the gradients so that proteins bound to liposomes would float to the top fractions, while the unbound protein would sediment. We found that the soluble, wild-type (WT) AtHAP2e migrated to the top of the gradient, in contrast to a triple alanine substitution mutant in αF, which remained at the bottom of the tube (Fig 5A, S1 Data), indicating that HAP2 binds membranes and that the bulky nonpolar side chains of αF are required. Of note, we found that AtHAP2 association with liposomes was enhanced when artificial membranes included the negatively charged phospholipid 1,2-dioleoyl-sn-glycero-3-phospho-L-serine (DOPS; see Materials and methods). Electron microscopy further showed that both AtHAP2e (Fig 5B-5D [...] [19,23,24]. The size and shape of the projections observed on the HAP2e proteoliposomes are also compatible with a trimeric postfusion form and not with the monomers that were used to initiate the experiment (Fig 5B-5D), indicating that trimerization takes place upon interaction with lipids. Taken together, these results confirm that, as observed for CrHAP2 and a number of viral class II fusion proteins, the interaction with membranes leads to trimerization and that the trimers interact with the membrane via the variable, nonpolar surface of the domain II tip. A similar trimerization process occurred in the crystallization drops in the case of AtHAP2 and CrHAP2, which were also set to crystallize as monomers. Trimerization in the crystallization drops was also described for the flavivirus class II protein, albeit at acidic pH [25]. In the HAP2 case, the trigger for trimerization, which is irreversible for class II fusion proteins [26], is likely to have been the very high protein concentration used in the crystallization trials. The physiological trigger of HAP2 trimerization to induce gamete fusion in vivo remains to be understood.

αF amphipathicity is required for AtHAP2 in vitro liposome insertion and in vivo function
Flowering plant sperm are nonmotile and are delivered to female gametes in the cytoplasm of a pollen tube ([27], Fig 6A). Rupture of the pollen tube releases a pair of isogenic sperm cells; one fuses with the egg to produce a zygote, the other with the central cell to initiate endosperm [...] and 6C). These two gamete fusion events are the defining feature of the flowering plants and are essential for the production of grain crops. To assess the role of αF in plant gamete fusion, we utilized a genetic transmission assay (Fig 6D), which allowed us to determine whether HAP2 variants were able to restore gamete fusion to Arabidopsis hap2-2 null mutant sperm. In crosses between WT (HAP2/HAP2) females and hap2-2/HAP2 heterozygous males, the hap2-2 allele is not inherited by progeny (Fig 6D, [28,29]). However, if a transgene carrying fully functional HAP2 is introduced into hap2-2/HAP2 plants, 33% of progeny will inherit hap2-2 [...]
We predicted that the nonpolar residues (I171, F172, M175, I176) on the nonpolar face of αF (Fig 4A-4C) would be critical for insertion into female gamete plasma membranes. To test this hypothesis, we examined whether these residues were important for association of AtHAP2e with liposomes (Fig 5A). When we mutated F172 to alanine, membrane association was decreased, and when I171, F172, and I176 were mutated to alanine (IFI>AAA triple mutant), membrane association was almost completely abrogated; the majority of AtHAP2e was found at the bottom of the sucrose gradient (Fig 5A) and was not associated with liposomes (Fig 5E and 5F). In our genetic analysis in vivo, mutating either I171 or I176 [...] We conclude that these hydrophobic αF residues, which are not individually essential for function, provide a hydrophobic surface at the domain II tip required for insertion into the egg and central cell plasma membrane to initiate HAP2-driven gamete fusion.
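The 33% expectation in the transmission assay follows from simple counting, which the sketch below makes explicit. It assumes (these are illustrative assumptions, not statements from the text) that the male parent is hap2-2/HAP2 and hemizygous for a single, unlinked transgene copy, that the four resulting sperm genotypes are equally frequent, and that every sperm carrying a functional HAP2 copy is equally likely to sire a seed. The function name `expected_hap2_2_transmission` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def expected_hap2_2_transmission(transgene_is_functional):
    """Expected fraction of progeny inheriting hap2-2 through the male parent."""
    fertile_alleles = []
    # Four sperm genotypes: endogenous allele x presence/absence of the transgene.
    for allele, carries_transgene in product(("HAP2", "hap2-2"), (True, False)):
        rescued = carries_transgene and transgene_is_functional
        # A sperm can fuse only if it expresses a functional HAP2 copy.
        if allele == "HAP2" or rescued:
            fertile_alleles.append(allele)
    return Fraction(fertile_alleles.count("hap2-2"), len(fertile_alleles))


print(expected_hap2_2_transmission(True))
print(expected_hap2_2_transmission(False))
```

Under these assumptions, three of the four sperm genotypes are fertile when the transgene is functional and one of those three carries hap2-2, giving 1/3; with a nonfunctional variant, only the two HAP2 sperm genotypes are fertile and hap2-2 transmission is zero. Partially functional variants would be expected to fall between these two limits.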

The conserved E117-R163 salt bridge is essential for Arabidopsis gamete fusion
It was proposed that the salt bridge between the invariant R185 and E126 in CrHAP2 (Figs 1B, 2B and 2C) constrains the domain II tip and helps present the predicted fusion loops to the target membrane [1]. However, Tetrahymena thermophila (Tt) mating was not eliminated when the corresponding arginine was mutated [5], raising the question of whether the proposed function is maintained across species. We found that A. thaliana gamete fusion was completely abolished upon mutation of the corresponding arginine residue to alanine (R163A, Fig 6F). The aa sequence of TtHAP2 shows that there is a lysine two residues downstream from the conserved arginine (Fig 1B, 21st line, Ciliophora in the "Chromista" block). It is possible that, given the plasticity of this region, a salt bridge between this lysine and the conserved glutamate in β-strand b can rescue the fusion activity, but further experiments with TtHAP2 would be required to test this option.
Fig 6. [...] female gametophyte. (B) A schematic showing a pollen tube that has burst into a synergid, propelling the sperm to the site of fusion with the egg or central cell. (C) Arrows highlight the fusion of one sperm with the egg and the other with the central cell. (D) Assay for HAP2 function in vivo using transmission of hap2-2 (T-DNA insertion carries kanamycin resistance, "KanR"). hap2-2/+ plants carrying WT or variant HAP2:YFP transgenes were produced. These plants will generate four pollen/sperm genotypes. Endogenous HAP2 (black) or introduced HAP2:YFP construct (purple/yellow) are shown at the sperm surface. The fertility outcome for each sperm genotype is indicated; a functional copy of HAP2 (either endogenous or a functional HAP2:YFP variant) must be expressed by sperm to produce a viable seed. Only hap2-2 sperm that carry a functional HAP2:YFP variant transgene will sire KanR progeny when outcrossed to ms1 females. Expression of a fully functional HAP2:YFP variant will result in 33% KanR seedlings. (E) The Arabidopsis ("At") and rice ("Os") HAP2 membrane interaction surfaces including the αF helix are shown; the region chosen for mutagenesis is identified with a box. (F) Quantitative analysis of effect of HAP2:YFP variants on hap2-2 transmission (see S2 Data). Each data point (circle) represents the percentage of progeny from a single cross-pollination that inherited the hap2-2 allele; the mean is indicated with a red diamond. A minimum of three independent transgenic lines were analyzed for each HAP2:YFP variant (see Materials and methods). HAP2:YFP variants tested are indicated on the x-axis: none, no transgene; others defined in text; p-values from the Wilcoxon rank sum test comparing each variant to AtHAP2 are indicated: * p < 0.05, *** p < 0.001. (G) HAP2:YFP is detectable in sperm cells of transgenic plants. Confocal micrographs show HAP2:YFP signal from the indicated variant and autofluorescence of the pollen cell wall. Scale bar, 5 μm. A schematic depicts the arrangement of a pair of sperm cells within the pollen grain; the pollen nucleus and cell wall are depicted. HAP2, HAPLESS 2; ms1, male sterile 1; ns, not significant; T-DNA, transfer DNA; WT, wild type; YFP, yellow fluorescent protein. https://doi.org/10.1371/journal.pbio.2006357.g006

αF is interchangeable across the diversity of flowering plants
To determine whether the helical nature of αF is critical for AtHAP2 function, we mutated D173, which is in the middle of the hydrophilic face of αF, to proline, a mutation predicted to introduce a kink in the helix. We found that D173P, but not D173A (which is expected to maintain helical character), reduced HAP2 function in vivo (Fig 6F) and reduced the ability of AtHAP2e to associate with liposomes (Fig 5A), indicating that altering the helical conformation in this region affects membrane insertion. Both the length and amphipathicity of αF appear to be conserved among flowering plants (Fig 1B). In addition, lysine 179 (K179) is invariant among plant HAP2 sequences (Fig 1B) and is situated just below αF (Fig 4A and 4B). We found that mutating K179 to alanine strongly reduced the function of HAP2 in the gamete fusion assay (Fig 6F) but did not affect the ability of HAP2e to insert into liposomes in vitro (Fig 5A). This result suggests that K179 is not required for membrane insertion but could be critical for a different stage of the gamete fusion reaction in plants.
To further address the sequence requirements for αF, we tested whether AtαF could be functionally replaced with the corresponding amphipathic helix of rice (Oryza sativa [Os], S2 Fig), a distantly related flowering plant species. This replacement resulted in a chimeric HAP2 variant that was functional (OsαF, Fig 6E and 6F). OsαF maintains the amphipathic nature of the helix and shares an isoleucine at position 171 but substitutes a tryptophan at position 173 and a threonine at position 176 (S2 Fig). The interchangeability of αF suggests plasticity in the mode of interaction between sperm-expressed HAP2 and its target membranes, at least within the flowering plant lineage and in spite of around 150 million years of divergence [30]. In contrast, when the entire rice ectodomain was used to replace the Arabidopsis ectodomain, this chimeric protein was unable to restore function to hap2 mutant sperm [29], indicating that other aspects of the gamete fusion mechanism (e.g., regulation of HAP2 activity by specific interaction with additional, lineage-specific proteins) have diverged within flowering plants.
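Amphipathicity of a helix such as αF is commonly quantified with a helical hydrophobic moment, the kind of quantity that tools like Heliquest report. The sketch below is a minimal illustration under stated assumptions: it uses the Kyte-Doolittle hydropathy scale (a substitution made here for convenience; Heliquest uses a different scale), an ideal α-helix with 100° of rotation per residue, and hypothetical test peptides rather than the actual AtHAP2 or OsHAP2 αF sequences. The function name `hydrophobic_moment` is likewise hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale for this illustration).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(sequence, angle_deg=100.0):
    """Mean hydrophobic moment of a peptide modeled as an ideal helix."""
    sin_sum = 0.0
    cos_sum = 0.0
    for i, residue in enumerate(sequence):
        h = KYTE_DOOLITTLE[residue]
        # Each residue is rotated by angle_deg around the helix axis.
        theta = math.radians(i * angle_deg)
        sin_sum += h * math.sin(theta)
        cos_sum += h * math.cos(theta)
    return math.hypot(sin_sum, cos_sum) / len(sequence)


# Hypothetical peptides: one with leucines clustered on one helical face,
# one with a uniform composition that has no preferred face.
amphipathic = "LKKLLKKLLKLLKKLLKK"
uniform = "L" * 18
print(hydrophobic_moment(amphipathic), hydrophobic_moment(uniform))
```

A peptide whose nonpolar residues cluster on one face gives a large moment, whereas a uniform 18-residue peptide gives a moment close to zero because its residues are spread evenly around the helical wheel. With real sequences, a helix-preserving substitution on the polar face would be expected to change the moment only modestly, while this simple model cannot capture a proline-induced kink.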

Variable modes of HAP2 membrane insertion across eukaryotes
In this study, we provide evidence at the primary sequence [...] In Arabidopsis, this motif consists of a single loop that projects the amphipathic αF helix for insertion into the lipid bilayer (Fig 4), while in T. cruzi, the membrane interaction surface comprises three small loops (Figs 2 and S3). Although the domain II tip of TcHAP2 lacks a helix and in this respect appears similar to viral class II proteins, such as that of RVFV Gc, the elements required for binding a glycerophospholipid head group in the latter (Figs 2B and 3D, [13]) are absent, suggesting that TcHAP2 uses yet another way of interacting with the lipid head groups. Moreover, the amphipathic helix αF observed in the flowering plants (Fig 4) is unlike the membrane interaction surface described for any of the three structural classes of viral fusion proteins (Fig 2B, Fig 3). It will be interesting to determine the forces that have driven diversification of the HAP2 fusion loops. One possibility that needs to be explored is that female gametes have evolved specific plasma membrane compositions important for gamete fusion. We found that AtHAP2 inserted more efficiently into artificial liposomes containing DOPS (see Materials and methods), a phosphatidylserine mimic. Understanding of female gamete lipid composition is currently limited to bulk membrane analysis in species with large and easily accessible oocytes [31,32]. However, the use of genetically encoded phospholipid sensors is leading to increased awareness of important functions for even low-abundance anionic phospholipids (e.g., phosphatidylserine) and can now be used to define the membrane composition of small and inaccessible female gametes like those of the flowering plants [33].
These data are consistent with previous genetic and biochemical analyses in Chlamydomonas [1] and with the recent finding that antibodies against the predicted fusion loops of Plasmodium HAP2 block parasitic gamete fusion and transmission [34]. The hypothesis that HAP2 functions by direct membrane insertion is also supported by experiments in which a synthetic peptide corresponding to the predicted fusion loop of TtHAP2 associated with membranes [5].

HAP2 membrane insertion via the fusion loops correlates with gamete fusion in vivo
Class II viral fusion proteins are activated by exposure to acidic pH upon entry into the host endosome and are positively and negatively regulated by partner viral proteins [26,35]. Whether and how HAP2 is triggered and the nature of the potential partner proteins involved in negative or positive regulation of HAP2 function will be active areas of future research; it will be interesting to determine whether these mechanisms are lineage specific or broadly conserved. For example, in Arabidopsis, EGG CELL 1 (EC1) has been shown to activate HAP2 for fertilization only after the sperm cells are released from the pollen tube to female gametes [36].
Our results show a very strong correlation between the ability of AtHAP2 to insert into membranes in vitro and its functionality in vivo, indicating that membrane insertion is an essential step in the fusion process, as demonstrated for the viral fusion proteins. Nevertheless, a recent report [37] proposed a HAP2 fusion mechanism similar to that of the C. elegans somatic fusion factor proteins (epithelial fusion failure 1 [EFF-1] and anchor-cell fusion failure 1 [AFF-1]; [38]), which do not function via target membrane insertion. These are the only members of the class II fusion protein structural family that lack a target membrane insertion surface at the domain II tip and were proposed to function instead by trans-oligomerization of proteins resident in the membranes of adjacent cells destined to fuse. Our findings do not support such a mechanism for HAP2, in line with the observation that in Arabidopsis, HAP2 is essential for sperm fertility but is not required for female fertility [28,39]. A virus-like fusion mechanism is also in agreement with the observation that HAP2 is expressed in only one of the two gamete types to be fused in multiple plant, protozoan, and animal species in which it was studied [2,3,28,39-41]. Indeed, expression from only one gamete is sufficient for gamete fusion in all species tested thus far [2,28,39,40,42], making the requirement for a second HAP2-like membrane protein in the opposite gamete for fusion unlikely.

Potential HAP2 signatures
HAP2 is the only gamete plasma membrane fusion protein to be identified thus far [43]. Amino acid sequence analyses have identified a HAP2-specific motif (pfam10699, [44]) that has detected orthologs in four out of the five eukaryotic kingdoms ([1,2,4], Figs 1 and S5), suggesting that this protein was present in the last common ancestor of all eukaryotes and was a seminal innovation in the evolution of sexual reproduction. But these analyses have not identified orthologs in some well-studied clades like nematodes, vertebrates, and fungi (S5 Fig). While we cannot exclude the possibility that HAP2 was replaced in some lineages by a fusion protein of a different origin, its widespread but sporadic identification in eukaryotic genomes (S5 Fig) suggests the more likely scenario that for many organisms, its sequence has diverged enough to escape detection by traditional sequence-based searches. A possible evolutionary force that may have driven HAP2 divergence in some lineages is positive selection for sequence diversity to maintain barriers between species. Proteins mediating cellular interactions critical for fertilization are well known to diversify rapidly and reinforce interspecific fertility barriers [45]. Such interactions between HAP2 and partner proteins that regulate its fusion activity may have driven further divergence of the HAP2 sequence in organisms like yeast, C. elegans, mice, or humans, which currently lack candidate gamete fusion proteins.
The two HAP2 structures provided here (Fig 2A and Fig 3A), together with the previously reported CrHAP2e [1], could be useful in defining a structural signature to identify additional orthologs. X-ray structures were required to identify viral class II fusion proteins because they lack any detectable sequence similarity across viral genera [26,35]. A comparison of the structural conservation of the HAP2 domain II structure across eukaryotes (Fig 3A) with that of the class II fusion proteins from different viral families (Fig 3B) shows that diverse HAP2 molecules conserve the relative orientation and length of most of their secondary structure elements, while the viral class II proteins show higher variation. Indeed, the HAP2 structures share an α2 helix of identical length and orientation, the same organization of the ij loop, and the core bdc β-sheet, including an invariant salt bridge anchoring the variable fusion loops to the central core of the molecule. In comparison, the viral class II proteins (Fig 3B), in spite of sharing the same elements within domain II (α2 helix, ij hairpin, core bdc β-sheet), differ in their relative orientations and positioning and are thus more structurally diverse. This is clearly seen in the table of DALI scores (S3 Table), which reveals a large gap between the very similar HAP2 domain II structures and the more divergent domain II structures from the other known class II proteins. This gap suggests the possibility that HAP2 orthologs with much lower sequence identity may be found that display the same three-dimensional fold. HAP2 orthologs identified thus far share approximately 30% sequence identity between the aligned residues across domain II, whereas viral fusion proteins are only about 10% identical in the aligned residues. It is thus plausible that additional HAP2 orthologs exist whose sequence identity is too low for sequence-based identification. Irrespective of whether class II fusion proteins have a viral or cellular origin [46], the fact that the viral evolutionary clock is several orders of magnitude faster than its cellular counterpart suggests that HAP2 may have maintained an ancestral organization relative to the class II proteins of present-day viruses. The high conservation of the HAP2 core structure further suggests that bioinformatics approaches should be able to translate the observed structural similarity into signatures detectable in more distant HAP2 sequences, for instance, by analyzing the covariance of interacting residues distant in the sequence.
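The "sequence identity between the aligned residues" quoted above can be illustrated with a minimal sketch. It assumes two rows taken from a pairwise or multiple alignment, with "-" as the gap character, and counts identity only over columns where both sequences have a residue. The function name and the toy sequences are hypothetical and are not taken from the alignment in Fig 1B.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where both rows have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which neither sequence has a gap.
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != gap and b != gap
    ]
    if not pairs:
        raise ValueError("no aligned residue pairs")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy rows (made up for illustration only).
    print(percent_identity("ACD-EFG", "ACE-EYG"))
```

Other conventions exist (for example, dividing by the full alignment length or by the shorter sequence), so values computed this way are only comparable when the same denominator is used throughout.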

Plant growth conditions
Seeds were stratified (at least days at 4˚C) on solid Murashige and Skoog (MS) medium (Sigma Aldrich, St. Louis, MO) supplemented with appropriate antibiotics and germinated at 22˚C under constant light (Percival incubator). After 7-14 days, seedlings were transferred to sterile #2MIX potting media (www.fafard.com) with fertilizer (N:P:K, 15:5:15) and were grown at 20˚C, 50%-60% humidity, on a 16-hour-light / 8-hour-dark light cycle in growth chambers (Environmental growth chambers, Chagrin Falls, OH, United States of America).

S2 cell lines
Drosophila S2 cells (ATCC CRL-1963) were cultured in Schneider's complete media (Thermo Scientific) before transfection and in Insect Xpress media after transfection (Lonza, Basel, Switzerland). Culturing and transfection of S2 cells have been described previously [47].

Expression and purification of the soluble HAP2 ectodomain
Codon-optimized synthetic cDNAs corresponding to soluble C-terminally truncated versions of HAP2e, comprising residues 25-494 from A. thaliana and 26-516 from T. cruzi, were cloned into a modified Drosophila S2 expression vector described previously [48], and transfection was performed as reported earlier [47]. For large-scale productions, cells were induced with 4 μM CdCl₂ at a density of approximately 7 × 10⁶ cells/ml for 8 days and pelleted, and the soluble ectodomains were purified by affinity chromatography from the supernatant using a StrepTactin Superflow column followed by SEC using a Superdex200 column in 10 mM TRIS pH 8, 100 mM NaCl. Pure proteins were concentrated to approximately 8 and 4 mg/ml, respectively.

SEC with MALS
Purified HAP2e were subjected to SEC using a Superdex 200 column (GE HealthCare) equilibrated with 10 mM TRIS pH 8, 100 mM NaCl. Separation was performed at 20˚C with a flow rate of 0.5 ml min⁻¹. Online MALS detection was performed with a DAWN-HELEOS II detector (Wyatt Technology, Santa Barbara, CA, USA) using a laser emitting at 690 nm. Online differential refractive index measurement was performed with an Optilab T-rEX detector (Wyatt Technology). Data were analyzed, and weight-averaged molecular masses (Mw) and mass distributions (polydispersity) for each sample were calculated using the ASTRA software (Wyatt Technology).

Liposome coflotation experiments
DOPS, 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC), cholesterol, and sphingomyelin were purchased from Avanti Polar Lipids. Liposomes were freshly prepared in PBS by the freeze-thaw and extrusion method using molar ratios of 17% DOPE, 16% DOPC, 50% cholesterol, 17% sphingomyelin. For coflotation assays on WT and mutant AtHAP2e proteins, the following alternative composition was used: 50% cholesterol, 30% DOPC, 20% DOPS. Then, 0.07 μM purified HAP2e was mixed with 8 mM liposomes and incubated for 1 hour at 21˚C in 100 μL PBS. Samples were then adjusted to a final concentration of 20% sucrose, overlaid with a 5%-60% sucrose gradient (in PBS), and centrifuged for 1 hour at 4˚C at approximately 150,000 × g. Fractions from the top, middle, and bottom of the gradient were analyzed by immunoblotting using a monoclonal anti-strep tag antibody, and bands were quantified using the GeneTools Syngene software. The percentage of HAP2e in each fraction was calculated as the ratio between HAP2e in that fraction and total HAP2e (sum of HAP2e in top and bottom fractions).
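The percentage calculation described at the end of this paragraph can be written out as a short sketch. It assumes that band intensities for the top and bottom fractions are available as plain numbers in arbitrary densitometry units. The function name and the example values are hypothetical and do not come from the GeneTools software or from S1 Data.

```python
from typing import Tuple


def flotation_percentages(top: float, bottom: float) -> Tuple[float, float]:
    """Return (% in top fraction, % in bottom fraction).

    Total HAP2e is taken as the sum of the top and bottom band intensities,
    as stated in the text.
    """
    if top < 0 or bottom < 0:
        raise ValueError("band intensities must be non-negative")
    total = top + bottom
    if total == 0:
        raise ValueError("no signal in either fraction")
    return 100.0 * top / total, 100.0 * bottom / total


if __name__ == "__main__":
    # Made-up intensities, for illustration only.
    pct_top, pct_bottom = flotation_percentages(30.0, 10.0)
    print(pct_top, pct_bottom)
```

A larger share in the top fraction is read as liposome-associated protein, since liposomes float toward the top of the sucrose gradient.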

Electron microscopy
Purified HAP2e (A. thaliana or T. cruzi) mixed with liposomes was spotted on glow-discharged carbon grids (CF300, EMS, USA), negatively stained with 2% phosphotungstic acid (PTA) pH 7.4, analyzed with a Tecnai G2 Bio-Twin electron microscope (FEI, USA), and imaged with an Eagle camera (FEI, USA). For cryo-electron microscopy, liposomes mixed with purified HAP2e were applied on a glow-discharged Lacey Carbon grid (Agar Scientific, UK). Samples were plunge-frozen in liquid ethane using an automated system (Leica EMGP, Austria) and visualized on a Tecnai F20 electron microscope operating at a voltage of 200 kV. Image frames were recorded in low-dose mode on a Falcon II direct electron detector (FEI, USA).

Crystallization and structure determination
Crystals of HAP2e from A. thaliana were grown at 293 K using the hanging-drop vapor-diffusion method in drops containing 1 μL protein solution mixed with 1 μL reservoir solution containing 100 mM sodium citrate pH 4.5 and 200 mM zinc acetate. Diffraction-quality rod-like crystals appeared after 1 week and were flash-frozen in mother liquor containing 30% (v/v) glycerol. The crystals diffracted anisotropically to 2.24 Å along the c* axis but only to 3.7 Å in orthogonal directions. Therefore, an ellipsoidal cutoff was applied using the StarAniso server (STARANISO version 1.7.2 18-Apr-17 Ian J. Tickle, Global Phasing, Cambridge, UK; http://staraniso.globalphasing.org/cgi-bin/staraniso.cgi) in order to remove noise, and refinement was carried out to a nominal resolution of 2.75 Å.
Limited proteolysis of HAP2e from T. cruzi was carried out by adding subtilisin (dissolved in 10 mM Tris pH 8, 30 mM NaCl at 10 mg/mL) to a solution containing T. cruzi HAP2e at 16 mg/mL in 10 mM Tris pH 8, 100 mM NaCl at a 1:300 w:w ratio. After 1 hour of incubation at room temperature, the protease was inactivated by addition of 1 mM PMSF, and the protease-resistant fragment, accounting for approximately 60% of the digested protein, was purified by SEC using a Superdex200 column (TcHAP2e sub). Crystals of TcHAP2e sub were grown at 293 K using the hanging-drop vapor-diffusion method in drops containing 1 μL protein solution mixed with 1 μL reservoir solution containing 100 mM CHES pH 9.0, 200 mM NaCl, and 10% w/v PEG 8k. Diffraction-quality rod-like crystals appeared after 7-10 days and were flash-frozen in mother liquor containing 25% (v/v) glycerol.
Data collection was carried out at the ESRF (ID30A-3) and the Synchrotron Soleil (Proxima-1). Data were processed, scaled, and reduced with XDS [49], Pointless [50], and programs from the CCP4 suite [51]. For the structure of A. thaliana HAP2e, the full C. reinhardtii HAP2 monomer (PDB 5MF1) was used to create a search model with Sculptor [52]. The so-called tip region was deleted (bc and cd connections). The structure was determined by the molecular replacement method using Phaser [53]. Zinc atoms were localized in the density by calculating an anomalous difference map using ANODE [54]. To determine the structure of the protease-resistant fragment of T. cruzi HAP2, we initially used Sculptor [52] to create a search model for molecular replacement based on the structure of a protomer of the C. reinhardtii HAP2 trimer (PDB 5MF1). The full monomer model was divided into 3 individual domains (domains I, II, and III), and loops and exposed side chains were trimmed off. The structure was finally determined by the molecular replacement method using Phaser [53] and an isolated domain II as search model. Phases were refined using the anomalous signal of a highly redundant Sulfur-SAD data set collected at a wavelength of 2.06641 Å on crystals of the native protein.
For both orthologs, model building was performed using Coot [55], and refinement was done using AutoBuster [56].

Generation of a trimeric model of TcHAP2
Two independent TcHAP2 trimer models were generated from superposition on either the AtHAP2 or CrHAP2 using the secondary structure matching function in Coot [55], and the resulting two TcHAP2 trimers were structurally very similar, with rmsd = 2.6 Å for 711 common Cα atoms (237 per subunit).
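As an illustration of the rmsd figure quoted above, the following minimal sketch computes the root-mean-square deviation between two equally sized lists of Cα coordinates. It assumes the two models have already been superposed (for example, in Coot) and that atoms are paired in the same order; it does not perform the superposition itself. The function name and the coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """RMSD (same units as the input, e.g. Å) between paired, pre-superposed atoms."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both models must contain the same number of atoms")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    # Sum of squared distances over all atom pairs.
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Two made-up atoms, each displaced by 2 Å along z.
    model_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model_2 = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    print(rmsd(model_1, model_2))
```

For the comparison in the text, each list would hold the 711 common Cα atoms (237 per subunit across the three subunits).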
After seeds were collected from transformed plants, they were sterilized using a 50% bleach solution containing 0.02% Triton X-100 for 7 minutes, followed by washing 4 times with sterilized water. Seeds were resuspended in sterilized 0.1% agarose and were plated on 15-cm dishes containing MS media supplemented with Basta (25 mg/L; Chem Service, West Chester, PA; Oakwood Chemical, Estill, SC). After at least 2 days at 4˚C, plates were moved to a 22˚C incubator with constant light. After 7-14 days, Basta-resistant seedlings were transferred to soil made up with 1X fertilizer (4 plants / pot). For primary transformants (T1), DNA was isolated using the leaf boil method [61]. T1 plants were genotyped for the hap2-2 allele using primer sets to detect both the Salk tDNA insertion (LbaI, [59]; hap2seqTR3 [28]) and WT genomic HAP2 (hap2c2; hap2seqTR3; see S5 Table).

Genetic transmission assay for assessing transmission of hap2-2 allele
Pollen from T1 plants that were heterozygous for hap2-2/+ was hand-pollinated onto male sterile 1 (ms1-1) pistils [62]. Seeds from individual ms1 crosses were gas-sterilized and plated on MS Kanamycin (50 mg/L) plates supplemented with sucrose (5 g/L). Seeds were cold-treated at 4˚C for at least 2 days and then transferred to a 22˚C incubator with constant light. hap2-2/+ seeds were simultaneously plated on MS Kanamycin plates to ensure the drug resistance was working correctly. For analysis of hap2-2 transmission, resistant versus sensitive seedlings were counted manually after 6-10 days at 22˚C.
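The seedling counts from one cross can be turned into a transmission percentage with a short sketch. It assumes that Kanamycin-resistant seedlings are those that inherited hap2-2 through the pollen and that percent transmission is resistant seedlings divided by all seedlings scored; this formula, the function name, and the example counts are illustrative assumptions rather than the authors' stated procedure.

```python
def percent_transmission(resistant: int, sensitive: int) -> float:
    """Percent of scored seedlings that are Kanamycin resistant in one cross."""
    if resistant < 0 or sensitive < 0:
        raise ValueError("seedling counts must be non-negative")
    total = resistant + sensitive
    if total == 0:
        raise ValueError("no seedlings were counted")
    return 100.0 * resistant / total


if __name__ == "__main__":
    # Made-up counts for a single ms1 cross.
    print(round(percent_transmission(10, 20), 1))
```

A cross with no resistant seedlings gives 0%, which corresponds to the case of a nonfunctional HAP2 variant described in the following paragraph.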
Mutations were confirmed in at least one T1 line per transgene by sequencing of a PCR product. PCR products were generated using an EcoRI-HAP2-CVFP-F1 forward primer and the HAP2Ex7R reverse primer. For HAP2 variants that had 0% transmission of hap2-2, seeds were collected from the T1 plant and plated on MS Kanamycin and MS Basta separately to confirm segregation of the hap2-2 allele and the transgene, respectively, in the next generation.

Analysis of HAP2:YFP variant expression
Expression of HAP2:YFP in pHAP2:HAP2variant:YFP transgenic plants was initially screened using fluorescence microscopy. Pollen from T1 plants was hydrated on slides in pollen growth media [63] and imaged using an Axiovert 200M microscope (images not shown). Each transgenic line was analyzed using confocal microscopy (Fig 5G, S4 Fig) of the T1 and/or T2 generation. Pollen grains were hydrated in pollen growth media and imaged using a Zeiss LSM 800 confocal microscope. All images were taken using excitation with the 488 laser at 10% (488 settings kept constant) and using the 40× water objective with a 2× zoom. Single optical slices were collected (974 × 974 pixels, 16-bit). All images were imported to ImageJ (FIJI version) [64] and cropped to 300 × 300 pixels.

Statistical analysis
Data are presented as mean ± SD unless otherwise indicated in figure legends, and experimental repeats are indicated in figure legends.
The following criteria were used to analyze HAP2:YFP variant transgenic plants: at least three independent transgenic lines were analyzed, at least three crosses were performed for each line, and only crosses with at least 10 seeds were analyzed (S4 Table). Transmission data did not follow a normal distribution for each transgene (confirmed by Anderson-Darling test for normality in RStudio [65], using R version 3.2.3 [66]); therefore, a nonparametric statistical test was used. Each HAP2 mutant was compared to the WT transgene (AtHAP2) using a Wilcoxon rank sum test with continuity correction in RStudio.
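For readers who want to see what the rank sum comparison involves, here is a minimal standard-library sketch of a two-sided Wilcoxon rank sum test using the normal approximation with continuity and tie corrections. It is a Python re-implementation written for illustration, not the R code used in the study; the function name and the example values are hypothetical, and for small samples without ties R may instead report an exact p-value.

```python
import math
from typing import Sequence, Tuple


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (W, two-sided p) using the normal approximation."""
    n_x, n_y = len(x), len(y)
    if n_x == 0 or n_y == 0:
        raise ValueError("both samples must be non-empty")
    n = n_x + n_y
    # Pool the samples, remembering group membership (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1..j and all receive the average rank.
        mid_rank = (i + 1 + j) / 2.0
        in_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += mid_rank * in_x
        t = j - i
        tie_term += t ** 3 - t
        i = j
    w = rank_sum_x - n_x * (n_x + 1) / 2.0
    diff = w - n_x * n_y / 2.0
    variance = n_x * n_y / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0.0 or diff == 0.0:
        return w, 1.0
    # Continuity correction shrinks |diff| by 0.5 before standardizing.
    z = (diff - math.copysign(0.5, diff)) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return w, min(1.0, p)


if __name__ == "__main__":
    # Made-up percent-transmission values for a WT and a mutant transgene.
    wt = [31.0, 35.5, 29.0, 33.0]
    mutant = [0.0, 2.5, 0.0, 1.0]
    print(rank_sum_test(wt, mutant))
```

Working on ranks rather than raw values is what makes the test suitable for transmission data that fail a normality check.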
size is proportional to the size of the displayed side chain. Nonpolar residues are colored in yellow. Charged residues are colored red (negative charge) and blue (positive charge). Polar uncharged residues are colored purple and light pink. Glycines are displayed in gray, prolines in green. The hydrophobic moment value of the respective amphipathic helix predicted by Heliquest (μH) is shown below the organism name, suggesting the formation of an amphipathic helix in the putative membrane-interacting region of plant and algal HAP2 but also in some animal HAP2 orthologs. HAP2, HAPLESS 2. (TIF)

S3 Fig. TcHAP2e trimers insert into liposomes via the hydrophobic domain II tip. (A) Surface representation of the membrane-facing tip of the putative TcHAP2 trimer, obtained as described in Materials and methods, viewed from the membrane (left panel) and from the side (right panel). Surface residues are colored according to hydrophilicity from dark orange (hydrophilic) to white (hydrophobic). The fusion loop region from one protomer is encircled, and hydrophobic residues that are exposed at the tip are labeled. Residue 129 belongs to the bc connection, the others to cd. (B-C) Membrane insertion of TcHAP2e. Electron micrographs of liposomes incubated in the presence of HAP2e were analyzed by negative staining (B) or cryo-EM (C). Scale bar 50 nm. Liposomes decorated with HAP2e display protein projections at the surface. The protein projections (some of which are indicated by long arrows in C) are shaped as tapered rods similar to the ones of AtHAP2e (see Fig 4) and form lateral assemblies. The background is coated with proteins not bound to liposomes (short arrows in C). AtHAP2e, A. thaliana HAPLESS 2 ectodomain; EM, electron microscopy; TcHAP2, T. cruzi HAPLESS 2; TcHAP2e, T. cruzi HAPLESS 2 ectodomain.

A schematic phylogeny representing the five eukaryotic kingdoms, subkingdoms, and phyla for which whole-genome sequence information is available (classification system; [17,67]). Selected groups of species are provided for some phyla. Red font indicates presence of pfam10699 (the HAP2-GCS1 domain). Filled red circles: all available genomes of the phylum contain pfam10699; open red circles: the phylum contains genomes both with and without pfam10699; filled black circles: phyla in which none of the available genomes contains pfam10699. HAP2 was identified in D. melanogaster and Apis mellifera [4], but these sequences (and genomes) lack pfam10699 (blue font). Parenthetical numbers indicate the total number of phyla within the subkingdom. Only phyla with at least one whole-genome sequence with >10,000 annotated proteins are listed. Bracketed numbers indicate the number of species from each phylum used in this analysis. GCS1, GENERATIVE CELL SPECIFIC 1; HAP2,

Fig 1. Substantial divergence in the domain II tip of HAP2 across eukaryotes. (A) Linear diagram of the HAP2 primary structure indicating the positions of the signal peptide ("S"), the polypeptide segments corresponding to domains I, II, and III ("DI-III"), highlighting the tip of domain II (the bdc β-sheet, aligned in B), the pfam HAP2-GCS1 domain (pf10699), the linker ("L") between DI and DIII, the stem that connects the ectodomain to the transmembrane domain ("tm"), and the cytosolic tail. (B) HAP2 aa sequence alignment obtained with MUSCLE [16] using representative members from four out of the five eukaryotic kingdoms [17] named on the left column. A. thaliana and T. cruzi, the two orthologs studied further in this work, are highlighted in purple background, as well as C. reinhardtii, which served as reference. Only the bdc β-sheet region is displayed. A black underline marks the A. thaliana segment deleted in the AtΔαF construct used in genetic assays (see below, section αF amphipathicity is required for AtHAP2 in vitro liposome insertion and in vivo function). (C) Schematic diagram of the bdc β-sheet and the connecting loops at the tip of domain II relative to the target membrane. The β-strands are colored beige to match the conserved blocks in the alignment. Drawn are the disulfide bonds (green bars, numbered as in the alignment) and the salt bridge (red bar, labeled "s"), which are conserved features constraining the organization of the domain II tip. Blue circles indicate the position of insertion hotspots (marked identically in B at the top of the alignment) predicted to project residues into the target membrane for insertion, the first one in the bc β-strands connection (left) and the two others in the cd connection (right). aa, amino acid; CrHAP2, C. reinhardtii HAP2; HAP2, HAPLESS 2 https://doi.org/10.1371/journal.pbio.2006357.g001

The resulting purified protease-resistant 40-kD fragment (termed TcHAP2e sub, S1E Fig) crystallized in the hexagonal space group P6₁22 (S1F and S1G Fig), and the best crystals diffracted to 3.1-Å resolution (see Materials and methods; crystallographic statistics are listed in S2 Table). We determined both structures by the molecular replacement method using a search model derived from the structure of CrHAP2e (Protein Data Bank [PDB]: 5MF1). The experimental electron density map of AtHAP2 allowed us to trace 446 out of 469 ectodomain residues, including the region at the tip of domain II (S1H Fig, right panel) that was disordered in CrHAP2. The resulting atomic model of AtHAP2 revealed a trimer in unequivocal postfusion hairpin conformation (Figs 2A and 3A), as observed previously for CrHAP2. The crystals of TcHAP2e sub consisted of domain II with a short extension into domain I (Fig 2A). There was continuous electron density for the loops at the tip of domain II (S1H Fig, left panel), and we could build the polypeptide chain unambiguously in this region, albeit with a 2-residue break immediately after strand b (i.e., in the bc loop, Fig 2A). Monomeric

Fig 2. Structures of A. thaliana and T. cruzi HAP2. (A) Ribbon representation of the AtHAP2 trimer (left panel) and domain II of TcHAP2 (right). A "fused" membrane bilayer is diagrammed above, roughly to scale and with light and dark gray indicating polar and nonpolar moieties, respectively, to illustrate the relation of the fusion loops with respect to a membrane upon insertion. The AtHAP2 trimer subunit in the foreground and the TcHAP2 monomeric domain II are colored according to the class II convention: red, yellow, and blue for domains I, II, and III, respectively; cyan for the domain I-III linker. Disulfide bonds are shown as green sticks, and dashed tubes indicate disordered regions in the crystal (including in the membrane interaction surface in TcHAP2 domain II, arrow on the right panel). (B) Pairwise superposition of the AtHAP2 (in yellow) on TcHAP2, CrHAP2, and RVFV Gc [13]. The core bdc β-sheet and ij loop of the latter are drawn in purple, with connecting loops in blue, to highlight the structurally conserved domain II scaffold. The conserved disulfide bonds (green) and salt bridge (colored according to atom type) are shown as sticks, hydrogen bonds as dotted lines. An orange arrow highlights the striking superposition of the side chains of the conserved salt bridge, as only the Cα atoms of the β-strands were used for the superposition. (C) Cartoon representation of the structurally conserved scaffold (in gray) in the HAP2 tip region, with variable loops in yellow. The three insets show the observed conformations of the variable cd connection in the known structures (At, Tc, and CrHAP2). AtHAP2, A. thaliana HAP2; CrHAP2, C. reinhardtii HAP2; HAP2, HAPLESS 2; RVFV, Rift Valley fever virus; TcHAP2, T. cruzi HAP2 https://doi.org/10.1371/journal.pbio.2006357.g002

Fig 3. Side-to-side comparison of the HAP2 structures and relation to viral class II proteins. (A) The postfusion trimer structures of AtHAP2e and CrHAP2e (PDB 5MF1; [1]) and the model for a TcHAP2 trimer obtained by superimposing the domain II structure on the CrHAP2e trimer. In the third panel, a grey oval indicates the region expected to be occupied by domains I and III of TcHAP2, for which we do not have a structure, but for which the aa sequence alignment indicates they are very close, except for some inter β-strand connections. (B) The corresponding postfusion trimer structures of four representative viral class II fusion proteins: SFV E1 (PDB 1RER; [18]), RVFV Gc (PDB 6EGU; [13]), RV E1 (PDB 4ADI; [19]) and TBEV E (PDB 1URZ; [20]). The structures show the same organization of the three domains (yellow, red, and blue), which also share the same fold. Note that as expected, the viral proteins have evolved further from each other, as the same secondary structure elements are present, but their relative orientations have changed. The HAP2 structures can be considered as resembling a "frozen" ancestor of the class II fusion proteins observed on present-day viruses, given the huge difference in generation time and in the rate of mutations of RNA virus replication when compared with eukaryotic organisms. The statistics obtained with the DALI server [21] from the comparison of domain II from the various structures are provided in S3 Table. The HAP2 orthologs can be aligned with respect to the membrane using the highly conserved α2-helix. In contrast, the same helix does not display a conserved orientation and length in viral class II fusion proteins and therefore does not allow a similar alignment. AtHAP2e, A. thaliana HAPLESS 2 ectodomain; CrHAP2e, C. reinhardtii HAPLESS 2 ectodomain; HAP2, HAPLESS 2; PDB, Protein Data Bank; RV, rubella virus; RVFV, Rift Valley fever virus; SFV, Semliki Forest virus; TBEV, tick-borne encephalitis virus; TcHAP2, T. cruzi HAPLESS 2 https://doi.org/10.1371/journal.pbio.2006357.g003

CrHAP2 (S2 Fig, sequences in Fig 1B). This server also predicted an amphipathic helix in variable region 2 of the cnidarian animal Nematostella vectensis (S2 Fig), suggesting that this motif may be of widespread use for HAP2 target membrane insertion. In contrast to AtHAP2, in which the bc strand connection is basal to αF (Fig 2C) and away from the lipid contact area, in TcHAP2 both cd and bc connections are located roughly at the same level at the tip of domain II, suggesting that, together, they may comprise the membrane insertion element (Fig 1C, Fig 4D, right panel). A surface representation of the modeled TcHAP2 trimer indeed suggests a relatively flat membrane interaction surface with a tripartite nonpolar patch comprising V129 (in the N-terminal side of the bc connection) and L167, L168 and I183, F184 (in the variable regions 2 and 3, respectively, of the cd connection, S3A Fig). Compared to CrHAP2, in TcHAP2, the α0 helix features an additional helical turn after disulfide 5 (Fig 2B, compare left and middle panels). As anticipated from the aa sequence, the loop preceding α0 is stabilized by an extra disulfide bond, numbered 5b (Figs 1B and 2C, middle inset). The second loop (variable region 3) is just a turn connecting α0 to the d strand. Residues I183/F184 are located in the extra turn of α0 and are pointing inward (toward the bdc β-sheet) in the structure (S3 Fig),

Fig 4. The AtHAP2 fusion helix. (A-B) Gray ribbon representation of AtHAP2 αF at the tip of domain II. Side chains of apolar (yellow), negatively charged (red), and positively charged (blue) residues are represented as sticks and labeled, with disulfides in green and numbered as in Fig 1; residues interrogated in subsequent experiments are in bold. (A) and (B) show two orthogonal views. (C) Helical wheel projection of αF; circle size is proportional to side chain size (amino acids colored as in A and B). The hydrophobic moment of the amphipathic α helix (μH) predicted by Heliquest [22] is displayed as a vertical arrow and its value indicated above the diagram. (D) The tip region of AtHAP2 is displayed in the central panel, with TcHAP2 on the right and the fusion protein Gc of the RVFV (PDB 6EGU; [13]) on the left, with a phosphatidyl-choline lipid bound in a glycerophospholipid-specific pocket in between the cd connection and the top end of the bdc β-sheet (the lipid is shown as connected spheres with the head group red and the aliphatic tails dark gray). The horizontal dotted lines mark roughly the borders of the polar lipid head group region of the membrane, based on this structure. In the central and right panel, the dotted lines show the predicted span of the polar head group region based on the aromatic residues that project into the aliphatic portion of the membrane. This location is clearer in the case of AtHAP2, in which F172 is placed at the same level as F826 in the viral protein. In the case of TcHAP2, the conformation of this region may have changed, as the nonpolar residues do not project out. The dotted lines are therefore only tentative. The trimer axis is shown in black, and the light blue bar indicates the region superposed in Fig 2B, which is shifted down in the case of AtHAP2, as indicated by the blue arrow, such that the end of the bdc β-sheet is predicted to be away from the membrane. In all three panels, the bc and cd connections are in yellow, the conserved scaffold in gray, and disulfides as green sticks. AtHAP2, A. thaliana HAP2; PDB, Protein Data Bank; RVFV, Rift Valley fever virus; TcHAP2, T. cruzi HAP2 https://doi.org/10.1371/journal.pbio.2006357.g004

) and TcHAP2e (S3B and S3C Fig) decorated the liposome surface, as had been observed with CrHAP2e. Side views of the proteoliposome edge showed approximately 12-nm-long projections with a tapered end toward the membrane (Fig 5B and 5D, S3A and S3B Fig), and top views showed a typical pattern of hexagonal packing of postfusion trimeric ectodomains at the liposome surface (Fig 5C), consistent with the overall shape, orientation, and lateral packing of postfusion trimers inserted into liposomes observed earlier for viral class II fusion proteins

Fig 5. AtHAP2e trimers insert into liposomes via the fusion helix. (A-F) Membrane insertion of AtHAP2e. (A) AtHAP2e membrane insertion was analyzed by differential sucrose gradient fractionation; the amount of HAP2e present in the top (gray) and bottom (black) fractions is shown as mean ± SD (see S1 Data); corresponding immunoblots are shown below. Statistical comparisons were performed using a Student t test. * p < 0.05; *** p < 0.001. Electron micrographs of liposomes incubated with WT HAP2e (B,C,D) and the IFI>AAA mutant (E,F). Samples were analyzed by negative stain EM (B,C,E) or cryo-EM (D,F). Scale bar 50 nm. Liposomes decorated with HAP2e display protein projections at the surface (B,C,D), contrasting with the smooth liposome surface in the presence of IFI>AAA mutant (E,F). The protein projections (some of which are indicated by long arrows in D) are rod shaped with a tapered membrane-proximal end; an inset illustrates the fitting of a trimeric AtHAP2e into one projection (B). The background is coated with proteins not bound to liposomes (short arrows in D,F). (C) Top view of a liposome decorated with AtHAP2e clustering in hexagonal arrays; an inset illustrates fitting of an AtHAP2 trimer; red asterisks illustrate trimer clustering. (G) Surface representation of the membrane-facing tip of the AtHAP2 trimer viewed from the side (left panel) and the top (right panel). Surface residues are colored according to hydrophilicity from dark orange (hydrophilic) to white (hydrophobic). The fusion loop region from one protomer is encircled, and hydrophobic residues that are exposed at the tip are labeled. AtHAP2, A. thaliana HAPLESS 2; AtHAP2e, A. thaliana HAPLESS 2 ectodomain; EM, electron microscopy; n.s., not significant; WT, wild type https://doi.org/10.1371/journal.pbio.2006357.g005

Fig 6. Arabidopsis gamete fusion requires presentation of hydrophobic residues within the amphipathic HAP2 αF helix. (A) Schematic of male gamete delivery and double gamete fusion within an Arabidopsis flower. Four anthers are shown surrounding the pistil, which contains two columns of ovules; petals and sepals are not depicted. Two pollen tubes (black lines) are shown targeting ovules. The inset illustrates a pollen tube approaching a female gametophyte, which develops within the ovule. A pair of sperm cells and the pollen tube nucleus are shown within the pollen tube; synergid cells ("SYN"), egg cell ("EC"), central cell ("CC"), and three antipodal cells (dark blue) are shown within the

2 (Fig 6D, AtHAP2 WT). This assay provided a quantitative readout for HAP2 function in vivo and facilitated dissection of the critical residues of the domain II tip (Fig 6D, S4 Table). We first tested whether AtHAP2 αF is essential for function by deleting the helix along with two N- and C-terminal residues (box in Fig 6E, AtΔαF). Deletion of αF rendered HAP2 nonfunctional; hap2-2 transmission rates were zero (Fig 6F, S2 Data, AtΔαF) even though expression of HAP2ΔαF:yellow fluorescent protein (YFP) in sperm was similar to the WT control (Fig 6G, S4 Fig).
to alanine led to mild reductions in function (Fig 6F), while F172A more strongly reduced HAP2-induced gamete fusion (Fig 6F). The I171F172I176>AAA triple mutant completely eliminated HAP2 function in vivo (Fig 6F), in line with the abrogated association with liposomes observed in vitro (Fig 5A).

Fig 1) and at the structural (Fig 2) levels that HAP2, while maintaining its overall structure (Fig 3), has evolved highly divergent membrane interaction motifs by way of focal diversification across the eukaryotes in which it was positively identified (S5 Fig). Analysis of the HAP2 gene structure in intron-rich genomes (e.g., flowering plants, algae, protozoa, cnidarians, and insects) revealed the consistent presence of an intron between the regions encoding the domain II β-strands b, c, and d (S6 Fig), providing a potential route for focused diversification of the intervening functional loops via alteration of splicing patterns and/or exon shuffling during evolution. The X-ray structures of HAP2 from two representative organisms exhibiting a contrasting pattern of insertions and deletions in the loops at the domain II tip suggest very different modes of membrane insertion.
We have shown that Tc and AtHAP2e insert into artificial membranes (Fig 5, S3 Fig) and have identified key nonpolar residues on αF that are essential for AtHAP2e membrane insertion in vitro (Fig 5A) and for gamete fusion in vivo (Fig 6)

of HAP2variant:YFP accumulation in sperm cells. Confocal micrographs of pollen grains from three independent transgenic lines expressing the indicated HAP2:YFP variant construct are shown. Each pollen grain shown contains a pair of sperm representing the maximum YFP intensity observed in the represented transgenic line. Signal from the pollen grain wall is due to autofluorescence and observed in the hap2-2/+ control, which does not contain a HAP2:YFP construct. S4 Table lists the number of independent transgenic lines that were tested for each HAP2:YFP variant using our genetic transmission assay for HAP2 function (Fig 6). The average percent transmission of hap2-2 (33% is expected for a fully functional single-insertion-site line) for each line is indicated in the bottom-right corner of each image. Scale bar, 5 μm. HAP2, HAPLESS 2; YFP, yellow fluorescent protein. (TIF)

S5 Fig. Scattered HAP2 positive identification across eukaryotic taxa.