Characterization and Evolutionary Implications of the Triad Asp-Xxx-Glu in Group II Phosphopantetheinyl Transferases

Phosphopantetheinyl transferases (PPTases), which play an essential role in both primary and secondary metabolism, are magnesium-binding enzymes. In this study, we characterized the magnesium binding residues of all known group II PPTases by biochemical and evolutionary analysis. Our results suggest that group II PPTases can be classified into two subgroups: two-magnesium-binding-residue-PPTases containing the triad Asp-Xxx-Glu and three-magnesium-binding-residue-PPTases containing the triad Asp-Glu-Glu. Mutations of two three-magnesium-binding-residue-PPTases and one two-magnesium-binding-residue-PPTase indicate that the first and the third residues in the triads are essential for activity, whereas the second residues are non-essential. Although the second residue of the triad Asp-Xxx-Glu varies across the phylogenetic tree as a whole, it is conserved within animals, within plants, within algae, and within most prokaryotes. Evolutionary analysis suggests that the animal group II PPTases may originate from one common ancestor, that the plant two-magnesium-binding-residue-PPTases may originate from one common ancestor, and that the plant three-magnesium-binding-residue-PPTases may derive from horizontal gene transfer from prokaryotes.

The group I PPTases are found in most organisms except animals; the group II PPTases exist in almost all organisms; the group III PPTases are found only as domains fused within FASs in fungi and within some PKSs in Streptomyces [1]. In most bacteria, group I PPTases phosphopantetheinylate ACPs in FASs, and group II PPTases phosphopantetheinylate ACPs/PCPs in PKSs/NRPSs [4,6,19]. Animals and a few bacteria contain a single group II PPTase, which phosphopantetheinylates ACPs from both primary and secondary metabolism [20][21][22]. The group III PPTases phosphopantetheinylate the ACPs located on the same polypeptides as the group III PPTase domains themselves [17][18].
Magnesium ion is essential to PPTase activity [23][24]. X-ray crystal structure analyses of two group II PPTases, Sfp from Bacillus subtilis and AASHDPPT from Homo sapiens, reveal that each group II PPTase binds one magnesium ion. The six ligands of the magnesium ion in Sfp are two phosphates of CoA, one water molecule, and the carboxylates of Asp107, Glu109, and Glu151 of Sfp [24]. Interestingly, although the overall structure of AASHDPPT closely resembles that of Sfp, the six ligands of its magnesium ion are two phosphates of CoA, two water molecules, and the carboxylates of Asp129 and Glu181 of AASHDPPT. The two magnesium binding residues of AASHDPPT, Asp129 and Glu181, correspond to the first and the third magnesium binding residues of Sfp (Figure 1) [25]. Sfp and AASHDPPT represent the three-magnesium-binding-residue-PPTases and the two-magnesium-binding-residue-PPTases, respectively.
Understanding the relationship between structure and activity, together with the evolution of PPTases, will shed light on the catalytic mechanisms of PPTases. Here, we carried out a systematic evolutionary analysis and a biochemical analysis of group II PPTases. Our results suggest that: (i) group II PPTases can be classified into two subgroups, two-magnesium-binding-residue-PPTases with the triad Asp-Xxx-Glu and three-magnesium-binding-residue-PPTases with the triad Asp-Glu-Glu; (ii) the first and the third residues in the triads are essential for enzyme activity, whereas the second residues are non-essential; (iii) the animal group II PPTases may originate from one common ancestor, the plant two-magnesium-binding-residue-PPTases may originate from one common ancestor, and the plant three-magnesium-binding-residue-PPTases may derive from horizontal gene transfer from prokaryotes.

Data collection
Protein sequences of annotated PPTases of E. coli, Streptomyces, and Homo sapiens were obtained from the National Center for Biotechnology Information (NCBI) database and were used as queries to search the NCBI protein NR database with BLASTP and PSI-BLAST, using an e-value cutoff of 1e-6. PPTase homologs were selected based on the following criteria: sequence identity >35% and length coverage >70%. In order to obtain all available annotated PPTases, archaeal PPTases, cyanobacterial PPTases, plant PPTases, and animal PPTases were also obtained by searching GenBank, Phytozome (http://phytozome.net), and Ensembl (http://www.ensembl.org) for sequences annotated as phosphopantetheinyl transferases. PPTase data from both methods were merged, and representative sequences were used for further analysis.
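The selection criteria above can be sketched as a simple filter over search hits. This is an illustration only, not the authors' pipeline: the `BlastHit` record, its field names, and the example hits are hypothetical, and "length coverage" is assumed here to mean alignment length divided by query length, which the text does not define.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """Hypothetical record for one search hit (field names are assumptions)."""
    subject_id: str
    evalue: float
    percent_identity: float  # 0-100
    alignment_length: int    # aligned residues
    query_length: int        # residues in the query


def length_coverage(hit: BlastHit) -> float:
    """Assumed definition: aligned length as a percentage of query length."""
    return 100.0 * hit.alignment_length / hit.query_length


def is_pptase_homolog(hit: BlastHit,
                      max_evalue: float = 1e-6,
                      min_identity: float = 35.0,
                      min_coverage: float = 70.0) -> bool:
    """Apply the e-value, identity (>35%), and coverage (>70%) criteria."""
    return (hit.evalue <= max_evalue
            and hit.percent_identity > min_identity
            and length_coverage(hit) > min_coverage)


# Made-up hits, one passing and three failing on a single criterion each.
hits = [
    BlastHit("hypothetical_1", 1e-40, 52.0, 210, 230),  # passes all criteria
    BlastHit("hypothetical_2", 1e-10, 30.0, 220, 230),  # identity too low
    BlastHit("hypothetical_3", 1e-8, 41.0, 120, 230),   # coverage too low
    BlastHit("hypothetical_4", 1e-3, 60.0, 225, 230),   # e-value too high
]
kept = [h.subject_id for h in hits if is_pptase_homolog(h)]
print(kept)
```

Whether the e-value bound is inclusive is not stated in the text; the sketch treats the cutoff as inclusive.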
SchPPT was excised with NdeI/HindIII from plasmid pHJ0024 [4], in which SchPPT had been cloned as an NdeI/HindIII fragment into pET28a (Novagen), and was ligated into the same sites of pET44a, yielding plasmid pYY0041. Each of the five SchPPT point mutant genes was amplified by mutagenesis PCR (QuikChange Site-Directed Mutagenesis Kit, Stratagene) with pYY0041 as template and primers HJ0117-HJ0122 and HJ0147-HJ0150, respectively. Each of the five genes was cloned into pET44a, yielding plasmids pYY0042-pYY0046, respectively. Co-expression of each plasmid pYY0041-pYY0046 with pHJ0021 in E. coli, and purification and LC-MS analysis of scn ACP0-2, were performed according to the procedures described above.

In vivo gene complement system
The SchPPT in-frame deletion mutant was constructed using the PCR targeting system as follows [26]. The cosmid pHJ0030 [4], in which SchPPT was replaced with aac(3)IV, was transferred into E. coli DH5α/BT340 to excise the aac(3)IV gene, resulting in cosmid pHJ0034. After conjugal transfer of pHJ0034 from E. coli ET12567/pUZ8002 into S. chattanoogensis L10, exconjugants were obtained by selection with thiostrepton. Exconjugants were then inoculated onto YMG plates for two rounds of nonselective growth before thiostrepton-sensitive colonies were identified by replica plating. The resulting strain, in which SchPPT was deleted in-frame, was designated sHJ007 and confirmed by PCR analysis using primers HJ0077/HJ0078.
The protein sequence of Hppt was obtained from the NCBI database. The codons of the encoding gene were changed to the preferred codons of E. coli. The corresponding DNA sequence was chemically synthesized and cloned into the NdeI/HindIII sites of pET44a, yielding plasmid pYY0052. Each of the three Hppt point mutant genes was amplified by mutagenesis PCR with pYY0052 as template and primers H112-H115(F/R), yielding plasmids pYY0064-pYY0066, respectively. Co-expression of each plasmid (pYY0052, pYY0064-pYY0066) with pHJ0029 in E. coli, and purification and HPLC analysis of sch FAS ACP, were performed according to the procedures described above.

In vitro phosphopantetheinylation of sch FAS ACP catalyzed by Sppt or Sppt Q112E
The protein sequence of Sppt was obtained from the NCBI database. The codons of the encoding gene were changed to the preferred codons of E. coli. The corresponding DNA sequence was chemically synthesized and cloned into the NdeI/HindIII sites of pET28a, yielding plasmid pYY0060. The Sppt Q112E gene was amplified by mutagenesis PCR with pYY0060 as template and primers FK161/FK162 and cloned into the NdeI/HindIII sites of pET28a, yielding plasmid pYY0061. BL21(DE3)/pYY0060 and BL21(DE3)/pYY0061 were induced with 0.4 mM IPTG at 37 °C for 4 h to overproduce Sppt and Sppt Q112E, respectively. The proteins were purified by affinity chromatography on Ni-NTA agarose and then dialyzed against 20 mM Tris-HCl (pH 8.0), 25 mM NaCl, 1 mM DTT, and 10% glycerol.
A typical in vitro phosphopantetheinylation reaction mixture (0.1 ml) containing 100 mM Tris-HCl (pH 7.5), 1.25 mM MgCl2, 2.5 mM TCEP, 200 µM sch FAS ACP, 20 µM Sppt or Sppt Q112E, and 2 mM CoA was incubated at 25 °C for 30 min. The reactions were quenched by freezing the reaction mixtures on dry ice. HPLC analysis of sch FAS ACP was performed as described previously [4].

Phylogenetic analysis
Multiple sequence alignment (MSA) was carried out using CLUSTALW and MUSCLE with the default parameter settings [33][34]. The alignment was then manually improved using BioEdit 7.1.11, with the MSA generated by CLUSTALW used as the reference for manual adjustments. The best amino acid substitution model was determined with MEGA 6 to be LG+I+G+F. We constructed maximum likelihood (ML) and neighbor-joining (NJ) trees using PHYML and MEGA version 6.06 [35][36]. Reliability of interior branches was assessed using bootstrap support with 1000 replicates. Tree files were viewed using MEGA and edited with Adobe Illustrator.
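Bootstrap support rests on resampling alignment columns with replacement and rebuilding a tree from each pseudo-replicate; PHYML and MEGA perform this internally. The resampling step alone can be sketched as follows, as an illustration rather than a description of either program's implementation. The function name, the toy alignment, and the fixed random seed are assumptions, and tree building is omitted.

```python
import random


def bootstrap_columns(alignment: dict, rng: random.Random) -> dict:
    """Return one pseudo-replicate: same number of sites, drawn with replacement."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    if any(len(alignment[name]) != n_sites for name in names):
        raise ValueError("all aligned sequences must have the same length")
    # The same sampled column indices are applied to every sequence,
    # so each resampled site remains a genuine alignment column.
    sampled = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(alignment[name][i] for i in sampled) for name in names}


# Toy alignment (hypothetical sequences, not taken from the study).
toy_alignment = {"seqA": "DMEAK", "seqB": "DEEAR", "seqC": "DQE-K"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_columns(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

A branch's bootstrap support is then the fraction of replicate trees that contain that branch.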

Variation of the magnesium binding residues of Group II PPTases
Since the magnesium binding residues of PPTases are essential for PPTase activity, we aligned the magnesium binding residues of 556 group II PPTases from GenBank, Phytozome (http://phytozome.net), and Ensembl (http://www.ensembl.org). Interestingly, group II PPTases can be classified into two subgroups based on the number of magnesium binding residues. The three-magnesium-binding-residue-PPTases, such as Sfp, contain three magnesium binding residues, which form the triad Asp-Glu-Glu. The two-magnesium-binding-residue-PPTases, such as AASHDPPT, contain two magnesium binding residues corresponding to the first and the third magnesium binding residues of the three-magnesium-binding-residue-PPTases, forming the triad Asp-Xxx-Glu. The second residues of the triad Asp-Xxx-Glu include Met, Val, Ala, Gln, Thr, Ser, Leu, and Cys. All known animal PPTases, algal PPTases, and fungal PPTases belong to the two-magnesium-binding-residue-PPTases. All known animal PPTases and algal PPTases contain the triads Asp-Met-Glu and Asp-Ala-Glu, respectively. Most prokaryotic group II PPTases belong to the three-magnesium-binding-residue-PPTases. All known prokaryotic two-magnesium-binding-residue-PPTases contain the triad Asp-Gln-Glu. Both three-magnesium-binding-residue-PPTases and two-magnesium-binding-residue-PPTases are found in plants. Most plant two-magnesium-binding-residue-PPTases contain the triad Asp-Val-Glu (Table 2 and Figures 1, S1, S2, S3, and S4).
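The two-subgroup classification can be written as a rule on the three aligned positions. This is a sketch: the function name and the one-letter triad encoding are assumptions, and the residues are assumed to have been read from an existing alignment. The example triads follow the residues named in the text: Asp-Glu-Glu for Sfp, SchPPT, and Hppt, Asp-Met-Glu for the animal enzyme AASHDPPT, and Asp-Gln-Glu for Sppt.

```python
from collections import Counter

THREE = "three-magnesium-binding-residue-PPTase"
TWO = "two-magnesium-binding-residue-PPTase"


def classify_triad(triad: str) -> str:
    """Classify a PPTase from its triad, given as three one-letter residue codes."""
    first, second, third = triad
    if first != "D" or third != "E":
        # The first (Asp) and third (Glu) positions define both subgroups.
        return "unclassified"
    # Glu at the second position gives Asp-Glu-Glu; anything else is Asp-Xxx-Glu.
    return THREE if second == "E" else TWO


# Triads taken from the residues named in the text.
triads = {
    "Sfp": "DEE",       # Asp107, Glu109, Glu151
    "SchPPT": "DEE",    # D105, E107, E151
    "Hppt": "DEE",      # D112, E114, E155
    "AASHDPPT": "DME",  # animal PPTase: Asp-Met-Glu
    "Sppt": "DQE",      # second residue Q112
}
labels = {name: classify_triad(triad) for name, triad in triads.items()}
counts = Counter(labels.values())
print(labels)
print(counts)
```

Applied to all 556 aligned sequences, the same tally grouped by taxon would give the kind of distribution summarized in Table 2.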

Effects of magnesium binding residues of a three-magnesium-binding-residue-PPTase SchPPT
Since the second magnesium binding residue of three-magnesium-binding-residue-PPTases is missing in two-magnesium-binding-residue-PPTases, we characterized the effects of the three magnesium binding residues of the former on their activities. SchPPT from S. chattanoogensis L10 was used as a model of three-magnesium-binding-residue-PPTases. SchPPT is necessary for natamycin biosynthesis since it catalyzes the phosphopantetheinylation of scn ACPs (S. chattanoogensis natamycin biosynthetic acyl carrier proteins) in the natamycin biosynthetic PKS [4]. We constructed five point mutants of SchPPT. The first magnesium binding residue D105 and the third magnesium binding residue E151 in SchPPT were replaced with Ala, resulting in SchPPT D105A and SchPPT E151A, respectively. The second magnesium binding residue E107 was replaced with Ala, Val, and Met, resulting in SchPPT E107A, SchPPT E107V, and SchPPT E107M, respectively (Figure 1).
An in vitro co-expression system was built to characterize the activities of the SchPPT point mutants. Scn ACP0-2, the second ACP domain in the loading module of scn PKS, was used as a substrate of SchPPT and the point mutants [4]. Scn ACP0-2 in pET28a was co-expressed in E. coli with pYY0040, a derivative of pET44a from which both the His-tag gene and the Nus-Tag gene were deleted. LC-MS data showed that scn ACP0-2 produced in E. coli contained only apo-proteins, which was consistent with the results reported previously [4]. Then scn ACP0-2 in pET28a was co-expressed with SchPPT in pET44a in E. coli. LC-MS data showed that scn ACP0-2 contained both apo-proteins and holo-proteins, indicating that SchPPT could phosphopantetheinylate scn ACP0-2 under these conditions. Finally, scn ACP0-2 in pET28a was co-expressed with each of the SchPPT point mutant genes in pET44a in E. coli. LC-MS data showed that both SchPPT D105A and SchPPT E151A lost the ability to phosphopantetheinylate scn ACP0-2. However, SchPPT E107A, SchPPT E107V, and SchPPT E107M were still active (Figure S5).
To exclude the possibility that the loss of activity of SchPPT D105A and SchPPT E151A was due to protein misfolding or a lack of gene expression, both mutants were produced in E. coli as His-tagged proteins and purified to homogeneity. In vitro phosphopantetheinylation of scn ACP0-2 was performed by incubating scn ACP0-2 with CoA and each of the mutants, with wild type SchPPT as a positive control, as described previously [4]. LC-MS data showed that wild type SchPPT, but neither of the two mutants, phosphopantetheinylated scn ACP0-2 under these conditions (Figure S6).
An in vivo gene complementation system was also built to characterize the activity of the SchPPT point mutants. We deleted SchPPT in-frame in S. chattanoogensis L10, resulting in strain sHJ007. Fermentation of sHJ007 in YEME liquid medium in triplicate showed that sHJ007 lost the ability to produce natamycin, confirming that the activity of SchPPT is essential to natamycin production. We then complemented sHJ007 with SchPPT under the control of the ermEp* promoter, resulting in strain sHJ008. Fermentation of sHJ008 in YEME liquid medium in triplicate showed that sHJ008 produced natamycin with a yield of 494 mg/L at 96 h, indicating that SchPPT could complement sHJ007 under these conditions. Finally, we complemented sHJ007 with each of the five SchPPT point mutant genes under the control of the ermEp* promoter, resulting in strains sHJ009-sHJ0013. Fermentation data showed that complementation with neither SchPPT D105A nor SchPPT E151A restored natamycin production. However, complementation with SchPPT E107A, SchPPT E107V, and SchPPT E107M produced natamycin at 96 h with yields of 434 mg/L, 482 mg/L, and 188 mg/L, respectively (Figure 2). Both the in vitro and in vivo data reveal that the first and the third magnesium binding residues in SchPPT are essential for enzyme activity, whereas the second magnesium binding residue is non-essential.
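To put the complementation yields on a common scale, each mutant yield can be expressed as a percentage of the 494 mg/L obtained with wild type SchPPT in sHJ008. The short calculation below uses only the yields reported above; the variable names are illustrative, and replicate-level data are not available here, so no error estimate is attempted.

```python
# Natamycin yields at 96 h (mg/L), as reported in the text.
wild_type_yield = 494.0  # sHJ008, complemented with wild type SchPPT
mutant_yields = {"E107A": 434.0, "E107V": 482.0, "E107M": 188.0}

# Yield of each second-residue mutant relative to the wild type complement.
relative_yield = {
    mutant: round(100.0 * value / wild_type_yield, 1)
    for mutant, value in mutant_yields.items()
}

for mutant, percent in relative_yield.items():
    print(f"SchPPT {mutant}: {percent}% of wild type yield")
```

On this scale the Ala and Val substitutions retain about 88% and 98% of the wild type yield, and the Met substitution about 38%.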

Effects of magnesium binding residues of a three-magnesium-binding-residue-PPTase Hppt
Hppt, the single PPTase in Haemophilus influenzae, was also used as a model of three-magnesium-binding-residue-PPTases. Hppt, in which all codons of the wild type gene were changed to the preferred codons of E. coli, was chemically synthesized and cloned into pET44a, yielding plasmid pYY0052. The ACP of the FAS in S. chattanoogensis L10, sch FAS ACP, was used as the substrate of Hppt. Plasmid pHJ0029 [4], in which sch FAS ACP was cloned into pET28a, was co-expressed with pYY0040 in E. coli. HPLC data showed that sch FAS ACP produced in E. coli contained both apo-proteins and holo-proteins (Figure 3), which was consistent with the previous finding that E. coli ACPS phosphopantetheinylates sch FAS ACP incompletely [4]. Sch FAS ACP in pET28a was then co-expressed with pYY0052 in E. coli. HPLC data showed that sch FAS ACP contained only holo-proteins, indicating that Hppt could phosphopantetheinylate sch FAS ACP under these conditions. We finally constructed three point mutants of Hppt. The magnesium binding residues (D112, E114, and E155) were replaced with Ala, resulting in Hppt D112A, Hppt E114A, and Hppt E155A, respectively (Figure S1). Sch FAS ACP in pET28a was then co-expressed with each of the Hppt point mutant genes in pET44a in E. coli. Among the three Hppt point mutants, only Hppt E114A retained this activity (Figure 3).

Table 2. Distribution of group II PPTases.

Construction of a three-magnesium-binding-residue-PPTase mimic based on a two-magnesium-binding-residue-PPTase Sppt
Sppt, the single PPTase in Synechocystis sp. PCC6803, was used as a model of two-magnesium-binding-residue-PPTases. It has been reported that Sppt phosphopantetheinylates ACPs of type II FASs but not ACPs from secondary metabolism [20]. Sppt, in which all codons of the wild type gene were changed to the preferred codons of E. coli, was chemically synthesized and cloned into pET28a. Sppt was produced in E. coli and then purified to homogeneity. Sch FAS ACP was also used as the substrate of Sppt. After incubation of sch FAS ACP with CoA in the presence of Sppt, HPLC analysis showed that all apo-proteins were converted into holo-proteins, indicating that Sppt could phosphopantetheinylate sch FAS ACP. We constructed a three-magnesium-binding-residue-PPTase mimic based on Sppt. The second residue in the triad of Sppt, Q112, was replaced with Glu, resulting in the three-magnesium-binding-residue-PPTase mimic Sppt Q112E (Figure 1). After incubation of sch FAS ACP with CoA in the presence of Sppt Q112E, HPLC analysis showed that all apo-proteins were converted into holo-proteins, indicating that Sppt retained its activity after mutation into a three-magnesium-binding-residue-PPTase mimic (Figure 3).

Gene synteny and gene duplication
To study the colinearity of group II PPTases, 56 PPTases from representative species were selected for gene synteny analysis (Table 3 and Figures S7, S8, S9, and S10). With the exception of six animal PPTases and three plant two-magnesium-binding-residue-PPTases, the PPTases do not show any conserved gene synteny (Figure 4). Some organisms contain more than one copy of a group II PPTase encoding gene. The genome of Oryza sativa Japonica Group contains three group II PPTase encoding genes, oryzaI, oryzaII, and oryzaIII, located on chromosome 8, chromosome 11, and chromosome 12, respectively. OryzaII and OryzaIII may derive from gene duplication, since they have the same triad Asp-Val-Glu, high DNA/protein sequence similarity/identity, and conserved gene synteny at their encoding loci. However, OryzaI may have a different origin from OryzaII and OryzaIII, since OryzaI has a different triad, Asp-Glu-Glu, has low DNA/protein similarity/identity compared with OryzaII/OryzaIII, and shows no gene synteny with OryzaII/OryzaIII.

Phylogenetic relationships between the PPTases
To study the evolutionary relationships among group II PPTases, the 56 PPTases selected above were also analyzed by different phylogenetic methods. The maximum likelihood tree can be separated into a three-magnesium-binding-residue-PPTase part and a two-magnesium-binding-residue-PPTase part (Figure 5). The two-magnesium-binding-residue-PPTase part included animal PPTases, algal PPTases, fungal PPTases, and plant two-magnesium-binding-residue-PPTases. The three-magnesium-binding-residue-PPTase part included plant three-magnesium-binding-residue-PPTases and prokaryotic three-magnesium-binding-residue-PPTases. All 10 animal PPTases form one clade. These animal PPTases may originate from one common ancestor since (i) they have the same amino acid (Met) at the second position of the triad Asp-Xxx-Glu; (ii) they are closely related homologs in the phylogenetic tree; and (iii) the vertebrate PPTases even show gene synteny. Notably, the phylogenetic tree of the animal PPTases is consistent with animal species evolution (Figures 5 and S10). Interestingly, the 10 plant PPTases are distinctly separated into two clades, the clade of plant two-magnesium-binding-residue-PPTases and the clade of plant three-magnesium-binding-residue-PPTases. The plant two-magnesium-binding-residue-PPTases may originate from one common ancestor since (i) they have the same amino acid (Val) at the second position of the triad Asp-Xxx-Glu; (ii) they are clustered in the phylogenetic tree; and (iii) they show gene synteny. The plant three-magnesium-binding-residue-PPTases may derive from horizontal gene transfer from prokaryotes since they and most prokaryotic PPTases have the same triad Asp-Glu-Glu and are closely related homologs. Prokaryotic two-magnesium-binding-residue-PPTases are close to prokaryotic three-magnesium-binding-residue-PPTases but not to eukaryotic two-magnesium-binding-residue-PPTases.

Discussion
To date, all known group II PPTases contain two or three magnesium binding residues. Three-magnesium-binding-residue-PPTases, which include most prokaryotic group II PPTases and some plant group II PPTases, contain three conserved magnesium binding residues forming the triad Asp-Glu-Glu. Two-magnesium-binding-residue-PPTases, which include most eukaryotic group II PPTases, carry the triad Asp-Xxx-Glu and contain two conserved magnesium binding residues, which correspond to the first and the third magnesium binding residues of the three-magnesium-binding-residue-PPTases.
Here, characterization of the point mutants of two three-magnesium-binding-residue-PPTases (SchPPT and Hppt) showed that mutations of the first and the third residues in the triad abolished their activities. Our data are consistent with the results that mutations of the first and the third residues in the triads of Sfp, Lys5, and AASHDPPT abolished the activities or decreased them by more than 20-fold [23,25,39,40].
Our results showed that SchPPT and Hppt retained their activities after mutation into two-magnesium-binding-residue-PPTase mimics, as did Sppt (a two-magnesium-binding-residue-PPTase) after mutation into a three-magnesium-binding-residue-PPTase mimic. However, in the absence of structural information, it is unknown whether replacing the triad Asp-Xxx-Glu of a two-magnesium-binding-residue-PPTase with the triad Asp-Glu-Glu results in a bona fide three-magnesium-binding-residue-PPTase.
The conservation of the first and the third residues in the triads of all known PPTases, together with our biochemical results, suggests that the first and the third residues in the triads of group II PPTases are essential for activity. The variation of the second residues in the triads, together with our biochemical results, suggests that the second residues are non-essential for activity. However, although the second residues in the triads are not critical for function, they are conserved in animals (Met), algae (Ala), plants (Val and Glu), and most prokaryotes (Glu). Therefore, the variation at this site is not random and can be used for species classification. The fixation of the second residues of the triads in different taxa may be due to selective sweeps or other evolutionary forces. Most likely, the mutations of the second magnesium binding residue are due to random genetic drift, and the fixation of this residue in separate clades is largely independent of fitness, which could be explained by random fixation of very slightly deleterious mutations, as suggested by neutral evolution theory. A better understanding of the evolution of the PPTase gene family will provide new insights into the mechanism of this important enzyme at the systems level [41][42].