The Evolutionary Portrait of Metazoan NAD Salvage

Nicotinamide Adenine Dinucleotide (NAD) levels are essential for cellular homeostasis and survival. Main sources of intracellular NAD are the salvage pathways from nicotinamide, where Nicotinamide phosphoribosyltransferases (NAMPTs) and Nicotinamidases (PNCs) have a key role. NAMPTs and PNCs are important in aging, infection and disease conditions such as diabetes and cancer. These enzymes have been considered redundant since either one or the other exists in each individual genome. The co-occurrence of NAMPT and PNC was only recently detected in invertebrates though no structural or functional characterization exists for them. Here, using expression and evolutionary analysis combined with homology modeling and protein-ligand docking, we show that both genes are expressed simultaneously in key species of major invertebrate branches and emphasize sequence and structural conservation patterns in metazoan NAMPT and PNC homologues. The results anticipate that NAMPTs and PNCs are simultaneously active, raising the possibility that NAD salvage pathways are not redundant as both are maintained to fulfill the requirement for NAD production in some species.


Introduction
Nicotinamide Adenine Dinucleotide (NAD) is an essential molecule to cells. As a cofactor in redox reactions, NAD regulates the metabolism and energy production and, as a substrate for NAD-consuming enzymes such as poly(ADP-ribose) polymerases (PARPs) and sirtuins, NAD is involved in DNA repair, transcriptional silencing and cell survival [1]. To maintain adequate NAD levels, several routes are used for NAD synthesis that depend on distinct precursors: de novo pathways synthesize NAD from tryptophan or aspartic acid whereas salvage pathways recycle NAD from nicotinamide (Nam), nicotinic acid (Na) and their ribosides [2][3][4].
The nicotinamide salvage pathway is the major source of intracellular NAD in humans [5,6] and is also required for growth in several microorganisms [7][8][9][10]. NAD salvage from Nam is a two- or four-step reaction, in which the rate-limiting enzymes and functional homologues are, respectively, nicotinamide phosphoribosyltransferases (NAMPTs) and nicotinamidases (PNCs) [11][12][13]. In humans, NAMPT is widely studied because of its involvement in inflammation and in diseases such as cancer [14,15]. In contrast, humans lack nicotinamidase, but expression of the Drosophila Pnc protects human neuronal cells from death caused by oxidative stress [16]. Moreover, increased Pnc1 and sirtuin activity confers protection against proteotoxic stress in yeast and C. elegans [17,18]. The yeast Pnc1 is a biomarker of stress and a regulator of sirtuin activity [11,18]; thus, most studies in yeast and invertebrates have focused on the link between these enzymes and aging [16,19]. Despite their importance to major cellular processes, nicotinamidases remain poorly characterized functionally [20,21] and their role in infection has been less explored [7,8,22].
NAMPTs and PNCs do not co-occur in all organisms and, until very recently, lineages with both NAMPT and PNC had only been found in bacteria and algae [30][31][32]. NAMPT was thought to be absent from invertebrates, but the discovery that NAMPT homologues are present in several invertebrate species, and that some species have both NAMPT and PNC homologues [33], challenged the classical view that these enzymes are redundant and mutually exclusive [1], emphasizing the need for studies that characterize their structural and functional properties.
Motivated by the lack of information for NAMPT and PNC homologues in relevant invertebrate species, which would render the biological meaning of simultaneous versus unique occurrence of these proteins more evident, we carried out an integrated study to establish gene expression, amino acid conservation and structural comparisons. We provide experimental evidence that both genes are expressed simultaneously in key invertebrate species. In addition, evolutionary conserved patterns at the amino acid sequence and at the structural levels were detected. Also, using homology modeling and protein-ligand docking, we identify the amino acids that bind Nam in the active sites of invertebrate NAMPTs and PNCs. Taken together, the results suggest that invertebrate NAMPTs and PNCs are concurrently functional and, thus, that both NAD salvage pathways might not be redundant.

Expression of invertebrate NAMPTs and PNCs
NAMPT homologues have been previously found in the vibriophage KVP40 [34], bacteria [10,32], and the unicellular green alga Chlamydomonas reinhardtii [31], motivating the search for NAMPT homologues in invertebrates, some of which also have PNC sequences [33] (Table S1). No recognizable NAMPT homologue has been detected so far in representative species of the phyla Arthropoda or Nematoda, although NAMPT and PNC were found in more basal lineages such as the choanoflagellate Monosiga brevicollis and the sea anemone N. vectensis (Figure 1). This phylogenetic distribution is consistent with a scenario in which both genes were present in the Metazoan ancestor and were selectively lost in specific lineages, as evidenced by the different patterns in protostomes: both genes were found in lophotrochozoans, which include mollusks (Lottia gigantea) and annelids (C. teleta and Helobdella robusta), whereas NAMPT is absent in ecdysozoans such as nematodes and arthropods. In deuterostomes, which comprise chordates, hemichordates and echinoderms, both genes were likely present in early lineages, as supported by evidence from the extant species B. floridae, Saccoglossus kowalevskii and S. purpuratus, but NAMPT was secondarily lost in the urochordate Ciona intestinalis while PNC was lost in vertebrates (Figure S1). RT-PCR of selected species showed that both NAMPT and PNC genes are expressed in the adult forms of Branchiostoma floridae (Cephalochordata), Strongylocentrotus purpuratus (Echinodermata), Capitella teleta (Annelida) and Nematostella vectensis (Cnidaria) (Figure 1). In addition, available EST (Expressed Sequence Tag) data indicate that NAMPT and PNC genes are also co-expressed during developmental stages (Table S2), suggesting widespread usage of both Nam salvage pathways across Metazoans.

Evolutionary divergence of NAMPTs and PNCs
We further characterized the evolutionary divergence of NAMPT and PNC homologues, measured as protein distances calculated from amino acid sequence alignments (Figure 2). The resulting matrix (Figure 2) showed that NAMPT is conserved, even across large evolutionary distances. For example, the divergence between the human and cnidarian (N. vectensis) NAMPT homologues is about 50%, similar to that between the human and amphioxus (B. floridae) homologues. Among invertebrates, the sequences showing the smallest divergence are from N. vectensis and C. teleta (31.2%). Conversely, PNC sequences are highly divergent even in closely related species, as shown for the annelids C. teleta and H. robusta, or the basal chordates B. floridae and C. intestinalis. Curiously, the smallest divergence between PNC sequences was found for C. teleta and B. floridae (51.3%). This trend was also evident when we plotted protein distances while implicitly taking into account the evolutionary divergence time between each pair of species studied (Movie S1 and Table S3). Analyses of protein distances (pd) indicated that NAMPT homologues are considerably more conserved (pd = 0.447 ± 0.116) than PNC homologues (pd = 0.842 ± 0.151) (mean ± std), which is remarkable for species spanning over 1000 million years of divergence (Table S3). For PNC proteins, in addition to the larger values, no correlation with evolutionary distance was observed, while NAMPT distances were smaller and increased consistently with the evolutionary distance (ed) between species. The Kendall rank correlation coefficient (τ) was used to measure the dependence between pd and ed, showing no relevant dependence between the two quantities for PNC (τ = -0.052), whereas for NAMPT the two quantities vary consistently (τ = 0.413).
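As an illustration of the rank correlation used above, the following is a minimal sketch of a Kendall tau-b calculation in plain Python. It is not the MATLAB routine used in this study, and the divergence times and protein distances in the example are hypothetical values chosen only to show a monotonic and a non-monotonic trend; they are not taken from Table S3.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall rank correlation (tau-b) between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from both denominator terms
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denominator == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical illustration: divergence times (million years) and protein distances.
divergence_time = [100, 300, 550, 700, 1000]
pd_monotonic = [0.20, 0.31, 0.40, 0.45, 0.52]  # distance grows with divergence time
pd_scattered = [0.90, 0.70, 0.95, 0.60, 0.85]  # no consistent trend

tau_monotonic = kendall_tau_b(divergence_time, pd_monotonic)
tau_scattered = kendall_tau_b(divergence_time, pd_scattered)
```

A coefficient near 1 indicates that protein distance increases consistently with divergence time, as reported for NAMPT, whereas a coefficient near 0 indicates no consistent relationship, as reported for PNC.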

Motif conservation in NAMPTs and PNCs
We next used the previously constructed amino acid sequence alignments to search for conserved motifs in NAMPT and PNC homologues. In line with the aforementioned results, analyses of NAMPT sequences (Figure 3A) revealed conserved amino acid motifs surrounding the catalytic residues [24,25,35,36,37] Tyr18, Phe193, Asp219, His247, Asp279 and Asp313, corresponding to the boxed amino acids in Figure 3. In addition, Asp16 and Arg311, Gly353 and Asp354, and Gly384, which bind nicotinamide, ribose or phosphate, respectively, are preserved, and the additional NMN-interacting residues Arg196 and Gly383 in rat NAMPT [25] are present in all sequences analyzed. The amino acid stretches that represent the dimer interface are also conserved in invertebrate NAMPTs (Figure 3A and Figure S2), as previously shown for vertebrates [25].
Similar analyses of PNC homologues showed that, while overall amino acid sequence identity is low (Figure 3B), motifs surrounding metal-binding and catalytic residues (boxed amino acids) are evident. Indeed, all PNC sequences have conserved residues that coordinate the metal ion (corresponding to Saccharomyces cerevisiae Asp51, His53 and His94) and form the catalytic triad (S. cerevisiae Asp8, Lys122 and Cys167). The characteristic cis-peptide bond that has been identified in available nicotinamidase/pyrazinamidase structures also corresponds to conserved residues present in these species, namely Val-Ala in Pyrococcus horikoshii, S. cerevisiae, Leishmania infantum and C. intestinalis [7,38,39], Ile-Ala in Mycobacterium tuberculosis, Acinetobacter baumannii, H. robusta and B. floridae [40,41], or Val-Leu in Streptococcus pneumoniae [42]; these residues are preceded by a conserved glycine that has a role in catalysis [38,40,41]. Additionally, mutations that lead to loss of pyrazinamidase activity in M. tuberculosis have defined residues that delineate the active site scaffold [38], corresponding to S. cerevisiae Glu10, Asp12, Phe13, Leu20, His57, Trp91, Gly123, Tyr131, Ser132, Val162, Ala163, Tyr166 and Thr171, and most of them are also conserved in all invertebrate PNC sequences (Figure 3B and Figure S3).

Genetic architecture conservation of NAMPT homologues
Given the degree of conservation of both proteins, and taking into account the divergence times of over 1000 million years between the species considered here, we investigated conservation at the gene structure and genome organization levels. NAMPT retained microsynteny in chordates, as indicated by the conserved gene order between H. sapiens, M. musculus, D. rerio and B. floridae, and also showed macrosynteny conservation in some lineages, namely between Trichoplax adhaerens and either H. sapiens, N. vectensis or M. brevicollis (Figure 4 and Figures S4, S5). For PNC homologues, no syntenic regions were found. Although recent studies point to a higher level of microsynteny conservation in metazoans than previously estimated [43], such evidence is difficult to obtain in some lineages because of poor genome annotation and assemblies fragmented into small scaffolds. At the level of exon-intron structure, NAMPT is more homogeneous than PNC in the number and size of exons and in total gene length (Figures S6, S7). The exception is N. vectensis, where NAMPT has many small exons spanning 14 kb in the genome, while PNC has only two exons in less than 2 kb. Using the information on conserved motifs and gene structure, we were able to identify NAMPT and PNC homologues and to predict the corresponding gene structures in the hemichordate S. kowalevskii, a phylogenetically informative species (Figure S8).

Secondary structure conservation of PNC homologues
Nicotinamidase sequences are poorly conserved even in closely related species (Figures 2 and 3). Yet, for structures determined in archaea (P. horikoshii, PDB id: 1IM5), bacteria (A. baumannii, PDB id: 2WTA) and fungi (S. cerevisiae, PDB id: 2H0R), which share only 30% protein identity (Figure 5A), the 3D structures are perfectly superimposable (Figure 5B). Such structural conservation is observed across the three domains of life, as all PNC enzymes share a similar core fold (Figure S9), with a potential increase in complexity of the enzyme, which is active as a monomer in P. horikoshii [38], a dimer in A. baumannii [40] and a heptamer in S. cerevisiae [39]. We therefore aligned PNC sequences based on secondary structure predictions and determined that invertebrate PNCs also show structural conservation (Figure 6). The regions corresponding to alpha-helices (red) and beta-sheets (yellow) are conserved at the structural level, even if the amino acids are not (Figure 6A). For instance, the alpha-helices of regions I, II and III comprise different amino acids while displaying a similar fold. To illustrate this, region II is shown in detail for P. horikoshii, A. baumannii and S. cerevisiae (Figure 6B).

Modeling and docking analyses of invertebrate NAMPTs and PNCs
To gain insight into the structures of invertebrate NAMPTs and PNCs, we performed homology modeling and protein-ligand docking. To overcome limitations in the interpretation of results, we used several templates to generate the models (Table S4). The LIGPLOT program was used to generate schematic diagrams of the interactions between ligand (nicotinamide, NCA) and receptor (NAMPT and PNC), which are shown in Figure 7. The redocking tests performed to assess prediction accuracy for NAMPT (PDB 2E5D from H. sapiens) and PNC (PDB 3R2J from L. infantum) were in agreement with the ligand-receptor conformation in these X-ray structures. We obtained a similar active site ligand-receptor interaction for both NAMPT and PNC, which ensures that the docking approach was accurate enough to be applied to the various molecular systems.
In the NAMPT active site, all species except N. vectensis maintained most of the ligand-receptor interactions when compared with the structure of human NAMPT (Figure 7A). The homologous NAMPT of B. floridae has a hydrogen bond network that stabilizes the active site with two H-bonds between the sidechain of Arg-293 and the oxygen atom of the ligand. A similar bonding network can be observed in the human protein (PDB 2E5D), where Asp-219 binds to the nitrogen atom of the substrate (NCA). Hydrophobic interactions are similar when compared with

Discussion
Nicotinamide phosphoribosyltransferases (NAMPTs) and nicotinamidases (PNCs) are the main NAD salvage enzymes and, until recently, were thought to occur in distinct lineages. Our data show that several Metazoan species have predicted homologues of both enzymes and that both genes are simultaneously expressed in B. floridae, S. purpuratus, C. teleta and N. vectensis. The distribution of NAMPT and PNC homologues points to the presence of both genes in early eukaryote evolution with selective gene loss and retention in different animal lineages. Interestingly, loss of either one of the genes was predominantly found in fast evolving lineages, namely D. melanogaster, C. elegans and C. intestinalis, while slow-evolving species such as B. floridae retained both [44]. This is also reflected in genome architecture, with conserved NAMPT microsynteny in vertebrates and B. floridae.
We also highlight different conservation patterns in NAMPT and PNC homologues, at the protein amino acid sequence and at the 3D structural level. NAMPT sequences are highly conserved, as evidenced by small evolutionary divergences between species and long stretches of identical amino acids surrounding important catalytic and structural positions. As dimerization is required for NAMPT activity [25], in addition to active site residues that interact with the substrates and reaction products, amino acids that constitute the dimer interface are also conserved. For PNCs the sequence identity is lower, yet, critical amino acids are conserved and the overall fold is maintained in all the three domains of life. These are unifying features of nicotinamidases, even though there is a diversity of catalytic mechanisms described, with some exceptions concerning metal binding and metal ion coordination [7,20,21,29,41,42].
Homology modeling and protein-ligand docking indicate that active site residues and interactions of invertebrate NAMPTs with the substrate, nicotinamide, are similar to those described for vertebrate NAMPTs [24,25,35,36,37]. In invertebrate PNCs, most interactions are maintained, while additional hydrogen bonds and hydrophobic contacts were found. These new interactions might derive from complementary amino acid changes resulting from epistatic interactions between residues [21,45], which is consistent with the structural conservation of PNCs.
Our analyses validate invertebrate NAMPTs and PNCs, suggesting that both the two-step and the four-step NAD salvage pathways are functional in these organisms. These findings imply that either these enzymes are not redundant, or that specific metabolic requirements call for increased NAD production in some species that only the presence of both enzymes would fulfill.

Sequence analysis
The human NAMPT and the yeast PNC1 amino acid sequences were used as queries in BLAST searches [46] against sequenced genomes from the National Center for Biotechnology Information, NCBI (http://www.ncbi.nlm.nih.gov/sites/genome) and the Joint Genome Institute, JGI (http://genome.jgi-psf.org/). In organisms with multiple hits, the reciprocal best hit was selected for further analysis. All sequences retrieved in this process and further analyzed are listed in Table S1. Estimates of evolutionary divergence between sequences were conducted in MEGA5 [47] and calculated as the number of amino acid substitutions per site. Analyses were conducted using the Poisson correction model [48] and involved 13 amino acid sequence homologues for each protein. Positions containing gaps and missing data were eliminated, resulting in a total of 436 (NAMPT) and 167 (PNC) positions in the final dataset. Alignments were visualized in Geneious [49] v5.5.6 to generate logos. Structural alignments of PNC homologues were performed in Ali2D (http://toolkit.tuebingen.mpg.de/ali2d). Divergence times between species were estimated using TimeTree (http://www.timetree.org/) [50]. MATLAB version R2010b was used to generate 3D graphics (the input data is shown as Table S3) and to calculate Kendall rank correlation coefficients. Correlations were measured against a reference function consisting of a monotonic increasing function of protein distances against evolutionary divergence (the hypothesis). Synteny was determined using the CHSMiner software (http://www.biosino.org/papers/CHSMiner/) [51] and the JGI genome portal (http://genome.jgi-psf.org/). Saccoglossus kowalevskii BLAST searches were also performed as described above (http://blast.hgsc.bcm.tmc.edu/blast.hgsc?organism=20), the corresponding genome contigs (115790 and 40985) were retrieved, and the NAMPT and PNC protein sequences were manually predicted based on the conserved motifs identified. Exon predictions were then performed in GENSCAN (http://genes.mit.edu/GENSCAN.html).
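For illustration, the distance calculation described above can be sketched as follows. This is a minimal Python sketch and not the MEGA5 implementation: it removes every alignment column that contains a gap or missing-data symbol and then applies the Poisson correction d = -ln(1 - p), where p is the proportion of differing sites. The set of symbols treated as missing data and the short example sequences are assumptions made only for this example.

```python
from math import log

# Assumption: these symbols mark gaps or missing data in the alignment.
GAP_CHARS = set("-?X")


def complete_deletion(alignment):
    """Drop every column in which any sequence has a gap or missing-data symbol."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    kept_columns = [col for col in zip(*alignment) if not GAP_CHARS.intersection(col)]
    if not kept_columns:
        return ["" for _ in alignment]
    return ["".join(residues) for residues in zip(*kept_columns)]


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected number of amino acid substitutions per site."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    differing = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    p = differing / len(seq_a)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -log(1.0 - p)


# Hypothetical toy alignment (not real NAMPT or PNC sequences).
toy_alignment = ["MKT-LV", "MRTALV", "MKSALI"]
filtered = complete_deletion(toy_alignment)
d_01 = poisson_distance(filtered[0], filtered[1])
d_02 = poisson_distance(filtered[0], filtered[2])
```

In the toy alignment the fourth column is removed because the first sequence has a gap there, leaving five positions for every pairwise comparison.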

Molecular homology modeling
Prime [52] was used to search for homologous proteins in the NCBI PDB database (http://www.rcsb.org/pdb/home/home.do) for PNC and NAMPT. PDB templates (Table S4) were selected considering the lowest e-values (<1×10⁻⁶) and structures without many missing residues (gaps <20%). PNC and NAMPT sequences for the species B. floridae, C. teleta, S. purpuratus and N. vectensis were used to generate the alignments with homologous proteins. For secondary structure prediction, the third-party program SSpro [53,54] was used, and all the templates were then re-aligned to the query sequence. The resulting alignment was used to build the protein models. The LigPrep [55] interactive optimizer (protein preparation wizard) with neutral pH was used to optimize the protein model. Finally, hydrogens were added, bond orders were assigned and selenomethionines were converted to methionines for the generated models.

Molecular docking simulations
The 3D structures of ligands were obtained from the PDB structures. The protein-ligand complexes were prepared with AutoDockTools [56,57]: hydrogen atoms were added for each protein and Kollman united atom charges were assigned. Hydrogens were also added to the ligand (NCA) and charges were calculated by the Gasteiger-Marsili method. The rotatable bonds in the ligands were assigned with AutoDockTools. The Zn atom of PNC was assigned a charge of +2. AutoDock4.2 [56] was used to perform protein-ligand docking calculations. To ensure the accuracy of our methodological approach, we first redocked the two most recently available X-ray structures (NAMPT PDB 2E5D and PNC PDB 3R2J) and then applied the approach to the various predicted protein-ligand systems. Various grid sizes were tested, using as structural criterion the similarity between our docked results and the X-ray structures of H. sapiens NAMPT (2E5D) and L. infantum PNC (3R2J). We selected a grid box of 30×25×40 Å for NAMPT and 35×35×40 Å for PNC, centered on the mean of the C2-C5 ligand atom distance, with a grid spacing of 0.375 Å, as shown in Table S4.
We considered the binding pockets described in the literature [7,36] (also shown in Table S5) to perform the flexible protein-ligand docking. The corresponding residues in the homology alignment are described in Table S5. We performed the docking simulations using 100 independent Lamarckian genetic algorithm (LGA) runs, with the population size set to 200, the number of energy evaluations set to 10 000 000 and the maximum number of generations set to 27 000. All other parameters were used as default [58,59]. The results were analysed by clustering together the conformations within an RMSD of 2 Å. For each species, we selected the cluster with the lowest energy and with a conformation similar to the X-ray structure of NAMPT (PDB id: 2E5D) or PNC (PDB id: 3R2J).
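The clustering step described above can be illustrated with a minimal Python sketch. It is a simplified, energy-ordered clustering written for this example and is not the AutoDock code: poses are sorted by docking energy, and each pose joins the first cluster whose lowest-energy member lies within the RMSD tolerance, otherwise it seeds a new cluster. RMSD is computed directly in the receptor frame without superposition or symmetry handling, and the coordinates and energies are hypothetical.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must contain the same non-zero number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return sqrt(squared / len(coords_a))


def cluster_poses(poses, tolerance=2.0):
    """Group (energy, coords) poses; each cluster is seeded by its lowest-energy pose."""
    clusters = []
    for pose in sorted(poses, key=lambda item: item[0]):
        for cluster in clusters:
            seed_coords = cluster[0][1]
            if rmsd(pose[1], seed_coords) <= tolerance:
                cluster.append(pose)
                break
        else:
            clusters.append([pose])  # no cluster within tolerance: start a new one
    return clusters


# Hypothetical two-atom poses: (docking energy in kcal/mol, atom coordinates in Å).
example_poses = [
    (-5.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-4.5, [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]),
    (-4.8, [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]),
]
example_clusters = cluster_poses(example_poses, tolerance=2.0)
best_cluster = example_clusters[0]  # cluster seeded by the lowest-energy pose
```

Here the first and second poses differ by an RMSD of 0.5 Å and fall into the same cluster, while the third pose lies 5 Å away and forms its own cluster.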

H-bonds and hydrophobic interactions for ligand-receptor molecules
Interactions between the ligand (NCA) and receptors (NAMPT and PNC) were calculated using LIGPLOT [60]. The hydrogen bonds were calculated using geometrical criteria [61] for the protein-ligand complex (the criteria used were: H-A distance <2.7 Å, D-A distance <3.3 Å, D-H-A angle >90°, D-A-AA angle >90° and H-A-AA angle >90°, where A is the hydrogen acceptor, D is the hydrogen donor, AA is the atom attached to the hydrogen acceptor, and H is the hydrogen atom). LIGPLOT also calculates non-covalent bond interactions (hydrophobic interactions) by applying a simple cut-off of 3.9 Å. LIGPLOT diagrams were generated for each species. PyMOL [62] was used to generate the 3D images.
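The geometrical criteria listed above can be expressed as a short check. The following Python sketch is an illustration only and is not the LIGPLOT implementation; the function names and the example coordinates are hypothetical, and the thresholds are those stated in the text.

```python
from math import acos, degrees, dist


def angle_deg(p, vertex, q):
    """Angle p-vertex-q in degrees for three (x, y, z) points."""
    norm = dist(p, vertex) * dist(q, vertex)
    if norm == 0:
        raise ValueError("angle is undefined for coincident atoms")
    dot = sum((a - v) * (b - v) for a, v, b in zip(p, vertex, q))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return degrees(acos(cosine))


def is_hydrogen_bond(donor, hydrogen, acceptor, acceptor_antecedent):
    """Apply the distance and angle criteria for D-H...A-AA given in the text."""
    return (
        dist(hydrogen, acceptor) < 2.7                              # H-A distance
        and dist(donor, acceptor) < 3.3                             # D-A distance
        and angle_deg(donor, hydrogen, acceptor) > 90.0             # D-H-A angle
        and angle_deg(donor, acceptor, acceptor_antecedent) > 90.0  # D-A-AA angle
        and angle_deg(hydrogen, acceptor, acceptor_antecedent) > 90.0  # H-A-AA angle
    )


# Hypothetical coordinates in Å, chosen only to exercise the criteria.
donor = (0.0, 0.0, 0.0)
hydrogen = (1.0, 0.0, 0.0)
near_acceptor = (2.9, 0.0, 0.0)
near_antecedent = (3.5, 1.0, 0.0)
far_acceptor = (4.0, 0.0, 0.0)
far_antecedent = (4.6, 1.0, 0.0)

bond_found = is_hydrogen_bond(donor, hydrogen, near_acceptor, near_antecedent)
bond_rejected = is_hydrogen_bond(donor, hydrogen, far_acceptor, far_antecedent)
```

The second geometry is rejected because both the H-A distance (3.0 Å) and the D-A distance (4.0 Å) exceed their thresholds.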

Expression analysis
B. floridae (whole organism), C. teleta (whole organism), S. purpuratus (gonad) and N. vectensis (whole organism) samples were obtained from Ocean Genome Legacy (OGL Accession ID numbers S13045, S13061, S13034 and S13115, respectively) [63]. RNA was extracted with the Illustra TriplePrep kit (GE Healthcare) and genomic DNA was removed from RNA preparations with an additional DNase treatment using DNase I, RNase-free (Fermentas, Thermo Fisher Scientific Inc.), according to the manufacturer's procedure. Complementary DNA (cDNA) was synthesized from 1 µg of total RNA using the RETROscript® First Strand Synthesis Kit (Ambion) with oligo-dT primers according to the manufacturer's instructions. Reverse-transcription PCR reactions were prepared using the HotStarTaq® Master Mix Kit (Qiagen) with 2 µl of the synthesized cDNA in a 10 µl final volume. Q solution (10%) was included in the reactions for NAMPT amplification in B. floridae and S. purpuratus. PNC and NAMPT were amplified with the species-specific primers described in Table S6, at a final concentration of 0.2 µM. Thermocycling conditions were as follows: initial denaturation at 95 °C for 15 min; 40 cycles of 95 °C for 30 sec, a variable annealing temperature ranging from 52 °C to 62 °C (Table S6) for 1 min 30 sec, and 72 °C for 1 min; and a final extension step of 10 min at 72 °C. All amplification products were visualized on 1.5% agarose gels and were confirmed by sequencing. For that, PCR products were purified with ExoSAP-IT (USB Corporation) by incubation at 37 °C for 15 min, followed by enzyme inactivation for 15 min at 85 °C. The resulting purified fragments were sequenced using an ABI Big Dye Terminator Cycle Sequencing Ready Reaction kit v 3.1 (Applied Biosystems) and analyzed in an ABI PRISM 3130xl (Applied Biosystems).
Expressed Sequence Tag (EST) information was retrieved from available databases for B. floridae [64], S. purpuratus [65] and N. vectensis [66] and is detailed in Table S2.

Supporting Information
Movie S1 Evolutionary divergence between NAMPT and PNC homologues. Protein distances were plotted for each pair of species arranged according to their respective divergence time. This plot shows that NAMPT is highly conserved across large evolutionary distances, while PNC is less conserved even in closely related species. Note that, in addition to NAMPT being highly conserved, its protein distances correlate with evolutionary distances (Kendall coefficient of 0.413), as opposed to those of PNC (Kendall coefficient of -0.052).