Phylogenetic based dissection of eukaryotic Mo-insertase functionality: From mechanism to complex assembly

Tim Julian Schmidt; Ahmed H. Hassan; Boas Pucker; Tobias Kruse

doi:10.1371/journal.pone.0350191

Abstract

Molybdenum cofactor (Moco) biosynthesis is vitally important for all organisms, yet the domain organization of the eukaryotic molybdenum insertase (Mo-insertase) remains enigmatic. We combine extensive phylogenetic reconstructions, sequence analysis and structural modeling to uncover evolutionary and functional principles of eukaryotic Mo-insertases. We note that the vast majority of plant, fungal and animal species evolved fused E- and G-domains, yet the orientation of the two domains in the fusion proteins differs among eukaryotic lineages. Despite the divergent domain arrangements among eukaryotic Mo-insertases, the E-domain active site is well conserved, with very few tolerated substitutions. Among the Mo-insertases from different eukaryotic species, vertebrate gephyrin is the only Mo-insertase with a dual function: in addition to its metabolic function, it scaffolds inhibitory neurotransmitter receptors in the postsynapse. Gephyrin is surprisingly highly conserved, including surface patches not directly involved in catalysis or receptor clustering. This profile suggests additional, as yet uncharacterized, functional constraints on gephyrin’s evolution. Together, our results reveal how eukaryotic Mo-insertases combine evolutionary plasticity in domain organization with stringent active site conservation, and they identify the evolutionary constraint on gephyrin’s surface conservation as extreme, likely due to its dual metabolic and neuronal function.

Citation: Schmidt TJ, Hassan AH, Pucker B, Kruse T (2026) Phylogenetic based dissection of eukaryotic Mo-insertase functionality: From mechanism to complex assembly. PLoS One 21(6): e0350191. https://doi.org/10.1371/journal.pone.0350191

Editor: Shailender Kumar Verma, University of Delhi, INDIA

Received: June 23, 2025; Accepted: May 11, 2026; Published: June 12, 2026

Copyright: © 2026 Schmidt et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting information files.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Molybdenum cofactor (Moco) biosynthesis is catalyzed by an ancient and highly conserved multi-step pathway [1], with the general steps being highly similar among eukaryotes and prokaryotes. Initially, GTP is converted to cyclic pyranopterin monophosphate (cPMP), a reaction that, in all eukaryotes, was suggested to reside in the mitochondrial matrix [2]. Upon formation, cPMP is exported into the cytosol [3], where all subsequent Moco biosynthesis steps take place. In the second step of Moco biosynthesis, cPMP is converted to molybdopterin (MPT), the metal-free precursor of Moco. Upon formation, MPT is used as a substrate by the molybdenum insertase (Mo-insertase). Here, notable differences exist regarding the domain organization of eukaryotic and prokaryotic Mo-insertases. Pioneering work identified E. coli molybdate utilization to depend on the two separate enzymes MoeA and MogA [4, 5], while the homologous domains are fused as a single polypeptide in most eukaryotes [6]. For consistency, the prokaryotic nomenclature [4] has been retained in eukaryotes: the MoeA-homologous domain of eukaryotic Mo-insertases is referred to as the E-domain, while the MogA-homologous domain is referred to as the G-domain. Work with the plant Mo-insertase Cnx1 identified the G-domain to adenylylate MPT, yielding MPT-AMP (adenylylated MPT; [7,8], Fig 1), which is used as the substrate for the subsequent molybdate insertion reaction catalyzed by the E-domain. Both Cnx1 domains form a complex which was suggested to ensure the directed and protein-protected MPT-AMP transfer from G- to E-domain [9]. Upon binding to the E-domain, molybdate is incorporated into the MPT dithiolene moiety [10].
This reaction, specifically the initial molybdate binding and its transfer to the active site-bound MPT (dithiolene), requires a defined set of surface-exposed residues which were first identified in the eukaryotic model Mo-insertase Cnx1E from the higher plant Arabidopsis thaliana (summarized in [6]). Consistent with the essential function of these residues for Cnx1 catalytic activity, a high degree of conservation of these residues has been reported among various eukaryotes [11]. Molybdate insertion into the MPT dithiolene moiety results in the formation of adenylylated Moco (Moco-AMP, [12]). Upon formation, the phosphoanhydride bond within Moco-AMP is hydrolyzed and physiologically active Moco is released (Fig 1, [10,12]).

Fig 1. Functionalization of molybdenum.

(A) Cnx1E residues involved in molybdate binding to its initial binding site [11] are shown; directed interactions are indicated by dashed lines with distances in Ångström given above; numbering refers to the A. thaliana Mo-insertase Cnx1E (Q39054.2 [13]). (B) The Mo-insertase G-domain catalyzes the adenylylation of molybdopterin (MPT), yielding MPT-AMP. MPT-AMP and molybdate are the E-domain substrates and, most likely, a minor backbone flip within the active site results in the movement of active site-bound molybdate (A) into the MPT dithiolene moiety [11], yielding Moco-AMP. A minor rearrangement of the phosphoanhydride bond in the Moco-AMP molecule is believed to initiate its hydrolysis, resulting in the release of physiologically active Moco [10,12].

https://doi.org/10.1371/journal.pone.0350191.g001

Thus formed Moco may be transferred to Moco-dependent enzymes or contribute to the cell’s insertion-competent Moco pool (summarized in [14]). Moco is the active site prosthetic group of Moco-dependent enzymes (Mo-enzymes), which catalyze a diverse set of vitally important redox reactions (summarized, e.g., in [15]). For plants, loss of nitrate reductase is most critical, as this enzyme is essential for plant survival [1,16]. Among the mammalian Mo-enzymes, sulfite oxidase activity is most crucial, as its depletion results in severe neurological phenotypes and ultimately leads to the death of the affected individual [17,18]. In mammals, the Mo-insertase is important beyond its metabolic role: the Mo-insertase gephyrin was first identified as a receptor clustering protein in the postsynapse, not as an enzyme essential for cellular Mo-metabolism [19,20], which explains the name of the protein (gephyrin, from the Greek for bridge [21]). Specifically, gephyrin is required for the clustering of glycine and GABA type A (γ-aminobutyric acid type A) receptors in the postsynaptic membrane of inhibitory synapses [19,20]. Unlike the plant Mo-insertase Cnx1, which forms a compact, asymmetric hexameric complex [9], gephyrin is suggested to form a lateral network which is essential for receptor clustering ([22,23] and references therein).

With very few known exceptions, eukaryotic Mo-insertases are generally assumed to possess E- and G-domains fused together in one protein, while in prokaryotes both domains exist as separate entities. To shed light on the domain organization of eukaryotic Mo-insertases, in the present work we used an in silico approach to identify Mo-insertase sequences in both eukaryotes and prokaryotes, which revealed a great number of hitherto undescribed putative Mo-insertases. Whilst prokaryotic Mo-insertases were found to assemble from two separate domains, the vast majority of identified eukaryotic Mo-insertases consisted of two fused domains. Surprisingly, exceptions were found in some invertebrate species, algae and protists, where both domains exist as separate entities.

All eukaryotic Mo-insertases identified within this work were found to possess a highly conserved surface patch which forms the active site. Catalytically important residues located here were found to be strictly conserved, with few identified exceptions. Assessment of structural models of these exceptions suggests that in six out of seven cases the substitution will not impair functionality.

As an unexpected peculiarity, vertebrate-type Mo-insertases were found to possess a tremendously high degree of sequence conservation which, based on currently available knowledge, cannot be explained by their functions in Mo-metabolism and receptor clustering.

Materials and methods

Retrieval of Bait and Reference Sequences – In an initial setup, the protein sequence of Escherichia coli MoeA (WP_003903624.1) was used as query in a BLASTp search (see Fig 2 for an overview).

Fig 2. Simplified taxon sampling scheme: Essentially, the MoeA protein sequence was used as query in BLASTp searches against the NCBI database (db).

Doing so revealed numerous prokaryotic and eukaryotic phyla to contain MoeA homologs, from which the corresponding proteomes were downloaded and included in a new, combined db. *For vertebrates and invertebrates, a random sampling approach was chosen which allowed the inclusion of at least one representative proteome per phylum. **The taxonomic groups from ‘other eukaryotes’ included in the combined db are described in detail in the following section. A previously established plant db was likewise included in the combined db. If not stated otherwise, subsequent analysis was carried out exclusively with this combined db. Species names and sources of the sequence data sets per species used for BLAST-based analyses are deposited in S1 Data File. Please see the following sections for details.

https://doi.org/10.1371/journal.pone.0350191.g002

Initially, BLASTp v2.15 searches were carried out against the NCBI protein database (non-redundant protein sequences, nr) with default settings between March and June 2024. First, the top two hits (by bit score of the NCBI BLASTp search result) were identified for all bacterial phyla. Next, the annotated polypeptide sequences of the respective species were downloaded from the NCBI database [24]. In a few cases, no annotated polypeptide sequences were available for the species of the top two hits obtained from BLASTp searches. Here, the first two hits for which annotated polypeptide sequences were deposited were considered. The sequences obtained in this way were used for downstream analysis. For fungi, MoeA (WP_003903624.1) was used as query in BLASTp searches against the NCBI protein database (non-redundant protein sequences, nr; default settings) in the following taxonomic groups: Ascomycota, Basidiomycota, Dikarya incertae sedis, Blastocladiomycota, Chytridiomycota, Cryptomycota, Microsporidia, Mucoromycota, Nephridiophagidae, Olpidiomycota, Sanchytriomycota, and Zoopagomycota. According to the NCBI taxonomy browser [25], Ascomycota, Basidiomycota, Chytridiomycota, Cryptomycota, Microsporidia, Mucoromycota, Olpidiomycota, Sanchytriomycota and Zoopagomycota are fungal phyla, while Nephridiophagidae is a fungal subclass ranked as phylum and Dikarya incertae sedis are unranked species. From each taxonomic group, again the top two hits (by bit score of the BLASTp search result) were identified, and the annotated polypeptide sequences were downloaded from the NCBI database [24] and used for downstream analysis. As we sought to use plants as a control group, we decided to use an available dataset comprising the annotated polypeptide sequences of 147 plant species in total. For animals, the annotated protein sequences of well characterized model organisms, i.e. 
Mus musculus, Gallus gallus, Rattus norvegicus, Danio rerio, Xenopus laevis, Drosophila melanogaster, Caenorhabditis elegans and Homo sapiens, were downloaded from the NCBI database [24] and used for downstream analysis. Upon identification of the hits with highest sequence similarities among the analyzed jawed vertebrate (Gnathostomata) species, the number of jawed vertebrate species included in the analysis was substantially expanded to cover various species from different clades, aiming to obtain a comprehensive dataset. To this end, the animal dataset was extended based on a BLASTp v2.16 analysis against nr with the same settings in January 2025. In total, the analyzed vertebrate dataset encompassed 63 sequences from 59 genera and 63 species, respectively. The annotated polypeptide sequences of selected Gnathostomata species were downloaded from the NCBI database [24] and used for downstream analysis. As the highest Mo-insertase sequence similarity was only identified among (jawed) vertebrate species, we sought to analyze Mo-insertase sequences from invertebrate species for comparison. Compared to the initial taxon sampling, the number of invertebrate species included in the analysis was substantially expanded to cover various species from the phyla Nematoda and Mollusca and the class Insecta [25]. Annotated protein sequences of selected invertebrate species were downloaded from the NCBI database [24] and used for downstream analysis. Finally, the MoeA sequence (WP_003903624.1) was used as query in BLASTp searches targeting eukaryotic taxa not covered by the approaches described so far, thus identifying MoeA-related sequences in various protists and algae. 
From each taxonomic group [25], specifically Amoebozoa (clade), Ancyromonadida (clade), Apusozoa (class), Breviatea (class), Diphylleia (genus), Rigifilida (order), Mantamonadidae (genus), Cryptophyceae (class), Discoba (clade), Glaucocystophyceae (class), Haptophyta (phylum), Centroplasthelida (class), Malawimonadida (order), Metamonada (clade), Opisthokonta (class), Rhodophyta (phylum), Alveolata (clade), Rhizaria (clade), Stramenopiles (clade), Telonemia (genus), Kathablepharidaceae (order), Palpitomonas (genus), Viridiplantae (kingdom), Ancoracysta (genus), Picozoa (genus) and Hemimastigophora (phylum), the top two hits (by bit score of the NCBI BLASTp search result) were identified, and the respective annotated polypeptide sequences were downloaded from the NCBI database [24] and used for downstream analysis.

BLAST – The Escherichia coli MoeA sequence (WP_003903624.1) was searched with BLAST against all downloaded polypeptide sequences using the script collect_best_BLAST_hits.py with default settings [26]. The obtained sequences were subsequently used for phylogenetic analysis.
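The idea of keeping the best-scoring hit per species can be sketched as follows. This is a minimal illustration only, not the collect_best_BLAST_hits.py script itself: it assumes BLAST tabular output (outfmt 6, bit score in the twelfth column), and the function name, the example rows and the convention that the species label precedes a "|" in the subject identifier are all hypothetical.

```python
import csv
import io


def best_hit_per_species(blast_tsv, species_of):
    """Return {species: (subject_id, bit_score)}, keeping the highest bit score.

    blast_tsv  : text in BLAST tabular format (outfmt 6, 12 columns)
    species_of : hypothetical callable mapping a subject ID to a species label
    """
    best = {}
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        subject_id, bit_score = row[1], float(row[11])
        species = species_of(subject_id)
        if species not in best or bit_score > best[species][1]:
            best[species] = (subject_id, bit_score)
    return best


# Made-up example rows: query, subject, identity, length, mismatches, gaps,
# query start/end, subject start/end, e-value, bit score.
EXAMPLE = (
    "MoeA\tspA|prot1\t45.0\t400\t200\t5\t1\t400\t10\t410\t1e-80\t250.0\n"
    "MoeA\tspA|prot2\t60.0\t400\t150\t3\t1\t400\t5\t405\t1e-120\t380.0\n"
    "MoeA\tspB|prot9\t38.0\t390\t230\t8\t3\t395\t1\t390\t1e-50\t170.0\n"
)
hits = best_hit_per_species(EXAMPLE, lambda sid: sid.split("|")[0])
```

Ranking by bit score mirrors the criterion used for the NCBI searches described above; whether the published script uses exactly this criterion is not stated here.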

Phylogenetic analysis – Generation of alignments was carried out using MAFFT v7.526 [27]. Phylogenetic trees were constructed with IQ-TREE2 v2.3.4 [28] using the maximum likelihood estimation and 1000 bootstrap replicates. The model used was LG + R8 (MoeA and ADH-trees). For the MOCS2B tree, the model used was JTTDCMut + R7. ITOL v6 was used to visualize the phylogenetic trees [29]. Patristic distances within clades were calculated using the Python script branch_length_comparison.py (https://github.com/bpucker/molyb) based on the dendropy module [30].
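For readers unfamiliar with the measure, the patristic distance between two tips is the sum of branch lengths along the path connecting them in the tree. The following sketch computes the mean pairwise patristic distance within a group of tips on a toy tree. It is not the branch_length_comparison.py script and does not use dendropy; the tree encoding (a child-to-parent dictionary), the function names and the branch lengths are assumptions made for illustration, and branch lengths are assumed to be non-negative.

```python
from itertools import combinations


def path_to_root(node, parent):
    """Return {ancestor: summed branch length from node}, including node itself."""
    dist, out = 0.0, {node: 0.0}
    while node in parent:
        node_above, length = parent[node]
        dist += length
        out[node_above] = dist
        node = node_above
    return out


def patristic(a, b, parent):
    """Sum of branch lengths on the path between tips a and b."""
    pa, pb = path_to_root(a, parent), path_to_root(b, parent)
    # With non-negative branch lengths, the most recent common ancestor
    # is the shared ancestor that minimizes the summed distance.
    return min(pa[n] + pb[n] for n in pa if n in pb)


def mean_patristic(tips, parent):
    """Average patristic distance over all pairs of the given tips."""
    pairs = list(combinations(tips, 2))
    return sum(patristic(a, b, parent) for a, b in pairs) / len(pairs)


# Toy tree ((A:0.2,B:0.3):0.1,C:0.5); encoded as child -> (parent, branch length).
TOY = {"A": ("n1", 0.2), "B": ("n1", 0.3), "n1": ("root", 0.1), "C": ("root", 0.5)}
mean_abc = mean_patristic(["A", "B", "C"], TOY)
```

On the toy tree the three pairwise distances are 0.5 (A to B), 0.8 (A to C) and 0.9 (B to C), so a tightly conserved clade shows up as a small within-clade mean.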

Elimination of contamination and splice variants in the dataset – After initial tree building, protein sequences from the same species, possessing highest sequence similarities, have been manually inspected for obvious annotation errors (eukaryotic and prokaryotic sequences) and splice variants (eukaryotic sequences). Given that protein sequences originating from multiple splice forms were identified, subsequent analysis was carried out with a single sequence per species. Here the top hit obtained from the BLAST using the specified script (see above) was used for further analysis.

Afterwards, the dataset was inspected manually a second time, to identify species harboring more than one protein sequence as identified by BLASTp analysis. Identified sequences were then checked for their origin to rule out contaminations using BLASTp. Thus identified contaminants were tabulated in S2 Table.

When more than one sequence per species remained in the data set after application of the regime described above, all sequences from a single species were routinely aligned to the Escherichia coli MoeA protein sequence (WP_003903624.1) and the Homo sapiens gephyrin sequence (NP_001019389.1) using MultAlin and the BLOSUM62-12-2 matrix [31]. Sequences were considered to be bona fide Mo-insertase E-domains if they aligned to both the E. coli MoeA and the gephyrin E-domain. Application of this final control step revealed a few sequences to be Mo-insertase G-domain-like proteins (tabulated in S2 Table).

Mo-insertase domain classification – In order to identify E- and G-domain comprising regions in Mo-insertase fusion proteins, the following workflow was carried out: upon initial identification, obtained sequences were routinely aligned with the G-domain encoding sequence of gephyrin ([22], invertebrate-type Mo-insertases), Nit-9 ([32], fungal-type Mo-insertases) and Cnx1 ([33], plant-type Mo-insertases). Owing to their very high sequence conservation, Gnathostome-type Mo-insertases were generally assumed to possess the domain organization identified for mammalian gephyrin [22].
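Once the E- and G-domain regions of a polypeptide have been located, assigning a domain order is a simple comparison of coordinates. The sketch below illustrates this step only. The function name, the coordinate representation and the example values are hypothetical; the two fusion orders (N-terminal G-domain followed by E-domain, and N-terminal E-domain followed by G-domain) are those reported for animal/fungi-type and plant-type Mo-insertases in this work.

```python
def classify_domain_order(e_region, g_region):
    """Classify one polypeptide by the order of its E- and G-domain regions.

    e_region, g_region : (start, end) residue coordinates, or None if the
    domain was not found in this polypeptide (hypothetical representation).
    """
    if e_region is None and g_region is None:
        return "no Mo-insertase domain found"
    if g_region is None:
        return "separate E-domain"
    if e_region is None:
        return "separate G-domain"
    if g_region[1] < e_region[0]:
        return "G-E fusion (animal/fungi-type order)"
    if e_region[1] < g_region[0]:
        return "E-G fusion (plant-type order)"
    return "overlapping regions, inspect manually"


# Made-up coordinates for illustration.
animal_like = classify_domain_order(e_region=(320, 730), g_region=(15, 180))
plant_like = classify_domain_order(e_region=(10, 440), g_region=(465, 660))
split = classify_domain_order(e_region=(5, 410), g_region=None)
```

A rule of this kind cannot replace the manual inspection of pairwise alignments described here; it only makes the resulting categories explicit.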

Visualization of conservation grades – To visualize the degree of Mo-insertase surface conservation, the degree of amino acid conservation of vertebrate-, invertebrate- and plant-type Mo-insertases (see the Results section for details) was calculated as a percentage (https://github.com/bpucker/molyb). The conservation scores were subsequently written to the B-factor column of the A. thaliana Mo-insertase (PDB code: 6Q32) and the R. norvegicus Mo-insertase A-chain (PDB code: 2FU3), respectively. The modified R. norvegicus Mo-insertase A-chain was subsequently duplicated in PyMol [34] to replace the B-chain in the figures shown. For A. thaliana, the second monomer was built by crystallographic symmetry. All structure visualization was carried out using PyMol [34].
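The two steps, scoring each position and storing the score in the B-factor field, can be sketched as follows. This is an illustration under stated assumptions rather than the code from the repository above: conservation is defined here as the percentage of aligned sequences that share the reference residue at a position, which may differ from the definition used in this work, and both function names are hypothetical. The B-factor field occupies columns 61 to 66 of ATOM and HETATM records in the PDB format.

```python
def column_conservation(alignment, ref_index=0):
    """Percent of sequences sharing the reference residue, one value per
    ungapped reference position (assumed definition of conservation)."""
    ref = alignment[ref_index]
    scores = []
    for col, ref_res in enumerate(ref):
        if ref_res == "-":
            continue  # position absent from the reference structure
        same = sum(1 for seq in alignment if seq[col] == ref_res)
        scores.append(100.0 * same / len(alignment))
    return scores


def set_bfactor(pdb_line, value):
    """Overwrite the B-factor field (columns 61-66) of an ATOM/HETATM record."""
    if not pdb_line.startswith(("ATOM", "HETATM")):
        return pdb_line  # leave all other record types untouched
    padded = pdb_line.rstrip("\n").ljust(80)
    return padded[:60] + f"{value:6.2f}" + padded[66:]


# Made-up alignment of equal-length rows; the first row is the reference.
ALIGNMENT = ["MKT-A", "MKS-A", "MRTGA", "MKTGA"]
scores = column_conservation(ALIGNMENT)
```

In a full workflow each score would be assigned to every atom of the corresponding residue, after which a viewer such as PyMol can color the surface by B-factor.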

3D structural prediction of proteins – Prediction of monomeric and dimeric Cnx1E models was performed using AlphaFold2 [35] and AlphaFold3 [36]. Modelling with AlphaFold3 was used to generate wt Cnx1E and active site variants S328T, S400T, K297R, K297L, K294R, and R369K. The G296A substitution was predicted to induce backbone rearrangements that could impact functionality (as described in [11]). To reveal any impact, the two deposited conformations of the G296-K297 segment from the Cnx1E crystal structure (PDB ID: 6ETF) were extracted and used as templates for structure prediction of the G296A variant using AlphaFold2 with default parameters [35]. Only the ‘relaxed’ backbone conformation present in 6ETF [11] yielded a viable template for modeling G296A. For structural comparison, Moco-AMP (PDB ID: 6Q32) [12] or molybdate (PDB ID: 6ETF) was superimposed from the solved Cnx1E structures onto the predicted models to visualize substrate/product positioning within the active site.

Analysis of catalytically important residues – Vitally important Mo-insertase active site residues were recently summarized for eukaryotic Mo-insertases [6]. The conservation of active site residues among the Mo-insertases reported here was checked using KIPEs v3 [37] (https://github.com/bpucker/KIPEs); the bait input sequences and the analyzed residues are listed in Table 1.

Table 1. Reference sequences, residues and their positions for the inspection of active site amino acid residues in Mo-insertase candidates.

https://doi.org/10.1371/journal.pone.0350191.t001
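The principle behind such a residue check, finding the positional homolog of each reference residue in a candidate sequence, can be illustrated with a pairwise alignment. The sketch below is not the KIPEs implementation or its interface; the function names and the toy sequences are hypothetical, and the two aligned strings are assumed to have equal length with "-" marking gaps.

```python
def residues_at_reference_positions(ref_aln, cand_aln, positions):
    """Map 1-based reference positions to the aligned candidate residues.

    ref_aln, cand_aln : the two rows of a pairwise alignment ('-' = gap)
    positions         : 1-based positions in the ungapped reference sequence
    """
    wanted = set(positions)
    found = {}
    ref_pos = 0
    for ref_res, cand_res in zip(ref_aln, cand_aln):
        if ref_res == "-":
            continue  # insertion in the candidate, no reference position
        ref_pos += 1
        if ref_pos in wanted:
            found[ref_pos] = cand_res  # '-' means the candidate has a gap here
    return found


def conserved_fraction(ref_aln, cand_aln, positions):
    """Fraction of the listed reference residues that are identical in the candidate."""
    ref_seq = ref_aln.replace("-", "")
    found = residues_at_reference_positions(ref_aln, cand_aln, positions)
    matches = sum(1 for p in positions if found.get(p) == ref_seq[p - 1])
    return matches / len(positions)


# Toy alignment; reference positions 2, 3, 5 and 6 stand in for active site residues.
REF_ALN = "MK-TSGR"
CAND_ALN = "MRATS-R"
homologs = residues_at_reference_positions(REF_ALN, CAND_ALN, [2, 3, 5, 6])
fraction = conserved_fraction(REF_ALN, CAND_ALN, [2, 3, 5, 6])
```

In the toy case, position 2 carries a substitution (K to R) and position 5 falls into an alignment gap, so two of the four inspected positions count as conserved.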

Analysis of other conserved proteins (MOCS2B and ADH) – The Homo sapiens ADH (AAA19002.1) and MOCS2B (NP_004522.1) sequences were searched with BLAST against all downloaded polypeptide sequences using the script collect_best_BLAST_hits.py [26] and default settings. The top hit obtained from the BLAST was used for further analysis. The obtained sequences were subsequently used for the generation of alignments, using MAFFT v7 [27]. Phylogenetic trees were constructed with IQ-TREE2 [28] using the maximum likelihood estimation and 1000 bootstrap replicates. The model finder integrated in IQ-TREE2 identified LG + R8 as the best model. ITOL was used to visualize the phylogenetic trees [29].

Results

We used the E. coli MoeA sequence for BLASTp searches, which allows for the identification of putative Mo-insertases from both eukaryotes and prokaryotes. The obtained sequences were used to create a phylogenetic tree comprising a total of 327 Mo-insertases from 289 genera (Figs 3 and S1–S5, S5 and S6 Data Files). For better readability, we will use the term Mo-insertase instead of putative Mo-insertase throughout the remainder of this work.

Fig 3. Simplified phylogenetic tree obtained from maximum likelihood analysis.

The Mo-insertase families are color-coded as follows: prokaryotic-type Mo-insertases (brownish), fungi-type Mo-insertases including protists (bluish), animal-type Mo-insertases (reddish), plant-type Mo-insertases including protists (greenish). For the detailed tree, please see S1–S5 Figs as indicated in the figure. For the sake of clarity, the prokaryotic clade has been collapsed. The dashed black lines group together Mo-insertases of different species based on their domain organization inferred within this work (see S2–S4 Data Files). As references, the domain organization of the Mo-insertases from Escherichia coli (MoeA: NCBI Reference Sequence NP_415348.1, MogA: NCBI Reference Sequence NP_414550.1), Drosophila melanogaster (NCBI Reference Sequence NP_726659.1), Rattus norvegicus (both annotations according to [13]), Caenorhabditis elegans (CAA90069 and CCD74267, annotation according to [38]), Aspergillus nidulans (annotation according to [32]), Chlamydomonas reinhardtii (E-domain: GenBank entry DQ311646, G-domain: GenBank entry DQ311645.1, E-domain, [39]) and the model plant Arabidopsis thaliana (annotation according to [33]) is shown schematically. The dual function of the Mo-insertase gephyrin from vertebrates (i.e. clustering of γ-aminobutyric acid type A and glycine receptors in the postsynapse and functionalization of molybdenum [19, 20]) is indicated (see Fig 1 for a detailed representation of molybdate-interacting residues). The number of sequences possessing the indicated domain organization is given within the respective colored box. Asterisks indicate species which possess a differing domain organization, summarized in S1 Table and S6 Fig. The tree bootstrap values are shown in S5 Data File; the alignment file is deposited as S6 Data File.

https://doi.org/10.1371/journal.pone.0350191.g003

As expected, Mo-insertases from different taxonomic lineages group into clades, which we annotated as the animal-, fungi- and plant-type Mo-insertase families, respectively. In contrast, for prokaryotes, the diversity of the identified sequences did not result in the formation of family-level clades (S1 Fig).

The animal-type Mo-insertase family comprises 84 members, which in most cases possess an N-terminal G-domain fused to a C-terminal E-domain (Fig 3). Interestingly, exceptions were identified for the Mo-insertases from invertebrates, namely Trichuris trichiura (T. trichiura), Caenorhabditis elegans (C. elegans) and Pinctada imbricata (P. imbricata) (S1 Table). In these organisms, the E- and G-domains were found to occur separately.

The fungi-type Mo-insertase family comprises a total of 26 members, including unicellular organisms from the SAR clade (i.e. Tribonema minus, Symbiodinium necroappetens and Reticulomyxa filosa) and the zooflagellate Thecamonas trahens. We found the vast majority of fungal Mo-insertases identified in this work to possess an N-terminal G-domain and a C-terminal E-domain (Figs 3 and S2). However, a few exceptions were identified, which are summarized in S1 Table and S6 Fig.

The vast majority of plant Mo-insertases belong to members of the phylum Streptophyta and possess an N-terminal E-domain fused to a C-terminal G-domain (see Fig 3 for comparison). A few Streptophyta were found to harbor a separate E-domain (S1 Table); however, no G-domain was identified in these cases, which may be best explained by provisional and/or incomplete annotation in the employed databases. Several unicellular organisms summarized in S1 Table are part of the plant-type Mo-insertase group, including the alga C. reinhardtii, for which the E- and G-domains were reported to be expressed separately ([39], see Fig 3 for comparison). We found this to hold true for all unicellular organisms included in the plant-type Mo-insertase group for which a G-domain sequence could be identified (S1 Table).

Having identified previously undescribed putative Mo-insertases in animals, fungi and plants, we next sought to confirm that these are indeed bona fide Mo-insertases. To this end, positional homologs of the critical A. thaliana active site residues [6] were identified as described previously (detailed in the Materials and methods section and [37]; Tables 1 and 2).

Table 2. Conserved active site residues of the eukaryotic Mo-insertase E-domain. The catalytically important residues of A. thaliana Cnx1E (Cnx1 sequence NP_197599.1) have been tabulated. Positional homologs of invertebrates, vertebrates, fungi and plants have been identified and the number of identical residues as compared to the Cnx1 active site residues has been calculated as rounded percentage. Non-conserved residues are likewise included and the rounded frequency (percentage) with which these occur is given. In total, 21 invertebrate-type, 62 vertebrate-type, 26 fungi-type, and 155 plant-type Mo-insertases were considered. X = alignment gap.

https://doi.org/10.1371/journal.pone.0350191.t002

To elucidate the potential impact of the identified non-conserved residues on Mo-insertase functionality, we used an in silico approach in which the structures of the respective Cnx1E variants were predicted using AlphaFold [35,36] (Fig 4). Modelling of all variants was possible (see also S9 Fig for comparison); however, for variant G296A, only the ‘relaxed’ backbone conformation present in 6ETF [11] yielded a viable template for modeling.

Fig 4. AlphaFold-Based modelling of Mo-insertase active site variants.

Cnx1 variants possessing the active site variations identified and tabulated in Table 2 were modelled using AlphaFold (as specified in the material and methods section). View of the active site structures of the modelled Cnx1 variants. Cnx1 wildtype (WT) and variants K297L/R, S328T, S400T: Moco-AMP is shown derived from the structural superimposition with protein structure 6Q32 [12]; Cnx1 variants G296A, K294R and R369K: Molybdate is shown derived from the structural superimposition of modelled variants with protein structure 6ETF [11]. The modelled active site structures are shown in ribbon representation; exchanged residues are depicted as sticks and colored blue. The outlines of the respective positional homologous wildtype residues were derived from the structural superimposition of modelled variants with the protein structure 6ETF [11] and are shown superimposed. Moco-AMP and molybdate are shown in ball and stick representation. Dashed lines indicate directed interactions between modelled and non-modelled active site residues with Moco-AMP and/or molybdate respectively, with distances given in Ångström (Å). Brackets indicate distances between modelled residues, while distance measurements involving wildtype residues are given without brackets.

https://doi.org/10.1371/journal.pone.0350191.g004

Within the animal-type Mo-insertase family, a large number of sequences group together extremely closely (detailed in Fig 5). Interestingly, these are all assigned to jawed vertebrates (Gnathostomata), where the Mo-insertase (named gephyrin here [21]) is known to possess a neuronal function in addition to its function in Mo-metabolism.

Fig 5. Phylogenetic distance tree of animal-type Mo-insertases.

S3 Fig has been modified and is shown here. Mo-insertases identified in invertebrates possess sequence deviations which result in an arrangement within the phylogenetic tree that allows the assignment of a phylogenetic affiliation, i.e. Nematoda, Insecta and Mollusca. Jawed vertebrate Mo-insertases do not arrange group-wise, which prevents the assignment of any phylogenetic affiliation. Clades harboring plant and fungal orthologs have been omitted from this figure due to space constraints (see supplements for details). Numbers refer to the number of sequences forming the respective groups.

https://doi.org/10.1371/journal.pone.0350191.g005

We are fully aware that this extremely close grouping may, at first sight, suggest corrupted database entries or subsequent processing errors. While this might appear to be an artifact, all our investigations support the validity of these sequence records.

To exclude any influence of the phylogenetic age of the vertebrate trait on the degree of gephyrin’s sequence conservation, we next analyzed the patristic distances between species in the taxon Gnathostomata (jawed vertebrates, Fig 6) and compared these with values observed for taxa of similar age, documenting almost no changes in gephyrin during Gnathostomata evolution. To further substantiate this finding, we constructed two additional protein phylogenies as controls, one for the Moco biosynthesis enzyme MOCS2B ([40], S7 Data File) and one for alcohol dehydrogenase (ADH; [41], S8 Data File), and analyzed the respective patristic distances (S7 Fig). As can be seen from S7 Fig, the patristic distances of ADH sequences in the taxon Gnathostomata (1.67) are higher than those of gephyrin (0.04). The MOCS2B protein phylogeny revealed a more recent split, as two large metazoan protein families, i.e. Amphibia/Dinosauria and Mammalia, were identified, while no Gnathostomata trait as observed for gephyrin and ADH was found. Comparison of the patristic distances of both families again revealed higher values than for gephyrin, i.e. 0.94 for Amphibia/Dinosauria and 0.45 for Mammalia. Hence, even within these taxa, which are younger (ca. 300 MYA for Amphibia/Dinosauria and ca. 180 MYA for Mammalia) than Gnathostomata (462 MYA, Fig 6), a markedly higher patristic distance was identified for an enzyme from the same biosynthesis pathway.

Fig 6. Average patristic distance between Mo-insertases in different taxa.

(A) Patristic distance of the Mo-insertases within indicated taxa. The estimated age of the compared taxa is indicated (mya = million years ago). Colors correlate with the patristic distance as indicated within the figure. (B) Simplified timetree visualizing the phylogenetic relationship of selected eukaryotic non-plant taxa. The numbers specify the estimated taxon-ages. Asterisks indicate the taxa compared in (A). (A) and (B): Estimates of taxon ages were extracted from the evolutionary time tree of life [42,43].

https://doi.org/10.1371/journal.pone.0350191.g006

In summary, our patristic distance analysis documents that the high degree of gephyrin’s sequence conservation is not a result of the phylogenetic age of the vertebrate trait.
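The size of the difference can be made explicit by expressing the control values as multiples of the gephyrin value. The snippet below uses only the patristic distances reported above; the variable names are illustrative, and the ratios are a descriptive aid rather than a statistical test, since the compared taxa differ in age and sampling.

```python
# Patristic distances as reported in the text.
GEPHYRIN_GNATHOSTOMATA = 0.04
CONTROLS = {
    "ADH, Gnathostomata": 1.67,
    "MOCS2B, Amphibia/Dinosauria": 0.94,
    "MOCS2B, Mammalia": 0.45,
}

# Express each control distance as a multiple of the gephyrin distance.
fold = {name: value / GEPHYRIN_GNATHOSTOMATA for name, value in CONTROLS.items()}

for name, ratio in fold.items():
    print(f"{name}: {ratio:.1f}-fold the gephyrin value")
```

The control distances are thus roughly 42, 23.5 and 11 times the gephyrin value, even though the two MOCS2B families are younger than Gnathostomata.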

Next, known residues that are important for the two functions of gephyrin (i.e. receptor clustering and Moco synthesis) were identified and plotted onto the protein surface. As can be seen from Fig 7, a substantial part, but not all, of the protein surface is associated with these functions, while the overall surface conservation was found to be very high (S8 Fig).

Fig 7. Functionally relevant residues of the gephyrin E-domain.

Surface representation of the mammalian Mo-insertase E-domain (PDB code: 2FU3). (A) Residues involved in receptor binding (according to [44]) are shown in blue. (B) Interfacing residues of the interaction 2 model of the plant Cnx1 Mo-insertase complex [9] are shown in green. (C) Residues involved in molybdenum functionalization of the plant Mo-insertase Cnx1 are shown in red (recently summarized in [6]).

https://doi.org/10.1371/journal.pone.0350191.g007

Discussion

In this work we used an in silico approach to identify Mo-insertases across the tree of life. In total, our work revealed 327 Mo-insertases from 289 eukaryotic and prokaryotic genera. For prokaryotes, the diversity of the identified sequences did not result in family-level clades, which may be best explained by the prevalent horizontal gene transfer in this group [45], leading to conflicting gene histories [46]. However, as known for the model Mo-insertase MoeA from E. coli, all prokaryotic Mo-insertases identified assemble from separate E- and G-domains.

Regarding domain organization, our work classified all novel eukaryotic Mo-insertases either as plant-type (N-terminal E-domain, C-terminal G-domain) or as animal/fungi-type Mo-insertases (N-terminal G-domain, C-terminal E-domain). A notable peculiarity are the Mo-insertases identified in some protists, algae and invertebrates, which, like those of prokaryotes, possess the E- and G-domain as separate entities. This classification is documented by 190 manually performed and inspected pairwise sequence alignments (see S2–S4 Data Files) and, where indicated, by complementary in silico structure predictions (see S1 Table).
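The classification rule stated above can be written down as a short Python sketch. It assumes that the residue ranges of the E- and G-domain within one polypeptide are already known, for example from pairwise alignments; the function name, the return labels and the handling of overlapping ranges are our own illustrative choices and do not describe the procedure used in this work.

```python
def classify_domain_arrangement(e_domain=None, g_domain=None):
    """Classify a Mo-insertase by the order of its E- and G-domain.

    Each argument is a (first_residue, last_residue) tuple, or None if
    that domain was not found within the annotated sequence.
    """
    if e_domain is None or g_domain is None:
        # Only one domain (or none) in this polypeptide.
        return "separate or incomplete"
    if e_domain[1] < g_domain[0]:
        return "plant-type"          # N-terminal E-domain, C-terminal G-domain
    if g_domain[1] < e_domain[0]:
        return "animal/fungi-type"   # N-terminal G-domain, C-terminal E-domain
    # Overlapping ranges cannot be ordered and need manual inspection.
    return "unresolved"


if __name__ == "__main__":
    # Ranges reported for XP_046338115.2 (Haliotis rufescens) in S3 Data File.
    print(classify_domain_arrangement(e_domain=(302, 719), g_domain=(13, 175)))
    # Hypothetical ranges, for illustration only.
    print(classify_domain_arrangement(e_domain=(1, 450), g_domain=(480, 670)))
```

With the ranges reported for the Haliotis rufescens sequence (G-domain 13–175, E-domain 302–719), the rule yields the animal/fungi-type arrangement.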

Very recently, Megrian and colleagues [47] described the fusion of E- and G-domain to be present in all studied animal-type Mo-insertases and suggested that this fusion is mandatory for gephyrin’s neuronal function (i.e. receptor clustering in the postsynapse of inhibitory neurons). Our study revealed that in some animals (i.e. C. elegans (see also [38]), T. trichiura and the mollusk P. imbricata, see S1 Table for comparison) the Mo-insertase E- and G-domains exist as separate entities; hence the tendency of E- and G-domain to occur fused in animals requires careful review. As the invertebrate-type Mo-insertases reported within this work essentially lack the conserved E-domain residues required for receptor interaction (S10 Fig), we reason that any vertebrate-like neuronal function of invertebrate-type Mo-insertases is unlikely. This is consistent with the lack of critical residues for receptor clustering in C. elegans and Drosophila Mo-insertases [48].

Why do eukaryotes possess inconsistent E-G domain arrangements? Unlike eukaryotes, all prokaryotes studied thus far possess the identical domain arrangement: E- and G-domains are separate entities, documenting that domain fusion is not mandatory for Mo-insertase functionality in prokaryotes. Interestingly, when recombinant plant Cnx1 E- and G-domains are expressed separately, the MPT-AMP transfer from G- to E-domain as well as the subsequent metal insertion reaction occur under fully defined in vitro conditions when both enzymes are co-incubated [10,12,33]. Further, the Chlamydomonas reinhardtii E-domain was found to complement an E. coli moeA mutant strain [49]. These findings indicate that i) eukaryotic Mo-insertase functionality does not strictly depend on domain fusion and ii) metabolite transfer from G- to E-domain likely follows a common principle in eukaryotes and prokaryotes (see also [49]). The reason for domain fusion in eukaryotic Mo-insertases may be best explained by assuming that it allows for a directed [9] and efficient MPT-AMP transfer from G- to E-domain [50], which may be beneficial for the vast majority of eukaryotic species reported within this work (see also [47] for comparison). It is not clear when the E-G domain fusion occurred; however, it appears to have occurred independently at least twice during the evolution of eukaryotes, as documented by the finding that plants possess an inverted domain orientation compared to animals and fungi [32,51]. While extensive data sets support the findings regarding domain fusion, considerably fewer eukaryotic sequences were identified that encode separate E- and G-domains. We suggest that the separate E- and G-domains identified in C. elegans, T. trichiura and the mollusk P. imbricata (see also S1 Table) may result from the division of the fused domains during speciation. The other possible explanation, an independent domain fusion during speciation of the various animal species, appears unlikely. Taken together, the current view on eukaryotic Mo-insertase domain arrangement may be refined, as domain fusion is evidently not essential for either in vivo or in vitro functionality. However, the findings about domain fusion prevalence and evolutionary conservation are based on the sequence data that is currently available and may be refined as more genome sequences and derived polypeptide sequences (particularly from invertebrates) become available.

According to our analysis, a number of catalytically significant residues of the plant-, fungal- and invertebrate-type Mo-insertases are conserved across species, which is in line with their suggested functional roles. For the exceptions identified, we carried out AlphaFold-based modelling followed by structural plausibility assessments to reveal any impact of these residues on functionality. This analysis (see Figs 4 and S9) identified alterations S400T and S328T as most likely not impacting functionality, as the Ser OH-group is essentially ‘replaced’ by the threonine OH-group within the active site and no obvious impact of threonine’s methyl group on active site chemistry could be deduced. Further, our assessment of the respective modelled structures suggests that the exchange of Lys 297 to Arg or of Lys 297 to Leu will not have any impact on the interaction with the Mo-center, as it is the amide nitrogen atom which is involved in a directed interaction here [11]. However, in the Cnx1E variant S269D D274S (co-crystallized with Moco-AMP, [12]), the side chain of K297 is involved in a single, directed interaction with the pterin part of Moco-AMP. For variant K297R our modelling indicates that such a directed interaction will likely occur, while it suggests that this will not be the case for variant K297L. Evidently, however, this does not impair functionality, as all vertebrate-type Mo-insertases and 95% of the invertebrate-type Mo-insertases reported here (Table 2) possess the K297L variation. Cnx1E residues K294 and R369 are involved in (initial) molybdate binding to the active site [11,12]. Our assessment of the respective modelled structures suggests that the exchange of Lys 294 to Arg or of Arg 369 to Lys will very likely have no impact on the (positive) charge of the binding site, which leads us to assume that binding of molybdate is most likely not impaired here. Our assessment of the modelled structure(s) suggests that solely the G296A variant is likely to possess an impaired functionality: the Cnx1E G296-K297 segment exists in two conformations [11], of which variant G296A cannot adopt the ‘tensed’ conformation, as documented by the finding that modelling with the respective template was not possible. As (initial) molybdate binding is linked to the ‘tensed’ conformation [11], we conclude that G296A is impaired in molybdate binding, which most likely will preclude the subsequent molybdate insertion. Taken together, the current view on eukaryotic Mo-insertase functionality may be refined, as our study suggests that six out of seven functionally essential residues could be replaced by (type conserved) residues, putatively without impairing functionality.
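As a first-pass complement to such structure-based assessments, the substitutions discussed above can be screened for whether the exchanged residue stays within the same side-chain class. The following Python sketch is a heuristic only: the four-class grouping is a simplified assumption (other groupings are equally defensible), the function names are hypothetical, and the result does not replace the modelling described in the text.

```python
import re

# Assumed, simplified side-chain classes; one of several common groupings.
SIDE_CHAIN_CLASS = {}
for residues, label in [
    ("KRH", "basic"),
    ("DE", "acidic"),
    ("STNQCY", "polar"),
    ("GAVLIMFWP", "nonpolar"),
]:
    for amino_acid in residues:
        SIDE_CHAIN_CLASS[amino_acid] = label

# One-letter codes of the 20 standard amino acids.
_AA = "ACDEFGHIKLMNPQRSTVWY"
_PATTERN = re.compile(rf"([{_AA}])(\d+)([{_AA}])")


def parse_substitution(code):
    """Split a code such as 'K297R' into (wild type, position, variant)."""
    match = _PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {code!r}")
    wild_type, position, variant = match.groups()
    return wild_type, int(position), variant


def same_class(code):
    """True if wild-type and variant residue share the assumed side-chain class."""
    wild_type, _, variant = parse_substitution(code)
    return SIDE_CHAIN_CLASS[wild_type] == SIDE_CHAIN_CLASS[variant]


if __name__ == "__main__":
    for code in ["S400T", "S328T", "K297R", "K297L", "K294R", "R369K", "G296A"]:
        print(code, same_class(code))
```

Under this grouping G296A stays within one class although it is the variant predicted to be impaired, whereas K297L changes class although it is widespread among animal Mo-insertases. This illustrates why class membership alone is not decisive and why the structural assessment is needed.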

Of all Mo-insertases identified and characterized thus far, vertebrate-type Mo-insertases possess an extraordinarily high degree of sequence conservation (see also S8 Fig). As other vertebrate enzymes such as ADH and MOCS2B possess a considerably lower degree of sequence conservation (documented by the respective patristic distances, see Fig 6 and S7 Fig), we conclude that the evolutionary rate of the bifunctional gephyrin is lower. We note that gephyrin residues involved in its metabolic (Mo-functionalization and complex formation) and receptor-clustering functions locate nearly exclusively to the ‘front-side’ of the enzyme, while hardly any known functionally relevant residues locate to the ‘back-side’ (Fig 8).

Fig 8. Functionally relevant residues of the gephyrin E-domain.

All functionally relevant residues (clustering, interaction, synthesis) shown individually in Fig 7 are shown in a combined image here. Please see the legend of Fig 7 for details. Residues shown in yellow result from an overlap of the functionally relevant residues.

https://doi.org/10.1371/journal.pone.0350191.g008

We speculate that the remaining conserved surface area (see also S8 Fig) is crucial for other known evolutionarily conserved protein interactions and/or as yet unidentified post-translational modifications (summarized, e.g., in [52]). However, as to the best of our knowledge no functionally relevant gephyrin ‘backside’ residue is known so far, systematic mutagenesis work will be required to assign function(s) to gephyrin’s ‘backside’ residues.

Supporting information

S1 Fig. Partial representation of the phylogenetic distance tree obtained from maximum likelihood: Prokaryotes.

Species name and the accession number of the identified MoeA homologous sequence are given next to the branches. *The eukaryotic E-domain sequence XP_682307.1 from Aspergillus nidulans was identified to group better with prokaryotic than eukaryotic (fungal) sequences.

https://doi.org/10.1371/journal.pone.0350191.s001

(PDF)

S2 Fig. Partial representation of the phylogenetic distance tree obtained from maximum likelihood: Fungi.

Species name and the accession number of the identified MoeA homologous sequence are given next to the branches. *The algal E-domain sequence KAG5179608.1 from Tribonema minus was identified to group better with fungal than plant sequences.

https://doi.org/10.1371/journal.pone.0350191.s002

(PDF)

S3 Fig. Partial representation of the phylogenetic distance tree obtained from maximum likelihood: Metazoa.

Species name and the accession number of the identified MoeA homologous sequence are given next to the branches.

https://doi.org/10.1371/journal.pone.0350191.s003

(PDF)

S4 Fig. Partial representation of the phylogenetic distance tree obtained from maximum likelihood: Plants I of II.

Species name and the accession number of the identified MoeA homologous sequence are given next to the branches.

https://doi.org/10.1371/journal.pone.0350191.s004

(PDF)

S5 Fig. Partial representation of the phylogenetic distance tree obtained from maximum likelihood: Plants II of II.

Species name and the accession number of the identified MoeA homologous sequence are given next to the branches.

https://doi.org/10.1371/journal.pone.0350191.s005

(PDF)

S6 Fig. Schematic representation of identified Mo-insertase domains in Aspergillus nidulans and Geotrichum candidum.

Mo-insertase domains identified in A. nidulans (A) and G. candidum (B). (A) and (B): domains were annotated according to the A. nidulans CNXE domain annotation (Probst, C., et al., Genetic characterization of the Neurospora crassa molybdenum cofactor biosynthesis. Fungal Genet Biol, 2014. 66: p. 69–78.). Our initial BLASTp approach (see the materials and methods section for details) identified proteins XP_682307.1 and XP_661382 (A. nidulans) and KAF5108556 (G. candidum). (A) The A. nidulans full length Mo-insertase domain organization is shown according to (Probst, C., et al., Genetic characterization of the Neurospora crassa molybdenum cofactor biosynthesis. Fungal Genet Biol, 2014. 66: p. 69–78.). The schematic domain organization shown was taken from Fig 3. (B) The G. candidum full length Mo-insertase (CDO51814.1) was identified by a BLASTp search using the NCBI protein database (non-redundant protein sequences (nr), default settings) in the taxon G. candidum and using full length A. nidulans Mo-insertase (AAK83300.1) as query. The sequence CDO51814.1 is not included in the phylogenetic trees constructed within this work.

https://doi.org/10.1371/journal.pone.0350191.s006

(PDF)

S7 Fig. Patristic distance of MOCS2B and alcohol dehydrogenase from different taxa.

Patristic distance of MOCS2B and alcohol dehydrogenase from the indicated taxa. For comparison, the patristic distance determined for gephyrin (GEPH) in the taxon Gnathostomata is given. The estimated age when the compared taxa emerged is indicated (MYA = million years ago). For calculation of patristic distances, the taxa Amphibia (320 MYA) and Sauria (280 MYA) were combined. Estimates of taxon ages were extracted from the evolutionary time tree of life (Kumar, S., et al., TimeTree 5: An Expanded Resource for Species Divergence Times. Mol Biol Evol, 2022. 39(8) and Kumar, S., et al., TimeTree: A Resource for Timelines, Timetrees, and Divergence Times. Mol Biol Evol, 2017. 34(7): p. 1812–1819.).

https://doi.org/10.1371/journal.pone.0350191.s007

(PDF)

S8 Fig. Conserved residues of eukaryotic Mo-insertases.

Surface representations of the R. norvegicus (PDB code: 2FU3, A and B) and A. thaliana (PDB code: 6Q32, C) Mo-insertase E-domain. Colors indicate the degree of conservation of surface exposed amino acids amongst members of the Invertebrate-type Mo-insertase (A), the Gnathostome-type Mo-insertase (B) and the Plant-type Mo-insertase (C). The color of the protein surface correlates with the degree of conservation as indicated. The active site as identified for the plant-type Mo-insertase Cnx1 (Probst, C., et al., Mechanism of molybdate insertion into pterin-based molybdenum cofactors. Nat Chem, 2021. 13(8): p. 758–765.) is encircled.

https://doi.org/10.1371/journal.pone.0350191.s008

(PDF)

S9 Fig. AlphaFold-Based modelling of Mo-insertase active site variants.

Partial representation of modelled variants (blue) superimposed with the wildtype Cnx1E structure (6ETF; Krausze, J., et al., The functional principle of eukaryotic molybdenum insertases. Biochem J, 2018. 475(10): p. 1739–1753., grey). RMSD = root mean square deviation.

https://doi.org/10.1371/journal.pone.0350191.s009

(PDF)

S10 Fig. Highly conserved residues of Invertebrate-type Mo-insertase.

Surface representations of the R. norvegicus (PDB code: 2FU3) Mo-insertase E-domain. Highly conserved (> 70% identity) residues of Invertebrate-type Mo-insertases are shown color coded as specified in S8 Fig. Residues that fall below this threshold are shown in grey. The receptor binding site (according to Maric, H.M., et al., Gephyrin-mediated gamma-aminobutyric acid type A and glycine receptor clustering relies on a common binding site. J Biol Chem, 2011. 286(49): p. 42105–42114.) is encircled.

https://doi.org/10.1371/journal.pone.0350191.s010

(PDF)

S1 Table. Mo-insertases with a diverging domain arrangement.

Mo-insertases possessing a diverging domain arrangement as compared to the clade (fungi, animals, plants) to which they grouped (see Fig 3 for comparison) are tabulated. The E-domains tabulated were identified by the initial BLASTp search. If indicated (i.e., when a separately existing E-domain was identified), G-domains were identified by using the MogA sequence (QKU47929.1) as query for a BLASTp search restricted to the respective organism, using the NCBI protein database (non-redundant protein sequences (nr), default settings). As an exception, for the identification of the Volvox carteri G-domain, BLASTp searches (standard settings) were carried out using the JGI database (Grigoriev, I.V., et al., The genome portal of the Department of Energy Joint Genome Institute. Nucleic Acids Res, 2012. 40 (Database issue): p. D26-32) with queries restricted to Volvox carteri. If indicated, the numbers of the first and last amino acid of the E- and G-domain within the fusion proteins are given. For Tribonema minus, Symbiodinium necroappetens and Heterostelium album the structure of the G-domain has been predicted using Phyre2.2 (Powell, H.R., et al., Phyre2.2: A Community Resource for Template-based Protein Structure Prediction. J Mol Biol, 2025. 437(15): p. 168960.) to refine domain annotation. For Reticulomyxa filosa (ETO18335.1) only a partial sequence was available, which showed significant sequence similarities to the H. sapiens gephyrin E-domain. The Pinctada imbricata G-domain is part of a hypothetical protein comprising 1183 aa. In Diacronema lutheri, the E-domain was identified to be part of a hypothetical protein comprising 611 aa (KAG8469581.1). The G-domains identified in Micromonas commoda, Diacronema lutheri and Chrysochromulina tobinii possessed an N-terminal extension comprising ca. 150 residues. Domain classification in fusion proteins was carried out as described in the materials and methods section. The C. reinhardtii Mo-insertase domain organization was described elsewhere (Llamas, A., et al., Molybdenum metabolism in the alga Chlamydomonas stands at the crossroad of those in Arabidopsis and humans. Metallomics, 2011. 3(6): p. 578–90.). For Streptophyta species where separate E-domains were identified, G-domain containing sequences were identified by using the A. thaliana Cnx1G sequence (Krausze, J., et al., Dimerization of the plant molybdenum insertase Cnx1E is required for synthesis of the molybdenum cofactor. Biochem J, 2017. 474(1): p. 163–178. and reference therein) as query for a BLASTp search restricted to the respective organism, using the NCBI protein database (non-redundant protein sequences (nr), default settings). The Carya illinoinensis, Tripterygium wilfordii and Nymphaea colorata G-domain containing sequences (KAG2674269.1, XP_038715653.1 and XP_031481496.1, respectively) comprise G- and E-domains with the plant-type orientation, indicated by an asterisk.

https://doi.org/10.1371/journal.pone.0350191.s011

(PDF)

S2 Table. Sequences not considered for analysis.

Contaminated sequences and G-domain like sequences identified within the dataset are tabulated. Databases: Pucker et al., 2024 (Pucker, B., Fiene, N., Choudhary, N., Borchert, M., Khatun, N., Collection of plant gene expression data. https://doi.org/10.24355/dbbs.084-202409160820-0. 2024.); O’Leary et al., 2024 (O’Leary, N.A., et al., Exploring and retrieving sequence and metadata for species across the tree of life with NCBI Datasets. Sci Data, 2024. 11(1): p. 732.).

https://doi.org/10.1371/journal.pone.0350191.s012

(PDF)

S1 Data File. Species names and sources of the sequence data sets per species used for BLAST-based analyses for the discovery of MoeA orthologs (xlsx format).

https://doi.org/10.1371/journal.pone.0350191.s013

(XLSX)

S2 Data File. Pairwise sequence alignments of all identified fungal Mo-insertases with the N. crassa E- and G-domain sequences.

Annotation according to Probst, C., et al., Genetic characterization of the Neurospora crassa molybdenum cofactor biosynthesis, Fungal Genet Biol, 2014. 66: p. 69–78. The sequence alignments were carried out using EMBOSS Needle Pairwise Sequence Alignment (Madeira, F., et al., The EMBL-EBI Job Dispatcher sequence analysis tools framework in 2024. Nucleic Acids Res, 2024. 52(W1): p. W521-W525.) and standard settings. We identified KAG6331869.1 from Astraeus odoratus to possess the fungal domain organization; however, the fusion protein was found to be part of a larger protein of 2058 residues. For XP_007680954.1 from Baudoinia panamericana we confirmed the fungal-type domain organization; however, the G-domain was identified to be truncated.

https://doi.org/10.1371/journal.pone.0350191.s014

(ZIP)

S3 Data File. Pairwise sequence alignments of all identified invertebrate Mo-insertases with the R. norvegicus E- and G-domain sequences.

Annotation according to Sola, M., et al., Structural basis of dynamic glycine receptor clustering by gephyrin, Embo J, 2004. 23(13): p. 2510–9. The sequence alignments were carried out using EMBOSS Needle Pairwise Sequence Alignment (Madeira, F., et al., The EMBL-EBI Job Dispatcher sequence analysis tools framework in 2024. Nucleic Acids Res, 2024. 52(W1): p. W521-W525.) and standard settings. We identified KAJ8937396.1 from Aromia moschata to possess an N-terminally fused, partial G-domain sequence. For XP_046338115.2 from Haliotis rufescens, structures of E- and G-domain were predicted using Phyre2.2 (Powell, H.R., et al., Phyre2.2: A Community Resource for Template-based Protein Structure Prediction. J Mol Biol, 2025. 437(15): p. 168960.) to refine domain annotation. According to this, the G-domain spans residues 13–175 and the E-domain residues 302–719. For CDW51849.1 from Trichuris trichiura, NP_509700.2 from Caenorhabditis elegans, KAK3092834.1 from Pinctada imbricata and KAF4523361.1 from Ephemera danica, we identified no G-domain encoding part within the annotated sequence (see S1 Table).

https://doi.org/10.1371/journal.pone.0350191.s015

(ZIP)

S4 Data File. Pairwise sequence alignments of all identified plant type Mo-insertases with the A. thaliana E- and G-domain sequences.

Annotation according to Krausze, J., et al., Dimerization of the plant molybdenum insertase Cnx1E is required for synthesis of the molybdenum cofactor, Biochem J, 2017. 474(1): p. 163–178. and reference therein. The sequence alignments were carried out using EMBOSS Needle Pairwise Sequence Alignment (Madeira, F., et al., The EMBL-EBI Job Dispatcher sequence analysis tools framework in 2024. Nucleic Acids Res, 2024. 52(W1): p. W521-W525) and standard settings. For the Streptophyta sequences CiLak.13G125300.1 from Carya illinoinensis, FSB015820001 from Fagus sylvatica, CM035884.1.g4717.t1 from Gynostemma pentaphyllum, CM029397.1.g6424.t1 from Luffa aegyptiaca, XM_050076571.1 from Nymphaea colorata, NC_052233.1_cds_XP_038715662.1_2868 from Tripterygium wilfordii and SMEL_000g050390.1.01 from Solanum melongena we identified no G-domain encoding part within the annotated sequence (see supplementary S1 Table). For FvH4_3g26260.t4 from Fragaria vesca a C-terminal fused, partial G-domain sequence was identified. For the Chlorophyta sequences XP_002500333.1 from Micromonas commoda, XP_002950060 from Volvox carteri and DQ311646.1 from Chlamydomonas reinhardtii likewise no G-domain encoding part within the annotated sequence was identified (see supplementary S1 Table). For (lcl_CM028325.1.lcl_CM028325.1.g54.t1) from Eucommia ulmoides we identified only a partial E-domain sequence within the annotated sequence, while no G-domain sequence was identified.

https://doi.org/10.1371/journal.pone.0350191.s016

(ZIP)

S5 Data File. MoeA Phylogenetic tree, obtained from maximum likelihood analysis.

Bootstrap values are shown (pdf format).

https://doi.org/10.1371/journal.pone.0350191.s017

(PDF)

S6 Data File. Alignment file used for MoeA tree building (FASTA format).

https://doi.org/10.1371/journal.pone.0350191.s018

(ZIP)

S7 Data File. Phylogenetic tree for MOCS2B obtained from maximum likelihood analysis (pdf format).

https://doi.org/10.1371/journal.pone.0350191.s019

(PDF)

S8 Data File. Phylogenetic tree for ADH obtained from maximum likelihood analysis (pdf format).

https://doi.org/10.1371/journal.pone.0350191.s020

(PDF)

S9 Data File. Local BLAST hits of MoeA homologs used for Alignment (FASTA format).

https://doi.org/10.1371/journal.pone.0350191.s021

(ZIP)

Acknowledgments

This work was supported by the de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI) and ELIXIR-DE (Forschungszentrum Jülich and W-de.NBI-001, W-de.NBI-004, W-de.NBI-008, W-de.NBI-010, W-de.NBI-013, W-de.NBI-014, W-de.NBI-016, W-de.NBI-022). We acknowledge support by the Open Access Publication Funds of Technische Universität Braunschweig.

References

1. Mendel RR, Kruse T. Cell biology of molybdenum in plants and humans. Biochim Biophys Acta. 2012;1823(9):1568–79. pmid:22370186
2. Mendel RR. The molybdenum cofactor. J Biol Chem. 2013;288(19):13165–72. pmid:23539623
3. Teschner J, Lachmann N, Schulze J, Geisler M, Selbach K, Santamaria-Araujo J, et al. A novel role for Arabidopsis mitochondrial ABC transporter ATM3 in molybdenum cofactor biosynthesis. Plant Cell. 2010;22(2):468–80. pmid:20164445
4. Shanmugam KT, Stewart V, Gunsalus RP, Boxer DH, Cole JA, Chippaux M, et al. Proposed nomenclature for the genes involved in molybdenum metabolism in Escherichia coli and Salmonella typhimurium. Mol Microbiol. 1992;6(22):3452–4. pmid:1484496
5. Rajagopalan KV, Johnson JL. The pterin molybdenum cofactors. J Biol Chem. 1992;267(15):10199–202. pmid:1587808
6. Kruse T. Function of Molybdenum Insertases. Molecules. 2022;27(17):5372. pmid:36080140
7. Kuper J, Llamas A, Hecht H-J, Mendel RR, Schwarz G. Structure of the molybdopterin-bound Cnx1G domain links molybdenum and copper metabolism. Nature. 2004;430(7001):803–6. pmid:15306815
8. Llamas A, Mendel RR, Schwarz G. Synthesis of adenylated molybdopterin: an essential step for molybdenum insertion. J Biol Chem. 2004;279(53):55241–6. pmid:15504727
9. Hassan AH, Ihling C, Iacobucci C, Kastritis PL, Sinz A, Kruse T. The structural principles underlying molybdenum insertase complex assembly. Protein Sci. 2023;32(9):e4753. pmid:37572332
10. Llamas A, Otte T, Multhaup G, Mendel RR, Schwarz G. The Mechanism of nucleotide-assisted molybdenum insertion into molybdopterin. A novel route toward metal cofactor assembly. J Biol Chem. 2006;281(27):18343–50. pmid:16636046
11. Krausze J, Hercher TW, Zwerschke D, Kirk ML, Blankenfeldt W, Mendel RR, et al. The functional principle of eukaryotic molybdenum insertases. Biochem J. 2018;475(10):1739–53. pmid:29717023
12. Probst C, Yang J, Krausze J, Hercher TW, Richers CP, Spatzal T, et al. Mechanism of molybdate insertion into pterin-based molybdenum cofactors. Nat Chem. 2021;13(8):758–65. pmid:34183818
13. Stallmeyer B, Nerlich A, Schiemann J, Brinkmann H, Mendel RR. Molybdenum co-factor biosynthesis: the Arabidopsis thaliana cDNA cnx1 encodes a multifunctional two-domain protein homologous to a mammalian neuroprotein, the insect protein Cinnamon and three Escherichia coli proteins. Plant J. 1995;8(5):751–62. pmid:8528286
14. Kruse T. Moco Carrier and Binding Proteins. Molecules. 2022;27(19):6571. pmid:36235107
15. Cordas CM, Moura JJG. Molybdenum and tungsten enzymes redox properties – A brief overview. Coordination Chemistry Reviews. 2019;394:53–64.
16. Campbell WH. Nitrate reductase structure, function and regulation: Bridging the Gap between Biochemistry and Physiology. Annu Rev Plant Physiol Plant Mol Biol. 1999;50:277–303. pmid:15012211
17. Johnson JL, Waud WR, Rajagopalan KV, Duran M, Beemer FA, Wadman SK. Inborn errors of molybdenum metabolism: combined deficiencies of sulfite oxidase and xanthine dehydrogenase in a patient lacking the molybdenum cofactor. Proc Natl Acad Sci U S A. 1980;77(6):3715–9. pmid:6997882
18. Mudd SH, Irreverre F, Laster L. Sulfite oxidase deficiency in man: demonstration of the enzymatic defect. Science. 1967;156(3782):1599–602. pmid:6025118
19. Feng G, Tintrup H, Kirsch J, Nichol MC, Kuhse J, Betz H, et al. Dual requirement for gephyrin in glycine receptor clustering and molybdoenzyme activity. Science. 1998;282(5392):1321–4. pmid:9812897
20. Fritschy J-M, Harvey RJ, Schwarz G. Gephyrin: where do we stand, where do we go? Trends Neurosci. 2008;31(5):257–64. pmid:18403029
21. Prior P, Schmitt B, Grenningloh G, Pribilla I, Multhaup G, Beyreuther K, et al. Primary structure and alternative splice variants of gephyrin, a putative glycine receptor-tubulin linker protein. Neuron. 1992;8(6):1161–70. pmid:1319186
22. Sola M, Bavro VN, Timmins J, Franz T, Ricard-Blum S, Schoehn G, et al. Structural basis of dynamic glycine receptor clustering by gephyrin. EMBO J. 2004;23(13):2510–9. pmid:15201864
23. Pizzarelli R, et al. Tuning GABAergic Inhibition: Gephyrin Molecular Organization and Functions. Neuroscience. 2020;439:125–36.
24. O’Leary NA, Cox E, Holmes JB, Anderson WR, Falk R, Hem V, et al. Exploring and retrieving sequence and metadata for species across the tree of life with NCBI Datasets. Sci Data. 2024;11(1):732. pmid:38969627
25. Schoch CL, Ciufo S, Domrachev M, Hotton CL, Kannan S, Khovanskaya R, et al. NCBI Taxonomy: a comprehensive update on curation, resources and tools. Database. 2020;2020.
26. Pucker B, Iorizzo M. Apiaceae FNS I originated from F3H through tandem gene duplication. PLoS One. 2023;18(1):e0280155. pmid:36656808
27. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80.
28. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, Lanfear R. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Mol Biol Evol. 2020;37(5):1530–4.
29. Letunic I, Bork P. Interactive Tree of Life (iTOL) v6: recent updates to the phylogenetic tree display and annotation tool. Nucleic Acids Res. 2024;52(W1):W78–82. pmid:38613393
30. Sukumaran J, Holder MT. DendroPy: a Python library for phylogenetic computing. Bioinformatics. 2010;26(12):1569–71. pmid:20421198
31. Corpet F. Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res. 1988;16(22):10881–90. pmid:2849754
32. Probst C, Ringel P, Boysen V, Wirsing L, Alexander MM, Mendel RR, et al. Genetic characterization of the Neurospora crassa molybdenum cofactor biosynthesis. Fungal Genet Biol. 2014;66:69–78. pmid:24569084
33. Krausze J, Probst C, Curth U, Reichelt J, Saha S, Schafflick D, et al. Dimerization of the plant molybdenum insertase Cnx1E is required for synthesis of the molybdenum cofactor. Biochem J. 2017;474(1):163–78. pmid:27803248
34. The PyMOL Molecular Graphics System, Version 2.5.4 Schrödinger, LLC.
35. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583–9. pmid:34265844
36. Abramson J, Adler J, Dunger J, Evans R, Green T, Pritzel A, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;630(8016):493–500. pmid:38718835
37. Choudhary N, Pucker B. Conserved amino acid residues and gene expression patterns associated with the substrate preferences of the competing enzymes FLS and DFR. PLoS One. 2024;19(8):e0305837. pmid:39196921
38. Sewell AK, Han M. Learning from the worm: the effectiveness of protein-bound Moco to treat Moco deficiency. Genes Dev. 2021;35(3–4):177–9. pmid:33526584
39. Llamas A, Tejada-Jiménez M, Fernández E, Galván A. Molybdenum metabolism in the alga Chlamydomonas stands at the crossroad of those in Arabidopsis and humans. Metallomics. 2011;3(6):578–90. pmid:21623427
40. Reiss J, Dorche C, Stallmeyer B, Mendel RR, Cohen N, Zabot MT. Human molybdopterin synthase gene: genomic structure and mutations in molybdenum cofactor deficiency type B. Am J Hum Genet. 1999;64(3):706–11. pmid:10053004
41. Satre MA, Zgombić-Knight M, Duester G. The complete structure of human class IV alcohol dehydrogenase (retinol dehydrogenase) determined from the ADH7 gene. J Biol Chem. 1994;269(22):15606–12. pmid:8195208
42. Kumar S, et al. TimeTree 5: An Expanded Resource for Species Divergence Times. Mol Biol Evol. 2022;39(8).
43. Kumar S, Stecher G, Suleski M, Hedges SB. TimeTree: A Resource for Timelines, Timetrees, and Divergence Times. Mol Biol Evol. 2017;34(7):1812–9. pmid:28387841
44. Maric H-M, Mukherjee J, Tretter V, Moss SJ, Schindelin H. Gephyrin-mediated γ-aminobutyric acid type A and glycine receptor clustering relies on a common binding site. J Biol Chem. 2011;286(49):42105–14. pmid:22006921
45. Huang W, Li LT, Li Y, Hua N, Sun C, Wei C. Widespread of horizontal gene transfer in the human genome. BMC Genomics. 2017;18:274.
46. Avni E, Snir S. A New Phylogenomic Approach For Quantifying Horizontal Gene Transfer Trends in Prokaryotes. Sci Rep. 2020;10(1):12425. pmid:32709941
47. Megrian D, Martinez M, Alzari PM, Wehenkel AM. Evolutionary plasticity and functional repurposing of the essential metabolic enzyme MoeA. Commun Biol. 2025;8(1):49. pmid:39809875
48. Kim EY, Schrader N, Smolinsky B, Bedet C, Vannier C, Schwarz G, et al. Deciphering the structural framework of glycine receptor anchoring by gephyrin. EMBO J. 2006;25(6):1385–95. pmid:16511563
49. Llamas A, Tejada-Jimenez M, González-Ballester D, Higuera JJ, Schwarz G, Galván A, et al. Chlamydomonas reinhardtii CNX1E reconstitutes molybdenum cofactor biosynthesis in Escherichia coli mutants. Eukaryot Cell. 2007;6(6):1063–7. pmid:17416894
50. Belaidi AA, Schwarz G. Metal insertion into the molybdenum cofactor: product-substrate channelling demonstrates the functional origin of domain fusion in gephyrin. Biochem J. 2013;450(1):149–57. pmid:23163752
51. Wainright PO, Hinkle G, Sogin ML, Stickel SK. Monophyletic origins of the metazoa: an evolutionary link with fungi. Science. 1993;260(5106):340–2. pmid:8469985
52. Groeneweg FL, Trattnig C, Kuhse J, Nawrotzki RA, Kirsch J. Gephyrin: a key regulatory protein of inhibitory synapses and beyond. Histochem Cell Biol. 2018;150(5):489–508. pmid:30264265


Supporting information

S1 Fig. Partial representation of the phylogenetic distance tree obtained from maximum likelihood: Prokaryotes.

S2 Fig. Partial representation of the phylogenetic distance tree obtained from maximum likelihood: Fungi.

S3 Fig. Partial representation of the phylogenetic distance tree obtained from maximum likelihood: Metazoa.

S4 Fig. Partial representation of the phylogenetic distance tree obtained from maximum likelihood: Plants I of II.

S5 Fig. Partial representation of the phylogenetic distance tree obtained from maximum likelihood: Plants II of II.

S6 Fig. Schematic representation of identified Mo-insertase domains in Aspergillus nidulans and Geotrichum candidum.

S7 Fig. Patristic distance of MOCS2B and alcohol dehydrogenase from different taxa.

S8 Fig. Conserved residues of eukaryotic Mo-insertases.

S9 Fig. AlphaFold-based modelling of Mo-insertase active site variants.

S10 Fig. Highly conserved residues of invertebrate-type Mo-insertases.

S1 Table. Mo-insertases with a diverging domain arrangement.

S2 Table. Sequences not considered for analysis.

S1 Data File. Species names and sources of the sequence data sets per species used for BLAST-based analyses for the discovery of MoeA orthologs (xlsx format).

S2 Data File. Pairwise sequence alignments of all identified fungal Mo-insertases with the N. crassa E- and G-domain sequences.

S3 Data File. Pairwise sequence alignments of all identified invertebrate Mo-insertases with the R. norvegicus E- and G-domain sequences.

S4 Data File. Pairwise sequence alignments of all identified plant-type Mo-insertases with the A. thaliana E- and G-domain sequences.

S5 Data File. MoeA phylogenetic tree obtained from maximum likelihood analysis.

S6 Data File. Alignment file used for MoeA tree building (FASTA format).

S7 Data File. Phylogenetic tree for MOCS2B obtained from maximum likelihood analysis (pdf format).

S8 Data File. Phylogenetic tree for ADH obtained from maximum likelihood analysis (pdf format).

S9 Data File. Local BLAST hits of MoeA homologs used for alignment (FASTA format).
