
Adaptations to nitrogen availability drive ecological divergence of chemosynthetic symbionts

  • Isidora Morel-Letelier,

    Contributed equally to this work with: Isidora Morel-Letelier, Benedict Yuen

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Eco-Evolutionary Interactions Group, Max Planck Institute for Marine Microbiology (MPIMM), Bremen, Germany

  • Benedict Yuen,

    Contributed equally to this work with: Isidora Morel-Letelier, Benedict Yuen

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliation Eco-Evolutionary Interactions Group, Max Planck Institute for Marine Microbiology (MPIMM), Bremen, Germany

  • A. Carlotta Kück,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing

    Affiliation Eco-Evolutionary Interactions Group, Max Planck Institute for Marine Microbiology (MPIMM), Bremen, Germany

  • Yolanda E. Camacho-García,

    Roles Conceptualization, Data curation, Investigation, Resources, Supervision, Writing – original draft, Writing – review & editing

    Affiliations Centro de Investigación en Ciencias del Mar y Limnología (CIMAR), Universidad de Costa Rica, San Pedro, San José, Costa Rica, Centro de Investigación en Biodiversidad y Ecología Tropical (CIBET), Universidad de Costa Rica, San Pedro, San José, Costa Rica, Escuela de Biología, Universidad de Costa Rica, San Pedro, San José, Costa Rica

  • Jillian M. Petersen,

    Roles Conceptualization, Funding acquisition, Investigation, Validation, Writing – original draft, Writing – review & editing

    Affiliation Centre for Microbiology and Environmental Systems Science, University of Vienna, Vienna, Austria

  • Minor Lara,

    Roles Conceptualization, Data curation, Methodology, Resources, Writing – original draft, Writing – review & editing

    Affiliation Diving Center Cuajiniquil, Provincia de Guanacaste, Cuajiniquil, Costa Rica

  • Matthieu Leray,

    Roles Conceptualization, Data curation, Funding acquisition, Investigation, Project administration, Writing – original draft, Writing – review & editing

    Affiliation Smithsonian Tropical Research Institute, Balboa, Ancon, Republic of Panamá

  • Jonathan A. Eisen,

    Roles Conceptualization, Funding acquisition, Investigation, Writing – original draft, Writing – review & editing

    Affiliations Department of Evolution and Ecology, University of California, Davis, Davis, California, United States of America, Department of Medical Microbiology and Immunology, University of California, Davis, Davis, California, United States of America

  • Jay T. Osvatic,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing

    Affiliations Joint Microbiome Facility of the Medical University of Vienna and the University of Vienna, Vienna, Austria, Department of Laboratory Medicine, Medical University of Vienna, Vienna, Austria

  • Olivier Gros,

    Roles Conceptualization, Data curation, Investigation, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum National d’Histoire Naturelle, CNRS, Sorbonne Université, Université des Antilles, Pointe-à-Pitre, France

  • Laetitia G. E. Wilkins

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Validation, Writing – original draft, Writing – review & editing

    lwilkins@mpi-bremen.de

    Affiliation Eco-Evolutionary Interactions Group, Max Planck Institute for Marine Microbiology (MPIMM), Bremen, Germany

Abstract

Bacterial symbionts, with their shorter generation times and capacity for horizontal gene transfer (HGT), play a critical role in allowing marine organisms to cope with environmental change. The closure of the Isthmus of Panama created distinct environmental conditions in the Tropical Eastern Pacific (TEP) and Caribbean, offering a “natural experiment” for studying how closely related animals evolve and adapt under environmental change. However, the role of bacterial symbionts in this process is often overlooked. We sequenced the genomes of endosymbiotic bacteria in two sets of sister species of chemosymbiotic bivalves from the genera Codakia and Ctena (family Lucinidae) collected on either side of the Isthmus, to investigate how differing environmental conditions have influenced the selection of symbionts and their metabolic capabilities. The lucinid sister species hosted different Candidatus Thiodiazotropha symbionts and only those from the Caribbean had the genetic potential for nitrogen fixation, while those from the TEP did not. Interestingly, this nitrogen-fixing ability did not correspond to symbiont phylogeny, suggesting convergent evolution of nitrogen fixation potential under nutrient-poor conditions. Reconstructing the evolutionary history of the nifHDKT operon by including other lucinid symbiont genomes from around the world further revealed that the last common ancestor (LCA) of Ca. Thiodiazotropha lacked nif genes, and populations in oligotrophic habitats later re-acquired the nif operon through HGT from the Sedimenticola symbiont lineage. Our study suggests that HGT of the nif operon has facilitated niche diversification of the globally distributed Ca. Thiodiazotropha endolucinida species clade. It highlights the importance of nitrogen availability in driving the ecological diversification of chemosynthetic symbiont species and the role that bacterial symbionts may play in the adaptation of marine organisms to changing environmental conditions.

Author summary

Approximately three million years ago, the closure of the Isthmus of Panama connected North and South America, leading to species interchange on land but splitting an ancient ocean into the Tropical Eastern Pacific (TEP) and the Caribbean Sea. Today, these two marine habitats are characterized by significantly different environmental conditions. Notably, the Caribbean Sea became highly oligotrophic, which caused a massive extinction event.

Our focus on bivalve species pairs that survived on both sides aimed at understanding how their associated bacterial symbionts enabled them to adapt to this massive environmental change. Although both Caribbean and TEP bivalves host Candidatus Thiodiazotropha symbionts, only those on the Caribbean side are capable of nitrogen fixation. This capability does not align with symbiont evolutionary history, indicating convergent evolution due to similar environmental pressures.

Exploring the genetic history of lucinid symbionts across the globe revealed that the ancestor of Ca. Thiodiazotropha lacked nitrogen fixation genes. Populations in nutrient-poor habitats acquired it multiple times through horizontal gene transfer (HGT). Our research underscores the role of HGT in bacterial adaptation and highlights the impact of nitrogen availability on symbiont ecological diversification. It shows how bacterial symbionts can aid marine organisms in adapting to environmental change.

Introduction

Global change gives rise to new environmental conditions and niches that organisms can adapt to and exploit, and the extent and mechanisms of adaptation in marine species may be significantly influenced by their microbiomes [1]. Bacterial symbionts of animals can potentially adapt to changing environments more rapidly and more flexibly than their hosts due to traits such as shorter generation times, enhanced recombination capabilities, and the potential for horizontal gene transfer (HGT) between distantly related organisms [2]. In addition, horizontally acquired bacterial symbionts (those obtained from the environment in each generation) have access to a larger genetic pool for genetic exchange during their free-living phase, which can lead to faster adaptive responses [3]. Consequently, symbionts can serve as a source of ecological innovation, enabling the symbiosis to tap into novel resources and adapt to novel habitats [4–6]. Understanding the mechanisms enabling microbial symbionts to adapt to new environmental conditions will provide novel insights into how animal-microbe symbioses respond to changing environments.

The closure of the Isthmus of Panama about 2.8 million years ago had a profound impact on oceanic conditions, altering environmental factors such as ocean currents, salinity, temperature, and nutrient availability on both sides of the Isthmus [7]. The Tropical Eastern Pacific (TEP) continued to experience regular nutrient input due to seasonal upwelling, coupled with increased primary productivity, variable temperatures, and strong tides [7]. In contrast, the Caribbean coast became characterized by stable and warmer temperatures, higher salinity, and a notably low availability of organic nutrients [7]. Animal populations that were once connected became separated by the closure of the Isthmus, ultimately resulting in the emergence of sister species on separate evolutionary trajectories, diverging in response to the markedly different environmental conditions on either side of the Isthmus [8]. The adaptation strategies enabling these sister species to thrive in their respective, contrasting environments have been extensively studied, but these studies have primarily focused on the animals themselves, largely overlooking the potential influence of host-associated microorganisms (reviewed in [1,8,9]). The Isthmus of Panama presents a unique opportunity to investigate the drivers of diversification and adaptation through a “natural experiment” running for millions of years with a taxonomically replicated set of animal-microbe assemblages [10]. This offers valuable insights into the interplay between hosts and their microbial symbionts in the context of environmental change.

The Lucinidae, one of the most species-rich bivalve families, thrive in a wide array of marine environments and are the most diverse group of chemosymbiotic animals [11]. Lucinids house endosymbiotic sulfide-oxidizing Gammaproteobacteria intracellularly within specialized gill cells, where the symbionts use the energy derived from oxidizing reduced sulfur compounds to synthesize organic carbon [11–13]. This partnership is obligate for lucinids because they rely on their symbionts for a significant portion of their carbon nutritional requirements [14,15]. The bacterial symbionts are acquired from free-living populations in the environment during the early developmental stages of each new generation [16,17]. Recent studies have begun to unveil the metabolic and genomic diversity among symbionts from vastly different environments across the globe, including differences in their abilities to metabolize carbon and inorganic nitrogen [18–20]. Notably, some symbiont clades of the genus Ca. Thiodiazotropha were the first documented example of nitrogen-fixing chemosynthetic symbionts [21,22]. However, the precise processes governing the diversification and adaptation of lucinid symbionts to changing environmental parameters, which may lead to the divergence of local populations of globally distributed taxa, remain poorly understood. Sister species of lucinids from the genera Codakia and Ctena have diverged on either side of the Isthmus of Panama [1,11]. This unique relationship allows for a comparative study of symbiont adaptation, free from the confounding effects of host evolutionary history.

We used high-throughput metagenomic sequencing to recover metagenome-assembled genomes (MAGs) of bacterial symbionts associated with Codakia and Ctena sister species from both sides of the Isthmus of Panama. Our primary objective was to investigate how differing environmental conditions on either side of the Isthmus have influenced the selection, diversity, and functional traits of symbionts associated with lucinid clams. We found that all lucinid symbionts from the Caribbean had the potential to fix nitrogen and assimilate nitrate, but these functions were absent in all symbionts from the TEP. To further explore the evolutionary origins of nitrogen fixation in different lucinid symbiont lineages, we compared the genomes of symbionts across the Isthmus of Panama with other lucinid symbiont genomes from around the world. Using phylogenetic reconciliation, we reconstructed the evolutionary history of the nifHDKT operon and identified two distinct HGT events that led to nif gene acquisition and correlated with the colonization of nutrient-poor environments. Within the globally distributed Ca. Thiodiazotropha endolucinida clade, populations in nutrient-poor environments possessed nitrogen fixation genes, whereas closely related populations in nutrient-rich environments did not. Despite their high Average Nucleotide Identity (ANI; >95%) and evidence of homologous recombination between geographically distant populations, our findings indicate that these Ca. Thiodiazotropha endolucinida populations have diverged ecologically and occupy separate niches that differ in nitrogen availability. We hypothesize that the diversification of these symbionts was facilitated by the acquisition of the nitrogen fixation genes through HGT. Our results provide valuable insights into the dynamic interplay between environmental factors and genetic exchange that shape the ecology, evolution, and diversification of host-associated microorganisms.

Results

Comparing bacterial symbionts in host sister species across the Isthmus of Panama reveals complex phylogenetic relationships and differences in genomic potential

Symbiont clades are exclusive to either side of the Isthmus and are independent of host taxonomy.

We recovered a total of 148 high-quality gammaproteobacterial MAGs from the gill metagenomes of Codakia and Ctena sister species sampled from nine different locations across the Isthmus: three sites in the TEP and six in the Caribbean (CAR; Fig 1A and 1B). Thirteen of these MAGs were retrieved from nine Codakia orbicularis (CAR) metagenomes, four from four Codakia distinguenda (TEP) metagenomes, 44 from 38 Ctena imbricatula (CAR) metagenomes, 44 from 31 Ctena sp. “COSTE” (CAR) metagenomes, and 43 from 37 Ctena cf. galapagana (TEP) metagenomes (metadata and statistics available in S1 and S2 Tables).

Fig 1. Phylogenetic relationships of bacterial symbiont clades associated with lucinid sister pairs across the Isthmus of Panama.

A Sister species from the lucinid genera Codakia (circles) and Ctena (triangles) were sampled on the CAR (Caribbean; turquoise) and TEP (Tropical Eastern Pacific; purple) side of the Isthmus. The map was generated with data from Natural Earth (http://www.naturalearthdata.com/) using the R package "rnaturalearth" (v0.3.2) (https://github.com/ropensci/rnaturalearth). B Schematic representation of the phylogenetic relationships of the Codakia and Ctena species collected, based on the most recent taxonomic study [23]. C Six symbiont lineages, four previously undescribed (bold text), were associated with either Codakia (circles) or Ctena (triangles) hosts across the Isthmus. Maximum likelihood phylogenomic tree of symbiont MAGs recovered from the gills of host sister pairs inferred from GTDB’s (Genome Taxonomy Database) multiple sequence alignment using the best fit model Q.plant+F+I+G4. MAGs of Monitilora ramsayi symbionts [18] were used as an outgroup. Blue squares indicate ultra fast bootstrap (UFB) values above 95% and SH-aLRT values above 80%. All monophyletic clades were collapsed by location to facilitate interpretation. D No symbiont clade was found on both sides of the Isthmus. Heatmap depicting average nucleotide identities (ANI) between the symbiont clades found across the Isthmus. The symbiont clades are colored in tones of green (Caribbean side “CAR”) and purple (TEP side) and the genus of the host from which the MAGs were recovered is indicated by the colors violet (Codakia) and pink (Ctena).

https://doi.org/10.1371/journal.pgen.1011295.g001

The MAGs constituted six distinct clades, three from each side of the Isthmus, all of which were taxonomically assigned to the genus Ca. Thiodiazotropha (Fig 1C and 1D). Two clades were identified as previously described Caribbean lucinid symbionts Ca. T. taylori and Ca. T. endolucinida [19,22]. Three of the remaining four clades are previously undescribed symbiont species, based on an ANI threshold of 95% for species delimitation [24–26] (S1 Dataset). We discovered one new bacterial species clade from the Caribbean (Ca. Thiodiazotropha fergusoni), which we propose to name after Walter Ferguson, a Panamanian-born calypso singer and songwriter based in Cahuita, Costa Rica (1919–2023). We designate the two new species clades from the TEP as Ca. Thiodiazotropha larai and Ca. Thiodiazotropha boucheti. These names honor Minor Lara for his contributions to marine research and conservation in the Guanacaste region of Costa Rica, and Dr. Philippe Bouchet for his extensive work on lucinids. The third clade from the TEP had an ANI of ~95.5% to Ca. T. endolucinida, suggesting its inclusion within this species, but it formed a distinct monophyletic sub-clade unique to the TEP. We shall hereafter refer to this clade as Ca. T. endolucinida TEP to distinguish it from the originally described Caribbean clade, which we will hereafter refer to as Ca. T. endolucinida CAR. Two of the MAGs classified as Ca. T. fergusoni and one classified as Ca. T. endolucinida TEP, from the samples sequenced with PacBio, were circularized. No symbiont clade was present on both sides of the Isthmus (Fig 1). However, unlike their hosts, the symbionts on either side of the Isthmus did not share a sister lineage relationship, indicating the absence of co-diversification in the host and symbiont phylogenies (Fig 1B and 1C). MAGs classified as Ca. T. fergusoni, Ca. T. endolucinida CAR and Ca. T. endolucinida TEP were recovered from both Codakia and Ctena specimens, while Ca. T. taylori, Ca. T. boucheti and Ca. T. larai MAGs were only recovered from Ctena specimens (Fig 1C and 1D).
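
The 95% ANI rule used above for species delimitation can be illustrated with a minimal sketch. It groups genomes by single-linkage clustering, which is an assumption made here for illustration and not necessarily the procedure used in the study. Only the ~95.5% value between the two Ca. T. endolucinida clades comes from the text; the genome labels, the function name `species_clusters`, and all other ANI values are hypothetical placeholders.

```python
from itertools import combinations


def species_clusters(genomes, ani, threshold=95.0):
    """Group genomes into single-linkage clusters at an ANI threshold.

    `ani` maps unordered genome pairs to percent identity; pairs that
    are missing from the mapping are treated as below the threshold.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the cluster root, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        value = ani.get((a, b), ani.get((b, a), 0.0))
        if value >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), set()).add(g)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical labels; only the 95.5 value is taken from the text.
genomes = ["endolucinida_CAR", "endolucinida_TEP", "larai", "boucheti"]
ani = {
    ("endolucinida_CAR", "endolucinida_TEP"): 95.5,  # reported as ~95.5%
    ("endolucinida_CAR", "larai"): 90.0,             # placeholder
    ("endolucinida_TEP", "boucheti"): 89.0,          # placeholder
    ("larai", "boucheti"): 91.0,                     # placeholder
}
clusters = species_clusters(genomes, ani)
```

With these inputs the two Ca. T. endolucinida clades fall into one species-level cluster while the other two genomes remain separate. A value this close to the threshold is why the text treats the TEP clade as a distinct sub-clade within the species, not as a new species.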

Symbiont clades across the Isthmus differ mainly in nitrogen metabolic capabilities.

We compared the metabolic potential of the symbionts from the TEP and Caribbean to identify genomic adaptations to the different environmental conditions across the Isthmus of Panama (Fig 1B). Core metabolic capabilities were shared among all clades (as shown in S2 Dataset) and with previously described members of Ca. Thiodiazotropha [19,27]. All clades possessed genes responsible for sulfur oxidation and carbon fixation via the Calvin cycle. Notably, Ca. T. fergusoni MAGs encoded only RuBisCO form I, while Ca. T. larai and Ca. T. boucheti MAGs encoded both forms I and II, as did the MAGs of the closely related Ca. T. endolucinida.

The metabolic enrichment analysis across the Isthmus revealed that nitrogen fixation (found in 99% of Caribbean MAGs), nitrate assimilation (which includes nitrate transport), and assimilatory nitrate reduction (both found in 96% of Caribbean MAGs) pathways were enriched in the MAGs of all three symbiont clades found in the Caribbean (Fig 2). All three Caribbean symbiont clades (Ca. T. endolucinida CAR, Ca. T. taylori, and Ca. T. fergusoni) consistently encoded these three nitrogen metabolic pathways, which were absent in symbiont MAGs from the TEP, even though the Ca. T. endolucinida CAR clade was more closely related to clades from the TEP (Ca. T. larai, Ca. T. boucheti and Ca. T. endolucinida TEP) than to the two other Caribbean symbiont clades (Fig 2 and S3 Dataset). These enrichment patterns therefore do not correlate with the phylogenetic relationships of the lucinid hosts or their symbionts. The genes enriched in the Caribbean MAGs belonged to the same nitrogen metabolic pathways that were identified through the module enrichment analysis (Table 1 and S3 and S4 Datasets), namely nitrogen fixation and nitrate assimilation. Besides the minimum gene set for nitrogen fixation [28] (nifHDKENB), we observed enrichment of a varied repertoire of genes involved in the process, which included predicted functions in regulation, biosynthesis, assembly and structure. Additionally, a gene annotated as an H2O-forming NADH oxidase was enriched in Caribbean MAGs and was often located in the same genomic region as the nitrogen fixation genes. Both the mapping of the metagenomic reads to the nif genes and an HMM search of the nitrogenase against the metagenome assemblies supported the conclusion that the TEP symbionts did not have the potential to fix nitrogen (S5 Dataset). No metabolic modules were found to be enriched in TEP MAGs (Fig 2 and S3 Dataset), but genes encoding an electron-transferring-flavoprotein dehydrogenase and genes involved in gamma-polyglutamate biosynthesis were enriched in all three symbiont clades from the TEP, a pattern which also did not correlate with symbiont or host phylogeny (Table 1 and S4 Dataset).

Fig 2. Metabolic pathways associated with nitrogen acquisition were enriched in the symbiont clades from the Caribbean side of the Isthmus.

Nitrogen fixation (star), assimilatory nitrate reduction (circle) and nitrate assimilation (square) were highly prevalent in the three symbiont clades from the CAR (Caribbean) and absent in all the MAGs extracted from the TEP (Tropical Eastern Pacific). The metabolic pathway enrichment analysis results are superimposed on the phylogenomic tree of the symbionts found across the Isthmus. Blue squares indicate UFB values above 95% and SH-aLRT values above 80%. The symbiont clades are colored in tones of green (Caribbean) and purple (TEP) as in Fig 1C and 1D.

https://doi.org/10.1371/journal.pgen.1011295.g002

Table 1. Predicted metabolic functions enriched in the MAGs of symbionts associated with lucinid sister species collected from either side of the Isthmus of Panama.

The annotated functions in this table were present in more than 90% of the MAGs in one group and in less than 10% in the other.

https://doi.org/10.1371/journal.pgen.1011295.t001
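
The prevalence criterion stated for Table 1 (present in more than 90% of the MAGs in one group and in fewer than 10% in the other) can be sketched as a simple filter. This is an illustration only: it is not the statistical enrichment analysis itself, and the function name `group_specific_functions`, the data layout, and the toy MAG names and annotations are all hypothetical.

```python
def group_specific_functions(presence, groups, high=0.90, low=0.10):
    """Return {function: group} for functions found in more than `high`
    of the MAGs in one group and in fewer than `low` of the other.

    presence: {mag_name: set of annotated functions}
    groups:   {mag_name: group label}; exactly two labels are expected.
    """
    labels = sorted(set(groups.values()))
    if len(labels) != 2:
        raise ValueError("expected exactly two groups")

    members = {lab: [m for m, g in groups.items() if g == lab] for lab in labels}
    all_functions = set().union(*presence.values())

    def prevalence(function, label):
        # Fraction of MAGs in this group that carry the function.
        mags = members[label]
        return sum(function in presence[m] for m in mags) / len(mags)

    selected = {}
    for function in sorted(all_functions):
        first, second = (prevalence(function, lab) for lab in labels)
        if first > high and second < low:
            selected[function] = labels[0]
        elif second > high and first < low:
            selected[function] = labels[1]
    return selected


# Toy data: ten hypothetical MAGs per side of the Isthmus.
groups = {f"car_{i}": "CAR" for i in range(10)}
groups.update({f"tep_{i}": "TEP" for i in range(10)})
presence = {}
for mag, side in groups.items():
    functions = {"sulfur oxidation"}  # shared core function
    functions.add("nifH" if side == "CAR" else "ETF dehydrogenase")
    presence[mag] = functions

specific = group_specific_functions(presence, groups)
```

In this toy example the shared core function is not reported, because it is present in both groups, while each side-specific annotation is assigned to its group.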

The evolutionary history of nitrogen fixation potential in lucinid symbionts is explained by horizontal gene transfer

Placing Isthmus symbionts in a global symbiont phylogeny reveals an intermittent distribution of nitrogen fixation genes.

To further investigate the distribution of the nitrogen fixation pathway in lucinid symbionts, we generated a phylogenomic tree combining our newly obtained MAGs with those used in the most recent global tree of lucinid symbionts [18]. In addition, we added 23 new high-quality MAGs from both the Ca. Thiodiazotropha and Sedimenticola genera (S2 Table). These MAGs originated from specimens of 11 lucinid species collected at various shallow-water, low-latitude sites, chosen to include other oligotrophic and nutrient-rich sites beyond the Isthmus (Fig 3A and S1 Table). Seven MAGs clustered within Ca. Sedimenticola endophacoides and three MAGs formed a Sedimenticola sister clade composed exclusively of symbionts from lucinids of the Pegophyseminae subfamily, which will be provisionally referred to as “PEGO”. Others clustered with Ca. T. boucheti, Ca. T. fergusoni and Ca. T. endolucinida. Three MAGs did not form clades with any other MAGs (Austriella corrugata, Ctena bella Hawaii 21, Ctena bella French Polynesia symbionts), and thus represent novel species. The genomic potential for nitrogen fixation was widely, albeit heterogeneously, distributed across the lucinid symbiont tree and found in both Sedimenticola and Ca. Thiodiazotropha symbionts (Fig 3B). The presence/absence of nitrogen fixation genes varied even within the two species clades Ca. S. endophacoides and Ca. T. endolucinida. For example, despite their genetic similarity (>95% ANI, S1 Dataset), nitrogen fixation genes were present in the MAGs of the Ca. Thiodiazotropha endolucinida lineage associated with Ctena bella from Hawaii (Ca. T. endolucinida HAW) but not in the Ca. Thiodiazotropha endolucinida MAGs retrieved from Lucina adansonia from Cape Verde (Fig 3 and S6 Dataset). Similarly, Ca. Sedimenticola endophacoides associated with Phacoides pectinatus from Florida lacked nitrogen fixation genes, even though this ability was present in closely related lineages of Ca. Sedimenticola endophacoides associated with P. pectinatus from Guadeloupe (>96% ANI, S1 and S6 Datasets) and Panama (>98% ANI, S1 and S6 Datasets).

Fig 3. Placement of new MAGs in the global lucinid symbiont tree reveals discontinuous distribution of nitrogen fixation genes, even within species-level clades.

A Geographic origins of the MAGs included in this analysis. Points were colored based on the clade they belong to. Different shapes indicate whether the samples were obtained in this study (circles) or in previous studies (triangles). The magnified map of the Isthmus—where the density of sampling sites was high—shows these sampling sites in detail. The map was generated with data from Natural Earth (http://www.naturalearthdata.com/) using the R package "rnaturalearth" (v0.3.2) (https://github.com/ropensci/rnaturalearth). B Maximum likelihood phylogenomic tree inferred from GTDB’s multiple sequence alignment using the best fit model Q.plant+F+I+G4. The names of clades found in the host sister species across the Isthmus of Panama are in turquoise (CAR; Caribbean) or purple (TEP; Tropical Eastern Pacific) font, while globally-distributed symbiont clades are highlighted in gray. Previously described clades were collapsed and annotated in the same way as in the most recent phylogenetic analysis of lucinid symbionts [18] (S2 Table). New clades were collapsed based on ANI (>95%) and/or location. Clades with the potential for nitrogen fixation are indicated with a star and colored symbols match the symbology of the map. Black squares indicate UFB values above 95% and SH-aLRT values above 80%.

https://doi.org/10.1371/journal.pgen.1011295.g003

Multiple transfer events account for the sporadic distribution of nitrogen fixation genes amongst the lucinid symbionts.

After dereplication, we analyzed a total of 242 MAGs from the order Chromatiales (including Ca. Thiodiazotropha and Sedimenticola genera), from which we identified 139 complete nifHDKT operons. Trees constructed for each individual nif gene had consistent topologies for the strongly supported lucinid symbiont nif clades, suggesting that nifH, nifD, nifK and nifT are co-inherited (S1 Fig). The resulting nifHDKT tree revealed that all the lucinid symbiont nifHDKT sequences are closer to each other than to any non-symbiont relative (Figs S2 and 4A). Moreover, our analysis revealed the presence of three distinct major symbiont nifHDKT clades, which we have denoted as Clade A, Clade B, and Clade C (Figs S2 and 4A). Clade A comprises genes from Ca. Sedimenticola endophacoides. Clade B consists of genes from Ca. Thiodiazotropha endolucinida, Ctena4, Monit1 (Monitilora ramsayi symbionts), and Pegophyseminae symbionts “PEGO”. Lastly, Clade C encompasses genes from Ca. Thiodiazotropha taylori, lotti, weberae, and fergusoni, as well as Ctena2 and Ctena3 (Fig 4A and 4B). The topology of these clades was inconsistent with the phylogenomic tree (Figs 3B and 4A). Although the Ca. T. endolucinida, Ctena4, and Monit1 symbionts belong to the genus Ca. Thiodiazotropha, the nifHDKT genes from these symbionts formed a clade with the nifHDKT genes of the “PEGO”, which belongs to the Sedimenticola genus (Fig 4A).

Fig 4. Horizontal gene transfer events among lucinid symbionts explain incongruence between nifHDKT phylogeny and phylogenomic tree.

A The symbiont nifHDKT genes formed three major clades that were incongruent with the phylogenetic relationships of the symbionts. Pruned maximum likelihood phylogenetic tree depicting the clades of lucinid symbionts’ nifHDKT genes. The tree was inferred using the best fit model GTR+F+I+R6 from a concatenated alignment of these genes. B The last common ancestor of the Ca. Thiodiazotropha genus lacked nitrogen fixation genes, which were subsequently independently acquired through horizontal gene transfer by different symbiont lineages. Ancestral reconstruction of the presence/absence of nifHDKT (orange: nifHDKT present; blue: nifHDKT absent; black: ambiguous state) mapped onto a pruned cladogram based on a maximum likelihood phylogenomic tree inferred using the best fit model Q.insect+F+R9 from GTDB’s multiple sequence alignment of dereplicated genomes (at 99.5% ANI). Symbiont clades are annotated as in Fig 3B and the clades of their corresponding nifHDKT genes are annotated according to A. Robust horizontal gene transfer events inferred from the reconciliation of the gene tree with the phylogenomic tree were superimposed and are depicted as pink arrows.

https://doi.org/10.1371/journal.pgen.1011295.g004

We investigated the ancestral states of nifHDKT presence or absence to understand the evolutionary processes that could explain the incongruence between the symbiont phylogenomic tree and the nifHDKT tree (Fig 4B). According to this analysis, the last common ancestor (LCA) of the Sedimenticolaceae, as well as the LCAs of both Ca. Thiodiazotropha and Sedimenticola, did not possess the potential to fix nitrogen (Figs S4 and 4B); independent gene gain or loss events therefore explain the sporadic distribution of this metabolic function across the symbiont tree. The LCA of Ca. Sedimenticola endophacoides was inferred to possess nifHDKT genes, indicating a subsequent loss of the nitrogen fixation potential in the Florida symbiont lineage. Conversely, we identified three well-supported instances of nifHDKT horizontal gene transfer (Figs 4B and S3) to and from the Ca. Thiodiazotropha endolucinida clades. The LCA of the Ca. T. endolucinida clade lacked nifHDKT genes, but the genes were subsequently acquired from an ancestral node of the Ca. Sedimenticola “PEGO” lineage before the Ca. T. endolucinida HAW and CAR lineages diverged (Figs 4B and S3). Additionally, the LCA of the Ctena4 lineage acquired the ability to fix nitrogen from an ancestral node of the Ca. T. endolucinida Hawaii lineage, while a Monitilora ramsayi symbiont (Monit1) acquired the nifHDKT genes from an ancestral node of Ca. T. endolucinida CAR and HAW (Figs S3 and 4B). The ancestral reconstruction and gene reconciliation analyses were, however, unable to resolve the patterns of nitrogenase gene loss and gain across the deep-branching nodes of the other Ca. Thiodiazotropha symbiont clades (e.g. the LCA of Ca. T. taylori, Ca. T. lotti and Ca. T. weberae).

Homologous recombination contributes to cohesion of Ca. T. endolucinida populations around the world

To investigate whether homologous recombination might play a role in maintaining the genetic connectivity of the Ca. T. endolucinida populations from different geographic locations, we measured the relative rates of recombination to mutation events from core genome alignments of all the Ca. T. endolucinida lineages. The R/θ of the Ca. T. endolucinida core genome (a 2,710,509 base pair (bp) alignment) was 0.0999 while the r/m ratio was 0.589 (Table 2), which is significantly higher than the value previously measured for this species (0.082) [19]. This notable difference could be explained by a lack of resolution in the earlier data, as Osvatic and colleagues analyzed Ca. T. endolucinida MAGs from a single site (Bocas del Toro, Panama) [19]. Furthermore, we explored how these rates of homologous recombination compared with those observed in Ca. T. gloverae, a globally distributed symbiont inhabiting deep-water or temperate environments, as well as the rates previously reported for Ca. T. taylori, a symbiont found in tropical oligotrophic shallow-water environments around the world [18,19]. The R/θ of the Ca. T. gloverae core genome (1,619,656 bp) was 0.121 while the r/m ratio was 1.138 (Table 2). The ratio of recombination to mutation events was therefore lowest in Ca. T. endolucinida and highest in Ca. T. gloverae among the three globally distributed lucinid symbiont species (Table 2).
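
The two ratios reported above measure different things: R/θ compares how often recombination and mutation events occur, whereas r/m compares how many substitutions each process introduces. As a rough sketch, and assuming the parameterisation used by ClonalFrameML (an assumption, since this section does not name the software), the two are related by r/m = (R/θ) × δ × ν, where δ is the mean length of imported fragments and ν is the divergence of the imported DNA. The helper below is hypothetical and simply back-calculates the product δ × ν implied by the reported ratios.

```python
def implied_import_effect(r_over_theta, r_over_m):
    """Return delta * nu implied by r/m = (R/theta) * delta * nu.

    Assumes the ClonalFrameML-style parameterisation; the relation is
    not stated in the text and is used here for illustration only.
    """
    if r_over_theta <= 0:
        raise ValueError("R/theta must be positive")
    return r_over_m / r_over_theta


# (R/theta, r/m) pairs as reported in the text for Table 2.
reported = {
    "Ca. T. endolucinida": (0.0999, 0.589),
    "Ca. T. gloverae": (0.121, 1.138),
}
implied = {name: implied_import_effect(*ratios) for name, ratios in reported.items()}
```

Under this assumption, each recombination event would on average introduce more substitutions in Ca. T. gloverae than in Ca. T. endolucinida, which is one way an r/m value above 1 can arise even when R/θ is well below 1.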

Table 2. Ratios of the rates of homologous recombination to mutation events in globally distributed lucinid symbionts.

Geographic distribution of the clades can be found in S5 Fig.

https://doi.org/10.1371/journal.pgen.1011295.t002

Discussion

Chemosymbiotic sister species separated by the Isthmus of Panama reveal symbiont adaptation to changing environments

Studies on the adaptations enabling animals to cope with environmental changes resulting from the closure of the Isthmus of Panama (reviewed in [8]) have largely overlooked the role of host-associated microbes [1]. Our investigation focused on understanding how environmental conditions on either side of the Isthmus of Panama have influenced the distribution, diversity and metabolic functions of symbionts associated with lucinid sister species separated by the closure of the Isthmus. Codakia and Ctena sister species from either side of the Isthmus hosted distinct clades of symbionts from the genus Ca. Thiodiazotropha. Despite the absence of symbiont clade overlap across the Isthmus, we observed a lack of specificity between hosts and symbionts, which was evident in the incongruence of their respective phylogenetic relationships. We observed multiple instances where different host species of the Codakia and Ctena genera on the same side of the Isthmus share the same symbiont groups. This is consistent with horizontal symbiont acquisition from the environment and indicates that the environment is a key factor influencing symbiont selection and distribution [19,29,30].

We compared the metabolic potential of the symbionts from either side of the Isthmus to investigate how the vastly different environmental conditions of the Caribbean Sea (CAR) and TEP have shaped the evolution of lucinid symbionts and their metabolic capabilities. We discovered that nitrogen fixation genes were encoded in the MAGs of all Caribbean symbionts but absent in all the MAGs of symbionts from the TEP. Similarly, we observed that the capacity for assimilatory nitrate reduction was ubiquitous in symbionts from the Caribbean, but not in those from the TEP (Fig 2). Assimilatory nitrate reductases are similarly absent in many deep-water symbiont lineages, which also lack nitrogen fixation genes [18], suggesting that these two capabilities may be linked. The coastal waters of the TEP are frequently enriched with nutrients due to seasonal upwelling, leading to nitrate levels that are roughly ten times higher than those found in seagrass beds in the Caribbean [31,32]. Hence, the absence of nitrogen fixation genes in the symbiont MAGs from the nitrogen-rich TEP region is consistent with the hypothesis that lucinid symbiont diazotrophy has evolved as an adaptation to life in nitrogen-poor oligotrophic habitats like tropical coral reefs and seagrass beds [33]. Although the TEP symbionts did not possess the capacity to fix nitrogen, the MAGs of all three TEP symbiont lineages encoded unique accessory metabolic capabilities that were lacking in the Caribbean symbionts. Specifically, TEP symbionts had the genetic potential for synthesizing gammapolyglutamate (Table 1), a storage compound produced by bacteria during nutrient limitation [34]. Electron-transferring-flavoprotein (ETF) dehydrogenase genes were also enriched in the MAGs from the TEP symbionts (Table 1); this enzyme is upregulated in Pseudomonas aeruginosa and Pseudomonas syringae in response to low temperatures [35,36] and in Neisseria gonorrhoeae under anaerobic conditions [37]. 
The TEP environment is characterized by drastic seasonal changes in physical conditions and nutrient availability. For example, nitrate concentrations in the upper layer of water differ by approximately one order of magnitude between the wet and dry seasons [38–41]. It is intriguing to speculate that the gammapolyglutamate synthesis and ETF dehydrogenase genes are beneficial to the TEP symbionts during the seasonal environmental changes typical of upwelling regions, which include changes in nutrient levels, colder temperatures, or reduced oxygen levels [38–41].

Lucinid symbionts have convergently evolved the ability to fix nitrogen on multiple occasions

We reconstructed the phylogenetic relationships of the nifHDKT genes to gain insights into the evolution of nitrogen fixation in the lucinid symbionts and the factors underlying the sporadic distribution of this metabolic capability across the lucinid symbiont tree. While clades A and C of the nifHDKT gene tree mirrored the phylogenetic relationships of the symbionts, the incongruence of clade B with symbiont tree topology suggests the nifHDKT genes of the Ca. T. endolucinida, Monitilora ramsayi symbionts (Monit1), and Ctena4 lineages have not co-evolved and/or co-diversified with the single-copy core genes in their respective genomes (Fig 4A). To further investigate this incongruence, we inferred the ancestral states and horizontal gene transfer events of the nifHDKT genes throughout the evolution of lucinid symbionts. This analysis indicated that diazotrophy was most likely not an ancestral trait of either the Sedimenticola or Ca. Thiodiazotropha genera, but was acquired independently by different symbiont lineages that inhabit nutrient-poor environments (Figs S4 and 4B). For example, nifHDKT genes were absent in the last common ancestor (LCA) of the Ca. T. endolucinida species clade (Fig 4B). Consequently, the TEP, Cape Verde L. adansoni and Cardiolucina cf. quadrata lineages, which either originate from nutrient-rich upwelling regions or deep waters (S2 Table), are incapable of fixing nitrogen, possibly due to the absence of selection pressure that would drive the acquisition and maintenance of this function. Our analysis further predicted that nifHDKT genes were later acquired by horizontal gene transfer from a Sedimenticola bacterium to the LCA of the Ca. T. endolucinida HAW and Ca. T. endolucinida CAR lineages. The source of this transfer could have been either an ancestral population of the Pegophyseminae symbionts “PEGO” or an unsampled closely related lineage. 
Our analyses also indicated a well-supported transfer event from an ancestral population of the Ca. T. endolucinida HAW clade to the Ctena4 clade, which comprises samples from the Florida Keys and the Caribbean (Fig 4B). These findings indicate that both these clades have convergently acquired nitrogen fixation capabilities in an oligotrophic environment, and further suggest that the horizontal transfer of nitrogen fixation genes might be a major factor enabling lucinid symbiont adaptation to nitrogen-poor conditions. The third well-supported transfer event was inferred from an ancestral node of the clade consisting of Ca. T. endolucinida CAR and HAW to Monit1. The Monit1 clade and the nif-lacking Monit2 clade each consisted of a single Monitilora ramsayi symbiont MAG. Given that both MAGs were obtained from samples of the same host species and location (Queensland, Australia) (Fig 3B), and no additional metadata is available, further investigation is required to understand the evolution of nitrogen fixation in the Monitilora ramsayi symbiosis. The acquisition of nifHDKT genes by Ca. T. endolucinida from a Sedimenticola lineage, rather than from another Ca. Thiodiazotropha lineage, is unexpected, given that transfers are more likely within the same clade as gene flow tends to occur more frequently among genetically similar bacteria [42]. A possible explanation for this could be the higher abundance of Sedimenticola OTUs in sediment bacterial communities [43] compared to Ca. Thiodiazotropha-like OTUs [44]. This disparity in abundance could reduce the frequency of physical encounters between the different Ca. Thiodiazotropha lineages, thereby reducing the likelihood of genetic exchange, while increasing the likelihood of encounters with members of the Sedimenticola lineage. 
However, the abundance of free-living “PEGO” symbionts and their close relatives has not been measured, and further investigation of the free-living microbial communities in lucinid habitats is required to gain a more comprehensive understanding of these dynamics. Finally, the ambiguous ancestral states of the deeper branching nodes of the other Ca. Thiodiazotropha clades hinder our interpretation of gene loss and/or reacquisition events in these symbiont lineages. Filling the gaps in the lucinid symbiont phylogeny by including novel symbiont and/or free-living relative genomes may resolve the uncertainties in the ancestral state reconstruction.

Nitrogen availability and HGT drive the ecological diversification of lucinid symbionts

The genes for fixing nitrogen were ubiquitous in the genomes of Ca. T. endolucinida clades from the oligotrophic waters of the Caribbean and Hawaii but absent in the TEP and Cape Verde clades originating from upwelling regions characterized by sporadically high levels of bioavailable nitrogen [38,45]. Our findings suggest that the ability to fix nitrogen, acquired through HGT of the nif operon, likely played a crucial role in altering the ecological niche of the last common ancestor (LCA) of the Ca. T. endolucinida CAR and HAW lineages, thus shaping their evolutionary trajectory and divergence. However, all Ca. T. endolucinida lineages exceed the 95% ANI threshold widely used to delimit bacterial species, which suggests that these clades, in spite of their essential differences in nitrogen metabolism, represent a single species [24–26]. While ANI by itself does not provide evidence of gene flow between the geographically distinct populations, reconstruction of recombination events within the Ca. T. endolucinida species clade further showed that the ratio of the effects of recombination and mutation (0.6) surpassed the theoretical threshold (0.25) necessary to hinder population divergence [46,47]. This suggests there are low barriers to gene flow between geographically separated populations of Ca. T. endolucinida and that homologous recombination contributes to the genetic cohesion of the clade. Nevertheless, barriers to genetic exchange or sexual isolation are not prerequisites for ecological and genetic divergence [48]. Indeed, we observed geographic and phylogenetic differentiation of the Ca. T. endolucinida populations (Figs 3B and S5) without reproductive isolation (Table 2). A similar phenomenon has been described for closely related populations of Synechococcus [49] and Vibrio [50], for which there is evidence of ecological and genetic divergence, even though these populations also exhibit elevated rates of recombination. 
Our findings align with these observations, suggesting that during the microbial speciation process, ecological divergence is more likely to precede the emergence of genetic barriers that eventually lead to sexual isolation [51].
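The comparison against the theoretical threshold discussed above can be written out as a small check. This is only an illustrative sketch: the r/m values are those reported in the text, the 0.25 threshold is the value cited from [46,47], and the function and variable names are hypothetical.

```python
# r/m values as reported in the text (Table 2); names below are illustrative only.
REPORTED_R_OVER_M = {
    "Ca. T. endolucinida": 0.589,
    "Ca. T. gloverae": 1.138,
}

# Theoretical r/m threshold cited in the text as sufficient to hinder divergence.
COHESION_THRESHOLD = 0.25


def exceeds_cohesion_threshold(r_over_m, threshold=COHESION_THRESHOLD):
    """Return True if recombination is strong enough, relative to mutation,
    to counteract population divergence under the cited threshold."""
    return r_over_m > threshold


summary = {
    species: exceeds_cohesion_threshold(value)
    for species, value in REPORTED_R_OVER_M.items()
}

for species, cohesive in summary.items():
    print(f"{species}: r/m above threshold = {cohesive}")
```

Both reported values lie above the threshold, which is the basis for the statement that recombination contributes to the cohesion of these species.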

The discovery of Ca. T. endolucinida populations in the TEP, Hawaii and Cape Verde substantially increases the distribution range of Ca. T. endolucinida, which, with fewer samples, was thought to be restricted to the Caribbean, and also makes this species the third globally distributed lucinid symbiont from the genus Ca. Thiodiazotropha after Ca. T. taylori [19] and Ca. T. gloverae [18] (S5 Fig). It is also interesting to note that the r/m ratios within Ca. T. gloverae are nearly double those observed for both Ca. T. taylori and Ca. T. endolucinida. Furthermore, in contrast to Ca. T. endolucinida, there were no major differences in the ability to fix nitrogen across different lineages within Ca. T. taylori and Ca. T. gloverae [18,19]. These differences in diversification patterns within the Ca. T. taylori, Ca. T. endolucinida, and Ca. T. gloverae symbiont species, together with their global distribution ranges, present an interesting opportunity to study the origins of genetically cohesive units with species-like properties and characterize the ecological factors underlying bacterial diversification. For example, further studies on Ca. T. endolucinida could provide new insights into how acquiring new genes can drive ecologically important changes in tolerances to physical and chemical conditions (ETF dehydrogenase) or changes in resources consumed (nitrogen metabolism). Conversely, the absence of the ability to fix nitrogen in the Ca. Sedimenticola endophacoides clade associated with P. pectinatus from a region in Florida that experiences seasonal upwelling events [52–54] presents the opportunity to study how a change in access to abundant nutrient resources could drive gene loss and eventually lead to diversification of this clade.

Our findings highlight the complex interplay between the environment and gene exchange that shapes the evolution and diversification of bacterial symbiont populations. The remarkable flexibility in the partnerships between lucinid clams and diverse symbiont candidates, all sharing the same core metabolic capabilities but different accessory genes, could enhance the resilience of these associations to changing environmental conditions. The capacity to acquire novel metabolic capabilities in response to different nutrient conditions could have important implications for how animal-bacteria symbioses might respond to anthropogenic changes in the environment, such as nutrient loading due to changes in land use. Future investigations should focus on whether location-specific metabolic traits, such as nitrogen fixation, directly influence host fitness or whether symbiont selection is predominantly determined by partner availability.

Methods

Sample collection

Fresh samples.

Live clams were collected with a hand-held trowel in seagrass beds (except for Phacoides pectinatus, which was collected in mangrove mud) at ~30 cm sediment depth. A colander was used to sieve out the sediment and separate the clams. Gills were dissected directly in the field with a razor blade upon returning to the beach, preserved in DNA/RNA Shield (Cat. No. R1100-250; ZymoBiomics, USA) according to the manufacturer’s instructions and kept at room temperature during travel and at -20°C for long-term storage. Specimens of Anodontia alba, Codakia orbicularis, Codakia distinguenda, Clathrolucina costata, Radiolucina jessicae, Ctena sp. COSTE, P. pectinatus and Ctena cf. galapagana from Costa Rica and Panama were collected during the #istmobiome Project sampling campaign (https://istmobiome.net) in 2018 and 2019. Specimens from Guadeloupe were collected by hand in 2019 from seagrass beds of Thalassia testudinum (Ctena imbricatula, C. orbicularis) and from mangrove mud (P. pectinatus) (S1 Table).

Museum samples.

Specimens of C. distinguenda, Lucinisca fenestrata, Lucina adansoni, Ctena bella, Euanodontia ovum, Cryptophysema vesicula and Austriella corrugata were acquired from the collections of the Florida Natural History Museum (FLMNH), Gainesville, FL, USA, the California Academy of Sciences in San Francisco, CA, USA and the Natural History Museum (NHM) London, UK (S1 Table). Access was granted and organized by Dr. John Taylor (NHM), Dr. Gustav Paulay and Dr. Amanda Bemis (FLMNH), and Dr. Elizabeth Kools and Dr. Christina Piotrowski (California Academy of Sciences).

DNA extraction and sequencing

DNA was extracted from gill tissues using the Qiagen DNeasy Blood and Tissue kit (Cat. No. 69506; Qiagen, USA) following the manufacturer’s instructions. Proteinase K digestion of the gills was performed for 48 hours at 56°C. Extracted DNA was treated with RNase A (Cat. No. 19101; Qiagen, USA) for 30 minutes at 25°C. Paired-end Illumina sequencing produced reads of 150 bp or 250 bp in length (S1 Table). Illumina sequencing generated a minimum of 3,000,000 reads per sample. Samples ctemesantah004, cteimcahuit014, and cteimbast119 were also sequenced using PacBio Sequel II long-read technology. One cell was sequenced for each of the three gill samples. For these three samples, only PacBio bins were included for downstream analysis.

Read quality filtering, assembly, and binning

Illumina.

Illumina read libraries were trimmed, PhiX contamination filtered, and quality checked using BBMap v37.61’s BBDuk feature [55]. Individual read libraries were assembled using SPAdes v3.13.1 [56]. The assembly statistics were assessed with the BBTools (https://sourceforge.net/projects/bbmap/) script “stats.sh”. The resulting metagenomic assembly scaffolds were binned with a combination of anvi’o v6.1 [57,58] using CONCOCT v1.1.0 binning [59], and metabat v2.15 [60]. The bins were then compared using dRep v2.4.2’s dereplicate workflow [61].

PacBio.

Samples ctemesantah004, cteimcahuit014, and cteimbast119, for which Illumina MAGs of the symbiont lineages Ca. T. endolucinida TEP, Ca. T. taylori, and Ca. T. fergusoni had been respectively recovered, were selected for long-read sequencing with the goal of obtaining circularized symbiont MAGs of some of the most prevalent clades present across the Isthmus. HiFi reads were produced using circular consensus sequencing (CCS) mode on the PacBio long-read Sequel II system (S1 Table). BBMap v37.61’s Seal feature was used to split reads into host and symbiont, based on kmer distributions [55]. A minimum kmer fraction of 0.5 and an input quality offset of 31 eliminating the hamming distance were applied for splitting the reads. MAGs binned out from Illumina paired-end sequencing of 250 bp were used as a reference to split host and symbiont reads. Duplicate reads were subsequently removed using BBMap’s reformat.sh script. Flye v2.9.2, a de novo assembler for single-molecule sequencing reads [62], was used with default parameters and the --pacbio-hifi flag to assemble PacBio reads and bin out symbiont genomes. Each Flye assembly resulted in one fully circularized contig that was extracted for downstream analyses (S1 and S2 Tables).

Bin quality check and taxonomic assignment.

Bins obtained with both Illumina and PacBio sequencing were checked for completion using CheckM2 v1.0.1 [63] and manually refined using ‘anvi-refine’. Bins that were determined to be 90% or more complete and less than 10% contaminated post refinement were considered to be high-quality MAGs [64]. Only high-quality MAGs classified as Gammaproteobacteria by the GTDB-Tk v2.1.1 (Genome Taxonomy Database Toolkit) classify workflow [65–68] were used for further analyses. MAG depth and breadth of coverage statistics were obtained using CoverM v0.4.0 (https://github.com/wwood/CoverM) by mapping the metagenomes to their corresponding MAGs.
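The selection rule described above (at least 90% complete, less than 10% contaminated, classified as Gammaproteobacteria) can be sketched as a simple filter. This is a minimal illustration, not the pipeline used in the study: the `Bin` record, the function names and the example bins are hypothetical, and the taxonomy field is assumed to be a GTDB-style semicolon-separated lineage string.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    """Hypothetical record holding per-bin quality and taxonomy values."""
    name: str
    completeness: float   # percent, e.g. as estimated by a tool such as CheckM2
    contamination: float  # percent
    taxonomy: str         # assumed GTDB-style lineage, ranks separated by ';'


def is_high_quality(b, min_completeness=90.0, max_contamination=10.0):
    # >= 90% complete and strictly < 10% contaminated, as stated in the text.
    return b.completeness >= min_completeness and b.contamination < max_contamination


def is_gammaproteobacteria(b):
    # Look for the class-level rank in the lineage string.
    return "c__Gammaproteobacteria" in b.taxonomy.split(";")


def select_mags(bins):
    """Keep only high-quality bins classified as Gammaproteobacteria."""
    return [b for b in bins if is_high_quality(b) and is_gammaproteobacteria(b)]


# Made-up example bins, for illustration only.
example_bins = [
    Bin("bin_a", 98.2, 1.1, "d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria"),
    Bin("bin_b", 89.9, 0.5, "d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria"),
    Bin("bin_c", 95.0, 10.0, "d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria"),
    Bin("bin_d", 99.0, 0.2, "d__Bacteria;p__Bacteroidota;c__Bacteroidia"),
]
selected = [b.name for b in select_mags(example_bins)]
print(selected)
```

In this toy input only `bin_a` passes: `bin_b` falls just short on completeness, `bin_c` sits exactly at the contamination limit, and `bin_d` belongs to a different class.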

MAG annotation and metabolic reconstruction

High-quality Gammaproteobacteria MAGs recovered in this study and lucinid symbiont high-quality MAGs published in Osvatic et al. 2023 [18] were functionally annotated through DRAM v1.4.6 [69] using the Kyoto Encyclopedia of Genes and Genomes (KEGG), UniRef90 and PFAM databases. The anvi’o platform (development version) [57] was also used to infer metabolic pathways encoded in the MAGs as an additional strategy for functional annotation. The MAGs were formatted using “anvi-script-reformat-fasta”, after which contigs-databases were generated by “anvi-gen-contigs-database”. Functional annotations were assigned to the open reading frames of the contigs-databases using the KOfam HMM database of KEGG orthologs [70,71] with “anvi-run-kegg-kofams”. The “anvi-estimate-metabolism” script (with --module-completion-threshold 0.9) was used to estimate the presence of complete KEGG modules in each contigs-database.

Phylogenetics, relatedness and functional potential of symbionts across the Isthmus of Panama

Symbiont phylogenomics.

A lucinid symbiont tree was inferred from an alignment containing all the high-quality Gammaproteobacteria MAGs recovered in this study, and all publicly available lucinid symbiont MAGs included in the phylogenomic tree published in Osvatic et al. 2023 [18], with Allochromatium vinosum (GCA_000025485) included as an outgroup. The concatenated alignment of 120 conserved bacterial marker genes from the MAGs was obtained with the GTDB-Tk v2.1.1 classify workflow [65–68]. A maximum likelihood phylogenomic tree was inferred using IQ-Tree v2.2.2.1 [72–74] with auto substitution model detection, 1,000 ultrafast bootstrap (UFB) replicates and 1,000 samples for SH-aLRT branch testing. Nodes with values of UFB greater than or equal to 95% and of SH-aLRT greater than or equal to 80% were considered to be strongly supported. The tree was visualized, rooted, and annotated with Interactive Tree Of Life (iTOL) [75]. Symbiont clades were collapsed, and lineages were annotated according to the tree published in Osvatic et al. 2023 [18]. Previously undescribed clades were collapsed based on an ANI threshold of 95% and/or sampling location. Additionally, the tree was pruned using Newick Utils v1.6 [76], leaving only leaves which represented symbionts associated with sets of lucinid sister species present across the Isthmus of Panama: Codakia orbicularis (Caribbean) and Codakia distinguenda (TEP); and Ctena imbricatula/Ctena sp. (Caribbean) and Ctena galapagana (TEP). Monitilora sp. symbionts were also left in the pruned tree as an outgroup.
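The node-support criterion above can be expressed as a short check on branch labels. The sketch assumes that each internal node label takes the form "SH-aLRT/UFBoot" (for example "98.5/100"), which is how IQ-TREE writes supports when both tests are requested; the function name is hypothetical.

```python
def is_strongly_supported(label, min_alrt=80.0, min_ufb=95.0):
    """Return True if a node label of the assumed form 'SH-aLRT/UFBoot'
    meets both thresholds stated in the text (SH-aLRT >= 80, UFB >= 95)."""
    alrt_text, ufb_text = label.split("/")
    return float(alrt_text) >= min_alrt and float(ufb_text) >= min_ufb


# Made-up labels, for illustration only.
for label in ["98.5/100", "80/95", "79.9/100", "99/94"]:
    print(label, is_strongly_supported(label))
```

Requiring both values to pass is the stricter of the two possible readings and matches the wording of the criterion, which names both thresholds together.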

ANI values were calculated with fastANI v1.1 [25]. To visualize the levels of relatedness between the clades present across the Isthmus, the resulting values were used to build a heatmap with the R package pheatmap v1.0.12 [77].
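One way to picture how pairwise ANI values translate into groups at the 95% threshold is single-linkage grouping, sketched below. This is an assumption made for illustration: the text does not state which grouping procedure was used, and the function name, the input tuples and the MAG names are all hypothetical.

```python
def group_by_ani(pairs, threshold=95.0):
    """Single-linkage grouping of genomes from (genome_a, genome_b, ani) tuples.

    Genomes joined by any chain of pairs with ANI >= threshold end up in the
    same group. Uses a small union-find structure.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, ani in pairs:
        root_a, root_b = find(a), find(b)
        if ani >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in groups.values())


# Made-up ANI values, for illustration only.
example_pairs = [
    ("mag_A", "mag_B", 97.1),
    ("mag_B", "mag_C", 95.4),
    ("mag_C", "mag_D", 88.0),
]
ani_groups = group_by_ani(example_pairs)
print(ani_groups)
```

Single linkage is permissive: two genomes below 95% ANI to each other can still share a group if a third genome bridges them, so a stricter rule (for example complete linkage) may be preferable when groups need to be tight.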

Functional comparison across the Isthmus.

Anvi’o’s pangenomic workflow [57,78] was used for comparing the functional potential of the symbionts across the Isthmus, using anvi’o’s development version. A pangenome database was computed from all the annotated contigs-databases with “anvi-pan-genome” and the --mcl-inflation parameter set to 6. The functional enrichment analysis [79] was performed on the pangenome to identify genes enriched in the MAGs of symbionts from either side of the Isthmus. For this, MAGs were classified according to their geographic origin (TEP or Caribbean). The enrichment of the modules was computed on the annotated contigs-databases (see section MAG annotation and metabolic reconstruction) by using “anvi-compute-metabolic-enrichment”. Genes or modules were deemed differentially enriched if they were present in over 90% of the MAGs in one group and in less than 10% of the MAGs in the other group, while also being present across all clades belonging to their respective groups. The presence and absence of enriched metabolic pathways was plotted on the phylogenomic tree using the Interactive Tree Of Life (iTOL) [75] binary dataset template. To confirm the lack of nitrogen fixation potential in the genomes of the TEP symbionts and ensure that the genes were not missed in the binning process, proteins were predicted directly from both Caribbean and TEP metagenome assemblies with Prodigal v2.6.3 [80]. Subsequently, a search was conducted using hmmsearch v3.1b2 [81] with an expectation value (E-value) threshold of 1e-20 against the Pfam model Fer4_NifH (PF00142), which is accessible at https://www.ebi.ac.uk/interpro/entry/pfam/PF00142/curation/. In addition, raw reads from the Caribbean and TEP metagenomes were mapped with bowtie2 v2.5.3 [82] to nifHDKT genes extracted from Ca. T. fergusoni, Ca. T. taylori, and Ca. T. endolucinida CAR MAGs, and were subsequently quantified.
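The enrichment criterion described above (present in over 90% of one group, in less than 10% of the other, and in every clade of the enriched group) can be sketched as follows. This is an illustrative reading of the rule, not the anvi’o implementation; the function name, the dictionary layout and the toy MAG and clade names are hypothetical.

```python
def is_differentially_enriched(presence, group_of, clade_of, target_group,
                               high=0.9, low=0.1):
    """Apply the three-part rule from the text to one gene or module.

    presence: MAG name -> True/False (gene or module detected)
    group_of: MAG name -> geographic group, e.g. 'TEP' or 'Caribbean'
    clade_of: MAG name -> symbiont clade label
    """
    in_group = [m for m in presence if group_of[m] == target_group]
    out_group = [m for m in presence if group_of[m] != target_group]
    if not in_group or not out_group:
        return False

    frac_in = sum(presence[m] for m in in_group) / len(in_group)
    frac_out = sum(presence[m] for m in out_group) / len(out_group)

    # The feature must occur in every clade of the enriched group.
    clades_in_group = {clade_of[m] for m in in_group}
    clades_with_feature = {clade_of[m] for m in in_group if presence[m]}

    return (frac_in > high and frac_out < low
            and clades_with_feature == clades_in_group)


# Toy data, for illustration only.
presence = {"tep1": True, "tep2": True, "tep3": True, "car1": False, "car2": False}
group_of = {"tep1": "TEP", "tep2": "TEP", "tep3": "TEP",
            "car1": "Caribbean", "car2": "Caribbean"}
clade_of = {"tep1": "cladeX", "tep2": "cladeY", "tep3": "cladeY",
            "car1": "cladeZ", "car2": "cladeZ"}

enriched_in_tep = is_differentially_enriched(presence, group_of, clade_of, "TEP")
enriched_in_car = is_differentially_enriched(presence, group_of, clade_of, "Caribbean")
print(enriched_in_tep, enriched_in_car)
```

The clade condition matters when groups are unevenly sampled: a feature could pass the 90% cut-off on the strength of one heavily sampled clade while being absent from a clade represented by a single MAG.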

Ancestral reconstruction and phylogenetic reconciliation—resolving phylogenetic conflicts

For the reconstruction of ancestral states and the phylogenetic reconciliation, all genomes and MAGs of the order Chromatiales available in GTDB were included in the analysis. Genomes classified as Desulfuromonas in GTDB were also included as an outgroup. The GTDB data was accessed on 2024-02-12. The diversity of MAGs was reduced by following the dRep v3.2.2 [61] dereplicate workflow with the S_ANI parameter set to 0.995 and a phylogenomic tree was inferred from the dereplicated MAGs. The methods and parameters used to build the alignment and infer the tree were the same as described in the section Symbiont phylogenomics and the resulting tree was rooted with Gotree v0.4.4 [83]. Ancestral states of presence/absence of the nitrogen fixation potential were inferred with PastML [84], applying the marginal posterior probabilities approximation (MPPA). For visualization purposes, the ancestral reconstruction was pruned using Newick Utils v1.6 [76], leaving only the monophyletic clade containing both Ca. Thiodiazotropha and Sedimenticola symbiont MAGs. The ancestral reconstruction results were visualized with Interactive Tree Of Life (iTOL) [75], where the tree was displayed as a cladogram for clarity.

Sequences of the nifH, nifD, nifK and nifT genes were chosen to infer the evolutionary history of nitrogen fixation because they form an operon with the minimum catalytic gene set for the reaction [28]. Sequences were extracted from the anvi’o annotation [57] (described in the section MAG annotation and metabolic reconstruction) of the dereplicated MAGs. MAGs that did not contain the full set of genes were excluded from the analysis (S2 Table). Each gene extraction [85] was aligned independently with FSA v1.15.9 [86] using the --fast flag. A tree was inferred from each alignment to verify that the genes have evolved together. Subsequently, the alignments were concatenated using SeqKit v2.3.0 [87] and a tree was inferred from the concatenated alignment. The methods and parameters used to infer the trees, define well-supported nodes and visualize the tree were the same as described in the section Symbiont phylogenomics and the resulting tree was rooted with Gotree v0.4.4 [83].

The nifHDKT tree was reconciled with the dereplicated phylogenomic tree (considered as the species tree) using AleRax v1.0.1 [88], employing the UndatedDTL reconciliation model, the SPR strategy for gene tree correction, and the transfer constraint set to parent species. Transfer events observed in the reconciliation that coincided with a change in the ancestral state from absence to presence of nifHDKT (inferred in the ancestral state reconstruction) and had strong support in the gene tree were incorporated manually into the ancestral state reconstruction.

Recombination rates of the bacterial symbionts

We used ClonalFrameML to infer the recombination events across the genomes of globally distributed symbionts based on the workflow described in [19]. To improve computational efficiency, high-quality MAGs of Ca. T. endolucinida were first de-replicated using dRep v3.2.2 with the S_ANI parameter set to 0.995 [61]. The final sets of de-replicated Ca. T. endolucinida MAGs and publicly available Ca. T. gloverae MAGs were each separately aligned using progressiveMauve in Mauve v2.0 [89,90]. Core genomes represented by locally collinear blocks (LCBs) of at least 500 bp were extracted with the Mauve v2.0 command stripSubsetLCBs [90], and then re-aligned using MAFFT v7.304 [91] in auto mode. The LCB alignment was then trimmed with trimAl v1.4 [92] (parameters -resoverlap 0.75 -seqoverlap 80) and a phylogenetic tree was generated using FastTree v2.1.11 [93,94] with the parameters -gtr -nt. The alignment and tree were used as input for ClonalFrameML [95], which was run with 100 pseudo-bootstrap replicates (-emsim 100). Recombination events were visualized in R v4.2.3 using the ‘cfml_results.R’ script (available at https://github.com/xavierdidelot/ClonalFrameML; date of accession: 1 September 2023).

Maps and visualization

Maps were plotted in R using the following packages: rnaturalearth v0.3.2 [96], ggspatial v1.1.7 [97] and ggplot2 v3.4.1 [98]. Final modifications for publication of all figures were done in Adobe Illustrator (https://adobe.com/products/illustrator) to improve readability while preserving the original information from the different methods.

Supporting information

S1 Table. Metadata of metagenomes obtained in this study.

https://doi.org/10.1371/journal.pgen.1011295.s001

(XLSX)

S2 Table. Metadata of MAGs used in this study.

https://doi.org/10.1371/journal.pgen.1011295.s002

(XLSX)

S1 Dataset. ANI values of all MAGs versus all MAGs.

https://doi.org/10.1371/journal.pgen.1011295.s003

(XLSX)

S2 Dataset. Metabolic reconstruction of new symbiont clades.

https://doi.org/10.1371/journal.pgen.1011295.s004

(XLSX)

S3 Dataset. Symbiont’s functional potential across the Isthmus: Anvi’o’s module enrichment analysis.

https://doi.org/10.1371/journal.pgen.1011295.s005

(XLSX)

S4 Dataset. Symbiont’s functional potential across the Isthmus: Anvi’o’s gene enrichment analysis.

https://doi.org/10.1371/journal.pgen.1011295.s006

(XLSX)

S5 Dataset. Read mapping to nifHDKT and nitrogenase HMM search result.

https://doi.org/10.1371/journal.pgen.1011295.s007

(XLSX)

S6 Dataset. Presence and absence of nitrogen fixation potential in symbiont MAGs.

https://doi.org/10.1371/journal.pgen.1011295.s008

(XLSX)

S1 Fig. Maximum likelihood phylogenetic trees of different nitrogen fixation genes of lucinid symbionts.

A nifH gene tree B nifD gene tree C nifK gene tree D nifT gene tree.

https://doi.org/10.1371/journal.pgen.1011295.s009

(TIF)

S2 Fig. Complete maximum likelihood nifHDKT phylogenetic tree of the order Chromatiales.

https://doi.org/10.1371/journal.pgen.1011295.s010

(TIF)

S3 Fig. Gene-species tree reconciliation of nifHDKT operon of lucinid symbionts obtained with AleRax.

https://doi.org/10.1371/journal.pgen.1011295.s011

(TIF)

S4 Fig. Complete ancestral reconstruction of the order Chromatiales.

https://doi.org/10.1371/journal.pgen.1011295.s012

(TIF)

S5 Fig. Geographic distribution of global symbionts.

The map was generated with data from Natural Earth (http://www.naturalearthdata.com/) using the R package "rnaturalearth" (v0.3.2) (https://github.com/ropensci/rnaturalearth)

https://doi.org/10.1371/journal.pgen.1011295.s013

(TIF)

Acknowledgments

Codakia orbicularis individuals from Florida were collected and sent to us by Dr. Diana Chin (codorflorid129run1). Divalinga sp. individuals from Jamaica were collected and sent to us by Dr. Amber Stubler (divaljamaic004). Part of the sequencing was carried out by the DNA Technologies and Expression Analysis Core at the UC Davis Genome Center and at the Joint Microbiome Facility (JMF) of the Medical University of Vienna and the University of Vienna (project IDs JMF-1911-9, JMF-2104-13, and JMF-2002-8). We thank Petra Pjevac and Gudrun Kohl of the JMF for processing the samples. We are grateful to the Life Science Computer Cluster at the University of Vienna for the computational resources used for parts of the analyses. We also want to specifically thank Minor Lara’s sons Minor and Steven for their participation during fieldwork in Cuajiniquil. We would like to acknowledge the use of ChatGPT, an AI language model, for its assistance in correcting grammar and enhancing the clarity of the writing in this manuscript.

Collection permits: Sampling in Panama was performed complying with the Panama-Nagoya regulations under the identifier ABSCH-IRCC-PA-254919-1; collection and export were performed under the collection permit SE/AO-4-19 and export permit SEX/A-22-2020. Collection in Costa Rica, and following export and sequencing, were performed under permits R-004-2019-OT-CONAGEBIO and R-017-2022-OT-CONAGEBIO from the Comisión Nacional para la Gestión de la Biodiversidad (CONAGEBIO). Collection in Guadeloupe was performed under permit number TREL2302365S/676.

References

  1. Wilkins LGE, Leray M, O’Dea A, Yuen B, Peixoto RS, Pereira TJ, et al. Host-associated microbiomes drive structure and function of marine ecosystems. PLoS Biol. 2019;17: e3000533. pmid:31710600
  2. Wiedenbeck J, Cohan FM. Origins of bacterial diversity through horizontal genetic transfer and adaptation to new ecological niches. FEMS Microbiol Rev. 2011;35: 957–976. pmid:21711367
  3. López-Madrigal S, Gil R. Et tu, brute? Not even intracellular mutualistic symbionts escape horizontal gene transfer. Genes. 2017;8. https://doi.org/10.3390/genes8100247
  4. Manzano-Marín A, Coeur d’acier A, Clamens A-L, Orvain C, Cruaud C, Barbe V, et al. Serial horizontal transfer of vitamin-biosynthetic genes enables the establishment of new nutritional symbionts in aphids’ di-symbiotic systems. ISME J. 2020;14: 259–273. pmid:31624345
  5. Sudakaran S, Kost C, Kaltenpoth M. Symbiont acquisition and replacement as a source of ecological innovation. Trends Microbiol. 2017;25: 375–390. pmid:28336178
  6. Tsuchida T, Koga R, Fukatsu T. Host plant specialization governed by facultative symbiont. Science. 2004;303: 1989. https://www.science.org/doi/ pmid:15044797
  7. O’Dea A, Lessios HA, Coates AG, Eytan RI, Restrepo-Moreno SA, Cione AL, et al. Formation of the Isthmus of Panama. Sci Adv. 2016;2: e1600883. pmid:27540590
  8. Lessios HA. The Great American Schism: Divergence of marine organisms after the rise of the Central American Isthmus. Annu Rev Ecol Evol Syst. 2008;39: 63–91. https://doi.org/10.1146/annurev.ecolsys.38.091206.095815
  9. Carrier TJ, Lessios HA, Reitzel AM. Eggs of echinoids separated by the Isthmus of Panama harbor divergent microbiota. Mar Ecol Prog Ser. 2020;648: 169–177. https://doi.org/10.3354/meps13424
  10. Leray M, Wilkins LGE, Apprill A, Bik HM, Clever F, Connolly SR, et al. Natural experiments and long-term monitoring are critical to understand and predict marine host-microbe ecology and evolution. PLoS Biol. 2021;19: e3001322. pmid:34411089
  11. Taylor JD, Glover E. Biology, evolution and generic review of the chemosymbiotic bivalve family Lucinidae. Ray Society; 2021.
  12. Cavanaugh CM. Symbiotic chemoautotrophic bacteria in marine invertebrates from sulphide-rich habitats. Nature. 1983;302: 58–61. https://doi.org/10.1038/302058a0
  13. Felbeck H, Childress JJ, Somero GN. Calvin-Benson cycle and sulphide oxidation enzymes in animals from sulphide-rich habitats. Nature. 1981;293: 291–293. https://doi.org/10.1038/293291a0
  14. Le Pennec M, Beninger PG, Herry A. Feeding and digestive adaptations of bivalve molluscs to sulphide-rich habitats. Comp Biochem Physiol A Physiol. 1995;111: 183–189. https://doi.org/10.1016/0300-9629(94)00211-B
  15. Spiro B, Greenwood PB, Southward AJ, Dando PR. 13C/12C ratios in marine invertebrates from reducing sediments: confirmation of nutritional importance of chemoautotrophic endosymbiotic bacteria. Mar Ecol Prog Ser. 1986;28: 233–240. https://www.jstor.org/stable/24817439
  16. 16. Gros O, Duplessis MR, Felbeck H. Embryonic development and endosymbiont transmission mode in the symbiotic clam Lucinoma aequizonata (Bivalvia: Lucinidae). Invertebr Reprod Dev. 1999;36: 93–103. https://doi.org/10.1080/07924259.1999.9652683
  17. 17. Gros O, Frenkiel L, Mouëza M. Embryonic, larval, and post-larval development in the symbiotic clam Codakia orbicularis (Bivalvia: Lucinidae). Invertebr Biol. 1997;116: 86–101. https://doi.org/10.2307/3226973
  18. 18. Osvatic JT, Yuen B, Kunert M, Wilkins LGE, Hausmann B, Girguis P, et al. Gene loss and symbiont switching during adaptation to the deep sea in a globally distributed symbiosis. ISME J. 2023;17: 453–466. pmid:36639537
  19. 19. Osvatic JT, Wilkins LGE, Leibrecht L, Leray M, Zauner S, Polzin J, et al. Global biogeography of chemosynthetic symbionts reveals both localized and globally distributed symbiont groups. Proc Natl Acad Sci U S A. 2021;118. pmid:34272286
  20. 20. Lim SJ, Davis BG, Gill DE, Walton J, Nachman E, Engel AS, et al. Taxonomic and functional heterogeneity of the gill microbiome in a symbiotic coastal mangrove lucinid species. ISME J. 2019;13: 902–920. pmid:30518817
  21. 21. Petersen JM, Kemper A, Gruber-Vodicka H, Cardini U, van der Geest M, Kleiner M, et al. Chemosynthetic symbionts of marine invertebrate animals are capable of nitrogen fixation. Nat Microbiol. 2016;2: 16195. pmid:27775707
  22. 22. König S, Gros O, Heiden SE, Hinzke T, Thürmer A, Poehlein A, et al. Nitrogen fixation in a chemoautotrophic lucinid symbiosis. Nat Microbiol. 2016;2: 16193. pmid:27775698
  23. 23. Taylor JD, Glover EA, Yuen B. Closing the gap: a new phylogeny and classification of the chemosymbiotic bivalve family Lucinidae with molecular evidence for 73% of living genera. Journal of Molluscan. 2022; 88: eyac025. https://doi.org/10.1093/mollus/eyac025
  24. 24. Olm MR, Crits-Christoph A, Diamond S, Lavy A, Matheus Carnevali PB, Banfield JF. Consistent metagenome-derived metrics verify and delineate bacterial species boundaries. mSystems. 2020;5. pmid:31937678
  25. 25. Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9: 5114. pmid:30504855
  26. 26. Goris J, Konstantinidis KT, Klappenbach JA, Coenye T, Vandamme P, Tiedje JM. DNA-DNA hybridization values and their relationship to whole-genome sequence similarities. Int J Syst Evol Microbiol. 2007;57: 81–91. pmid:17220447
  27. 27. Lim SJ, Alexander L, Engel AS, Paterson AT, Anderson LC, Campbell BJ. Extensive thioautotrophic gill endosymbiont diversity within a single Ctena orbiculata (Bivalvia: Lucinidae) population and implications for defining host-symbiont specificity and species recognition. mSystems. 2019;4. pmid:31455638
  28. 28. Dos Santos PC, Fang Z, Mason SW, Setubal JC, Dixon R. Distribution of nitrogen fixation and nitrogenase-like sequences amongst microbial genomes. BMC Genomics. 2012;13: 162. pmid:22554235
  29. 29. Gros O, Liberge M, Felbeck H. Interspecific infection of aposymbiotic juveniles of Codakia orbicularis by various tropical lucinid gill-endosymbionts. Mar Biol. 2003;142: 57–66. https://doi.org/10.1007/s00227-002-0921-7
  30. 30. Durand P, Gros O. Bacterial host specificity of Lucinacea endosymbionts: interspecific variation in 16S rRNA sequences. FEMS Microbiol Lett. 1996;140: 193–198. pmid:8764482
  31. 31. Samper-Villarreal J, Sagot-Valverde JG, Gómez-Ramírez EH, Cortés J. Water quality as a potential factor influencing seagrass change over time at Cahuita national park, Costa Rica. Caribbean J of Science. 2021;51: 72–85. https://doi.org/10.18475/cjos.v51i1.a9
  32. 32. d’Croz L, Robertson DR. Coastal oceanographic conditions affecting coral reefs on both sides of the Isthmus of Panama. Proc 8th Int Coral Reef Sym. 1997;2: 2053–2058. Available from: https://stri-sites.si.edu/docs/publications/pdfs/ross_67.pdf
  33. 33. Petersen JM., Yuen B. The symbiotic “all-rounders”: Partnerships between marine animals and chemosynthetic nitrogen-fixing bacteria. Appl Environ Microbiol. 2021;87: e02129–20. pmid:33355107
  34. 34. Kimura K, Tran L-SP, Uchida I, Itoh Y. Characterization of Bacillus subtilis gamma-glutamyltransferase and its involvement in the degradation of capsule poly-gamma-glutamate. Microbiology. 2004;150: 4115–4123. https://doi.org/10.1099/mic.0.27467-0
  35. 35. Bouffartigues E, Si Hadj Mohand I, Maillot O, Tortuel D, Omnes J, David A, et al. The temperature-regulation of Pseudomonas aeruginosa cmaX-cfrX-cmpX operon reveals an intriguing molecular network involving the sigma factors AlgU and SigX. Front Microbiol. 2020;11: 579495. pmid:33193206
  36. 36. Arvizu-Gómez JL, Hernández-Morales A, Aguilar JRP, Álvarez-Morales A. Transcriptional profile of P. syringae pv. phaseolicola NPS3121 at low temperature: Physiology of phytopathogenic bacteria. BMC Microbiol. 2013;13: 1–16. https://doi.org/10.1186/1471-2180-13-81
  37. 37. Isabella VM, Clark VL. Deep sequencing-based analysis of the anaerobic stimulon in Neisseria gonorrhoeae. BMC Genomics. 2011;12: 51. https://doi.org/10.1186/1471-2164-12-51
  38. 38. D’Croz LO’Dea A. Variability in upwelling along the Pacific shelf of Panama and implications for the distribution of nutrients and chlorophyll. Estuar Coast Shelf Sci. 2007;73: 325–340. https://doi.org/10.1016/j.ecss.2007.01.013
  39. 39. Roth F, Stuhldreier I, Sánchez-Noguera C, Morales-Ramírez Á, Wild C. Effects of simulated overfishing on the succession of benthic algae and invertebrates in an upwelling-influenced coral reef of Pacific Costa Rica. J Exp Mar Bio Ecol. 2015;468: 55–66. https://doi.org/10.1016/j.jembe.2015.03.018
  40. 40. Rodríguez A, Alfaro E-J, Cortés J. Spatial and temporal dynamics of the hydrology at Salinas Bay, Costa Rica, Eastern Tropical Pacific. Revista de Biología Tropical. 2021;69: 105–126. http://dx.doi.org/10.15517/rbt.v69is2.48314
  41. 41. Stuhldreier I, Sánchez-Noguera C, Rixen T, Cortés J, Morales A, Wild C. Effects of seasonal upwelling on inorganic and organic matter dynamics in the water column of Eastern Pacific coral reefs. PLoS One. 2015;10: e0142681. pmid:26560464
  42. 42. VanInsberghe D, Arevalo P, Chien D, Polz MF. How can microbial population genomics inform community ecology? Philos Trans R Soc Lond B Biol Sci. 2020;375: 20190253. pmid:32200748
  43. 43. Dyksma S, Bischof K, Fuchs BM, Hoffmann K, Meier D, Meyerdierks A, et al. Ubiquitous Gammaproteobacteria dominate dark carbon fixation in coastal sediments. ISME J. 2016;10: 1939–1953. pmid:26872043
  44. 44. Martin BC, Middleton JA, Fraser MW, Marshall IPG, Scholz VV, Hausl B, et al. Cutting out the middle clam: lucinid endosymbiotic bacteria are also associated with seagrass roots worldwide. ISME J. 2020;14: 2901–2905. pmid:32929207
  45. 45. Cropper TE, Hanna E, Bigg GR. Spatial and temporal seasonal trends in coastal upwelling off Northwest Africa, 1981–2012. Deep Sea Res Part I. 2014;86: 94–111. https://doi.org/10.1016/j.dsr.2014.01.007
  46. 46. Fraser C, Hanage WP, Spratt BG. Recombination and the nature of bacterial speciation. Science. 2007;315: 476–480. pmid:17255503
  47. 47. Doroghazi JR, Buckley DH. A model for the effect of homologous recombination on microbial diversification. Genome Biol Evol. 2011;3: 1349–1356. pmid:22071790
  48. 48. Cohan FM, Perry EB. A systematics for discovering the fundamental units of bacterial diversity. Curr Biol. 2007;17: R373–86. pmid:17502094
  49. 49. Melendrez MC, Becraft ED, Wood JM, Olsen MT, Bryant DA, Heidelberg JF, et al. Recombination does not hinder formation or detection of ecological species of Synechococcus inhabiting a hot spring cyanobacterial mat. Front Microbiol. 2015;6: 1540. pmid:26834710
  50. 50. Shapiro BJ, Friedman J, Cordero OX, Preheim SP, Timberlake SC, Szabó G, et al. Population genomics of early events in the ecological differentiation of bacteria. Science. 2012;336: 48–51. pmid:22491847
  51. 51. Shapiro BJ, Polz MF. Microbial speciation. Cold Spring Harb Perspect Biol. 2015;7: a018143. pmid:26354896
  52. 52. Walker BK, Gilliam DS. Determining the extent and characterizing coral reef habitats of the northern latitudes of the Florida Reef Tract (Martin County). PLoS One. 2013;8: e80439. pmid:24282542
  53. 53. McCarthy DA, Lindeman KC, Snyder DB, Holloway-Adkins KG. Nearshore hardbottom reefs of East Florida and the regional shelf setting. In: McCarthy DA, Lindeman KC, Snyder DB, Holloway-Adkins KG, editors. Islands in the sand: Ecology and management of nearshore hardbottom reefs of East Florida. Cham: Springer International Publishing; 2020. pp. 23–43. https://doi.org/10.1007/978-3-030-40357-7_2
  54. 54. Pitts PA, Smith NP. An investigation of summer upwelling across Central Florida’s Atlantic coast: The case for wind stress forcing. J Coast Res. 1997;13: 105–110. https://www.jstor.org/stable/4298596
  55. 55. Bushnell B. BBMap: A fast, accurate, splice-aware aligner. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); 2014 Mar. Report No.: LBNL-7065E. Available from: https://www.osti.gov/biblio/1241166.
  56. 56. Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. Using SPAdes de novo assembler. Curr Protoc Bioinformatics. 2020;70: e102. https://doi.org/10.1002/cpbi.102
  57. 57. Eren AM, Kiefl E, Shaiber A, Veseli I, Miller SE, Schechter MS, et al. Community-led, integrated, reproducible multi-omics with anvi’o. Nat Microbiol. 2021;6: 3–6. pmid:33349678
  58. 58. Eren AM, Esen ÖC, Quince C, Vineis JH, Morrison HG, Sogin ML, et al. Anvi’o: an advanced analysis and visualization platform for ’omics data. PeerJ. 2015;3: e1319. pmid:26500826
  59. 59. Alneberg J, Bjarnason BS, de Bruijn I, Schirmer M, Quick J, Ijaz UZ, et al. Binning metagenomic contigs by coverage and composition. Nat Methods. 2014;11: 1144–1146. pmid:25218180
  60. 60. Kang DD, Li F, Kirton E, Thomas A, Egan R, An H, et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ. 2019;7: e7359. pmid:31388474
  61. 61. Olm MR, Brown CT, Brooks B, Banfield JF. dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication. ISME J. 2017;11: 2864–2868. pmid:28742071
  62. 62. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol. 2019;37: 540–546. pmid:30936562
  63. 63. Chklovski A, Parks DH, Woodcroft BJ, et al. CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nat Methods. 2022;20: 1203–1212. https://doi.org/10.1038/s41592-023-01940-w
  64. 64. Bowers RM, Kyrpides NC, Stepanauskas R, Harmon-Smith M, Doud D, Reddy TBK, et al. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea. Nat Biotechnol. 2017;35: 725–731. pmid:28787424
  65. 65. Chaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics. 2019;36: 1925–1927. pmid:31730192
  66. 66. Chaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. GTDB-Tk v2: memory friendly classification with the genome taxonomy database. Bioinformatics. 2022;38: 5315–5316. pmid:36218463
  67. 67. Parks DH, Chuvochina M, Chaumeil P-A, Rinke C, Mussig AJ, Hugenholtz P. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol. 2020;38: 1079–1086. pmid:32341564
  68. 68. Parks DH, Chuvochina M, Waite DW, Rinke C, Skarshewski A, Chaumeil P-A, et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat Biotechnol. 2018;36: 996–1004. pmid:30148503
  69. 69. Shaffer M, Borton MA, McGivern BB, Zayed AA, La Rosa SL, Solden LM, et al. DRAM for distilling microbial metabolism to automate the curation of microbiome function. Nucleic Acids Res. 2020;48: 8883–8900. pmid:32766782
  70. 70. Aramaki T, Blanc-Mathieu R, Endo H, Ohkubo K, Kanehisa M, Goto S, et al. KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics. 2020;36: 2251–2252. pmid:31742321
  71. 71. Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000;28: 27–30. pmid:10592173
  72. 72. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, et al. IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol. 2020;37: 1530–1534. pmid:32011700
  73. 73. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. UFBoot2: Improving the ultrafast bootstrap approximation. Mol Biol Evol. 2018;35: 518–522. pmid:29077904
  74. 74. Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14: 587–589. pmid:28481363
  75. 75. Letunic I, Bork P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49: W293–W296. pmid:33885785
  76. 76. Junier T, Zdobnov EM. The Newick utilities: high-throughput phylogenetic tree processing in the UNIX shell. Bioinformatics. 2010;26: 1669–1670. pmid:20472542
  77. 77. Kolde R. pheatmap: Pretty heatmaps. 2015. Available from: https://cran.ms.unimelb.edu.au/web/packages/pheatmap/pheatmap.pdf.
  78. 78. Delmont TO, Eren AM. Linking pangenomes and metagenomes: the Prochlorococcus metapangenome. PeerJ. 2018;6: e4320. pmid:29423345
  79. 79. Shaiber A, Willis AD, Delmont TO, Roux S, Chen L-X, Schmid AC, et al. Functional and genetic markers of niche partitioning among enigmatic members of the human oral microbiome. Genome Biol. 2020;21: 292. pmid:33323122
  80. 80. Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11: 119. pmid:20211023
  81. 81. Eddy SR. Accelerated profile HMM searches. PLoS Comput Biol. 2011;7: e1002195. pmid:22039361
  82. 82. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9: 357–359. pmid:22388286
  83. 83. Lemoine F, Gascuel O. Gotree/Goalign: toolkit and Go API to facilitate the development of phylogenetic workflows. NAR Genom Bioinform. 2021;3: lqab075. pmid:34396097
  84. 84. Ishikawa SA, Zhukova A, Iwasaki W, Gascuel O. A fast likelihood method to reconstruct and visualize ancestral scenarios. Mol Biol Evol. 2019;36: 2069–2085. pmid:31127303
  85. 85. Rodriguez-R LM, Konstantinidis KT. The enveomics collection: a toolbox for specialized analyses of microbial genomes and metagenomes. PeerJ Preprints; 2016 Mar. Report No.: e1900v1. https://doi.org/10.7287/peerj.preprints.1900v1
  86. 86. Bradley RK, Roberts A, Smoot M, Juvekar S, Do J, Dewey C, et al. Fast statistical alignment. PLoS Comput Biol. 2009;5: e1000392. pmid:19478997
  87. 87. Shen W, Le S, Li Y, Hu F. SeqKit: A cross-platform and ultrafast Toolkit for FASTA/Q file manipulation. PLoS One. 2016;11: e0163962. pmid:27706213
  88. 88. Morel B, Williams TA, Stamatakis A, Szöllősi G. AleRax: A tool for gene and species tree co-estimation and reconciliation under a probabilistic model of gene duplication, transfer, and loss. bioRxiv. 2023. https://doi.org/10.1101/2023.10.06.561091
  89. 89. Darling AE, Mau B, Perna NT. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010;5: e11147. pmid:20593022
  90. 90. Darling ACE, Mau B, Blattner FR, Perna NT. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 2004;14: 1394–1403. pmid:15231754
  91. 91. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30: 772–780. pmid:23329690
  92. 92. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009;25: 1972–1973. pmid:19505945
  93. 93. Price MN, Dehal PS, Arkin AP. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol. 2009;26: 1641–1650. pmid:19377059
  94. 94. Price MN, Dehal PS, Arkin AP. FastTree 2—Approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5: e9490. pmid:20224823
  95. 95. Didelot X, Wilson DJ. ClonalFrameML: Efficient inference of recombination in whole bacterial genomes. PLoS Comput Biol. 2015;11: e1004041. pmid:25675341
  96. 96. South A. rnaturalearth. World map data from natural Earth. 2017. Available from: https://github.com/ropensci/rnaturalearth.
  97. 97. Dunnington D. ggspatial: Spatial Data Framework for ggplot2. 2023. Available from: https://paleolimbot.github.io/ggspatial/, https://github.com/paleolimbot/ggspatial.
  98. 98. Wickham H. ggplot2: Elegant graphics for data analysis. Springer-Verlag New York; 2016. Available from: https://ggplot2.tidyverse.org