Structural and evolutionary constraints shape adaptive landscapes of immune-related genes across mammalian phylogeny

  • Zhengtian Li,

    Contributed equally to this work with: Zhengtian Li, Hafiz Ishfaq Ahmad

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Validation, Writing – original draft, Writing – review & editing

    lizhengtian@mail.qjnu.edu.cn (ZL); ishfaq.ahmad@iub.edu.pk (HIA)

    Affiliation College of Biological Resource and Food Engineering, Qujing Normal University, Yunnan, 655011, China

  • Mubbashar Hassan,

    Roles Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliation Department of Clinical Sciences, Theriogenology Section, College of Veterinary and Animal Sciences, Jhang, Sub Campus UVAS, Lahore, Pakistan

  • Hafiz Ishfaq Ahmad,

    Contributed equally to this work with: Zhengtian Li, Hafiz Ishfaq Ahmad

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Writing – original draft, Writing – review & editing

    lizhengtian@mail.qjnu.edu.cn (ZL); ishfaq.ahmad@iub.edu.pk (HIA)

    Affiliation Department of Animal Breeding and Genetics, Faculty of Veterinary and Animal Sciences, The Islamia University of Bahawalpur, Pakistan

  • Muhammad Adnan Ashraf,

    Roles Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliation Institute of Microbiology, Faculty of Veterinary Science, University of Veterinary and Animal Sciences, Lahore, Pakistan

  • Akhtar Rasool Asif,

    Roles Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliations Department of Animal Sciences, College of Veterinary and Animal Sciences, Jhang, Sub Campus UVAS, Lahore, Pakistan, College of Animal Sciences and Technology, Huazhong Agricultural University, Wuhan, China

  • Iram Qadeer,

    Roles Data curation, Formal analysis, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliation Department of Zoology, The Govt. Sadiq College Women University, Bahawalpur, Pakistan

  • Abid Hussain Shahzad,

    Roles Data curation, Formal analysis, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliation Department of Clinical Sciences, Theriogenology Section, College of Veterinary and Animal Sciences, Jhang, Sub Campus UVAS, Lahore, Pakistan

  • Shaista Abbas,

    Roles Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Visualization, Writing – review & editing

    Affiliation Department of Basics Sciences, Physiology Section, College of Veterinary and Animal Sciences, Jhang, Sub Campus UVAS, Lahore, Pakistan

  • Muhammad Sajid,

    Roles Data curation, Formal analysis, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliation Department of Pathobiology, College of Veterinary and Animal Sciences, Jhang, Sub Campus UVAS, Lahore, Pakistan

  • Abdul Mateen,

    Roles Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliation Department of Clinical Sciences, College of Veterinary and Animal Sciences, Jhang, Sub Campus UVAS, Lahore, Pakistan

  • Irfan Ahmed,

    Roles Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliation Department of Animal Nutrition, Faculty of Veterinary and Animal Sciences, The Islamia University of Bahawalpur, Pakistan

  • Jamal Muhammad,

    Roles Data curation, Formal analysis, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliation Department of Parasitology, Faculty of Veterinary Sciences, Cholistan University of Veterinary and Animal Sciences, Bahawalpur, Pakistan

  • Sayyed Aun Muhammad,

    Roles Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliation Department of Clinical Sciences, College of Veterinary and Animal Sciences, Jhang, Sub Campus UVAS, Lahore, Pakistan

  • Farid S. Ataya

    Roles Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliation Department of Biochemistry, College of Science, King Saud University, PO Box 2455, Riyadh, 11451, Saudi Arabia

Abstract

The evolutionary dynamics of the immune-related genes GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 were investigated through comprehensive phylogenetic and selection analyses across mammalian species. Using concatenated gene sequences, we applied PAML (a software package that assesses selection pressure by comparing rates of nonsynonymous and synonymous substitution), MEME (a mixed-effects model of evolution that identifies individual sites under episodic diversifying selection), and structural modeling (a technique to predict 3D protein shapes) to assess co-evolution and adaptation. Site- and branch-specific selection tests revealed widespread positive selection (ω > 1), with 15–26 branches showing statistically significant adaptive evolution (p < 0.05), particularly in functional domains critical for pathogen recognition and immune regulation. Recombination analysis identified gene-specific patterns, with GBP5, GZMB, and IRF7 exhibiting significant recombination breakpoints, while IFNG and TNFSF4 remained conserved. Functional annotation highlighted the biological relevance of selected sites, linking them to inflammasome activation (GBP5), apoptotic pathways (GZMB), interferon signaling (IFNG, IRF7), and lymphocyte regulation (KLRD1, TNFSF4). Tissue-specific expression analysis confirmed these genes’ roles in immune-active tissues, with enriched pathways including Th1/Th2 differentiation (KEGG hsa04658) and cytokine regulation. These findings underscore the persistent evolutionary arms race between hosts and pathogens, with immune genes adapting to maintain effective defense mechanisms. The study provides a framework for understanding mammalian immune gene evolution, offering insights into conserved functional domains that may inform therapeutic targeting and vaccine design. By integrating phylogenetics, selection analysis, and functional genomics, we elucidate the molecular signatures of adaptation in key immune regulators, advancing our knowledge of host-pathogen coevolution.

Introduction

The gastrointestinal tract of mammals is a dynamic ecosystem that harbors billions of bacteria, collectively known as the gut microbiota. The animal gut microbiota is a complex symbiotic ecosystem that undergoes continuous fluctuations [1]. Environmental factors, including energy sources and changes in the niche induced by microbial colonizers, drive these dynamic changes [2]. Microbial growth is supported by carbon sources in the host’s food and by shed epithelial cells, while the quantity of viable microbial biomass is restricted by intestinal secretions and peristalsis [3]. Understanding the boundaries of microbiota stability is essential for appropriately modeling biological systems and diseases in live organisms [4]. Inbreeding can produce host animals that are genetically identical. Still, the microbiota, the community of microorganisms in the host, varies with factors such as the supplier, housing facility, and specific cages used for experiments. Differences in the microbiota among colonies of inbred or targeted mouse models can explain the differences in phenotypic outcomes observed across research facilities [5]. In some cases, the dominant effects of a single harmful microbe may outweigh any differences in the microbiota’s composition. In other cases, variation among strains of a single organism might affect the interaction between the host and microbes in a mutually advantageous manner [6]. Establishing a genetically homogeneous colony of mice can be achieved by standardizing the microbiota in their intestines, which has been comprehensively sequenced and encompasses a wide variety of microbial species [7]. Such standardization aims to make research uniform and reproducible across different periods and research institutes. Existing studies on the microbiota have mostly been carried out over brief durations and have predominantly concentrated on certain species [8–10]. 
Numerous seminal studies have established the importance of the gut microbiota for the development and function of the adaptive immune system. Segmented filamentous bacteria (SFB) in the gut play a key role in the differentiation of Th17 cells, a subset of T helper cells implicated in autoimmune disease development and pathogen defense [11]. That study demonstrated that Th17 cells were significantly less abundant in germ-free mice, which lack a normal microbiota, highlighting the importance of gut bacteria in fostering the development of particular immune responses. Round and Mazmanian (2010) further demonstrated that the commensal bacterium Bacteroides fragilis produces polysaccharide A (PSA), which is essential for controlling the ratio of pro-inflammatory Th17 cells to anti-inflammatory regulatory T cells (Tregs). This study highlighted the importance of the microbiota in preserving immunological homeostasis and offered insights into how particular microbial compounds can alter the immune response [12]. Although most studies highlight the beneficial function of gut bacteria in fostering adaptive immunity, some data are contradictory. For example, Brown et al. (2019) questioned the universality of gut microbiota-induced Th17 cell activation: they found no rise in Th17 cells in certain germ-free mice colonized with human microbiota, and speculated that host factors or other microbial communities may have varying effects on this process [13]. This suggests that the interaction between microbiota and adaptive immunity may be more context-dependent than previously believed.

The humoral immune response, comprising antibodies, cytokines, and other soluble proteins, is an essential component of the host immune system that interacts with the gut flora and is crucial in protecting against infections [14]. The reciprocal relationship between the host and the gut microbiota has garnered increasing attention in the scientific community as a co-evolutionary link. Internal and external evolutionary pressures over millions of years have shaped the genetic compositions of host and microbial communities [15]. This has resulted in a delicate equilibrium that improves the overall health and flexibility of the host organism [16]. Despite separate investigations of these topics, a thorough understanding of how the molecular evolution of the gut microbiota intersects with the selection pressures acting on the host immune system, particularly concerning humoral immunity, is lacking.

The guanylate-binding protein family, which includes GBP5 (Guanylate Binding Protein 5), is involved in some physiological functions, including the immunological response. GBP5 controls inflammasome activation, a critical mechanism for cleaving and activating inflammatory cytokines like IL-1β, in response to bacterial and viral pathogens [17]. The main sources of the protease GZMB (Granzyme B) are natural killer cells and cytotoxic T lymphocytes. GZMB is a key effector molecule that induces apoptosis in virus-infected cells and tumor cells by cleaving and activating executioner caspases [18]. IFNG (Interferon Gamma) is a cytokine that plays a central role in innate and adaptive immune responses. It is a potent activator of macrophages, driving antimicrobial activity and antigen presentation primarily through the JAK-STAT signaling pathway [19]. A transcription factor called IRF7 (Interferon Regulatory Factor 7) controls how type I interferons are expressed in response to viral infections. IRF7 is the master regulator of the type I interferon (IFN-α/β) response, amplifying interferon production upon viral detection [20]. Natural killer cells and certain T cell subsets carry a protein encoded by KLRD1 (Killer Cell Lectin-Like Receptor Subfamily D, Member 1). KLRD1 (CD94) forms complexes with NKG2 family members to recognize HLA-E molecules, playing a vital role in regulating NK cell cytotoxicity and cytokine production [21]. Immune signaling pathways involving RTP4 (Receptor Transporter Protein 4) may impact how the body reacts to gut microorganisms or microbial antigens. TNFSF4 (Tumor Necrosis Factor Superfamily Member 4) is a co-stimulatory molecule expressed on activated antigen-presenting cells. It binds to the OX40 receptor on T cells, providing a critical secondary signal that promotes T cell survival, effector function, and the development of memory [22]. 
TRAT1 (T Cell Receptor-Associated Transmembrane Adaptor 1) is involved in T cell receptor signaling and development. It plays a crucial role in T cells in maintaining gut immune homeostasis and regulating responses to gut microbes [23].

Humoral immunity is a crucial element of the adaptive immune system. It produces antibodies and orchestrates immunological reactions against various illnesses [24]. The immune system faces continuous challenges, leading to an ongoing interaction between the host and its gut flora. The host’s immune system shapes the microbial populations in the gastrointestinal tract through selective forces. In turn, the gut microbiota influences the host’s immune system through many mechanisms that regulate immunological equilibrium and tolerance [25,26]. Advancements in high-throughput sequencing technology and bioinformatics tools have greatly enhanced our ability to analyze the intricate molecular processes involved in host-microbiota interactions. These approaches provide exceptional opportunities to examine the genetic traits of both hosts and the microorganisms within them [27,28]. This enables a more in-depth investigation of the reciprocal dynamics that have shaped the ecology of the mammalian gut. These methods allow for a thorough examination of how the host’s humoral immune system affects the genetic evolution of the gut microbiota [29]. This study investigates several fundamental questions related to the co-evolution of the host and the microbiota. What molecular changes do gut microbiota undergo in response to the host’s humoral immune responses? How do these adaptations differ among various mammalian species? Do conserved mechanisms or distinctive characteristics exist that establish the co-evolutionary connection between the host’s humoral immunity and the gut microbiota? To address these questions, we extensively examine the genetic variation present in the gut microbiota of several mammalian species. We aim to use advanced bioinformatics approaches to find genetic patterns that show positive selection. This will help us understand how the host’s immune system affects the gut microbiota in terms of evolution. 
This research aimed to investigate the rapid evolution of the GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 genes and to explain the significant sequence divergence found among animal species. We show that these genes evolved rapidly under positive selection. These genes play various roles in the immune response to gut bacteria, including immune cell activation, cytokine synthesis, and immune control, and thereby influence the balance between gut homeostasis and immune-mediated disease. It is essential to comprehend their roles in relation to the gut microbiota to clarify the mechanisms behind host-microbe interactions and their effects on health and disease.

Materials and methods

Data collection

The coding sequences of the GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 genes used in the analysis were obtained from NCBI. To capture a broad spectrum of mammalian evolutionary history, we analyzed a total of 42 species representing 12 different orders (S1 Table), prioritizing species from different branches of the mammalian phylogenetic tree. This approach aimed to reduce phylogenetic bias and provide a more generalizable understanding of the evolutionary processes under investigation. The included orders represent major clades such as Euarchontoglires (Primates, Rodentia, Lagomorpha, Scandentia, Dermoptera), Laurasiatheria (Cetartiodactyla, Carnivora, Chiroptera, Perissodactyla, Eulipotyphla), and Afrotheria, as well as representatives of marsupials (Didelphimorphia). The species included, from orders such as Primates, Carnivora, Rodentia, and Artiodactyla, cover a range of ecological and physiological adaptations. Whole-genome sequences were retrieved from the Ensembl database. We used the amino acid and nucleotide sequences of GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1, genes that are expressed in mammalian species and are important for gut microbiota adaptation. The coding sequences of these genes were obtained from Ensembl for the various mammalian species, based on the gene annotation (two conserved neighboring genes) [30]. BLASTn v2.2.29+ was used to select the most suitable scaffold [31]. Annotation was carried out with MITOS and included checking for a start and a stop codon [32]. The protein-coding and ribosomal genes were aligned with MACSE v1.01b [33] and ClustalW v2 [34]. We eliminated any genes that were less than a third of the length of the overall locus alignment.
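The length filter described above (discarding sequences shorter than a third of the alignment length) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the treatment of "-" and "?" as gap characters, and the toy sequences are hypothetical and are not taken from the pipeline used in this study.

```python
def filter_short_sequences(alignment, min_numer=1, min_denom=3):
    """Keep aligned sequences whose ungapped length is at least
    min_numer/min_denom of the full alignment length (hypothetical helper)."""
    if not alignment:
        return {}
    aln_len = max(len(seq) for seq in alignment.values())
    kept = {}
    for name, seq in alignment.items():
        # Count residues only; "-" and "?" are assumed to mark gaps/missing data.
        ungapped = sum(1 for ch in seq if ch not in "-?")
        # Integer comparison avoids floating-point rounding at the boundary.
        if ungapped * min_denom >= aln_len * min_numer:
            kept[name] = seq
    return kept


# Toy alignment (made-up sequences, 12 columns each).
toy = {
    "species_A": "ATGGCCAAGTTT",  # 12 residues, kept
    "species_B": "ATG---AAGTTT",  # 9 residues, kept
    "species_C": "ATG---------",  # 3 residues (< 12/3), removed
}
print(sorted(filter_short_sequences(toy)))
```

A sequence that covers exactly one third of the alignment is retained here, since the text excludes only sequences that are "less than a third" of the alignment length.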

Interspecific sequence alignments

The nucleotide sequences of the GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 genes were aligned separately using the ClustalW tool within the MEGA software v.7.0.14 [35] with default parameters. The Gblocks v.0.91b software [36] was then applied with standard settings to remove poorly aligned regions and divergent segments.

Phylogenetic analysis

PartitionFinder v.1.1.1 [37] was used to determine the optimal partitioning scheme and the substitution model for each partition before phylogenetic analysis, based on the Akaike (AIC), corrected Akaike (AICc), and Bayesian (BIC) information criteria. The GTR + Γ + I model was identified as the most suitable model of molecular evolution [38]. RAxML (version 8.2.12) [39] was used to generate the maximum likelihood unrooted tree with 10,000 bootstrap replicates. We were unable to designate an outgroup for tree construction because orthologs of the GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 genes were not found in the genomes of other organisms. Phylogenetic trees were constructed from the mammalian GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 gene sequences to show the evolutionary relationships among these genes and their changes over time. Phylogenetic trees were also created using MEGA (Molecular Evolutionary Genetics Analysis) version 10.0.5 [35] with a maximum likelihood method. The topology of the tree built with the neighbour-joining method was evaluated by applying the maximum likelihood method with the Whelan and Goldman (WAG) substitution model [40]. To further evaluate the stability of the tree structure, we conducted 1000 bootstrap replicates. The species tree generated by TreeBeST served as a benchmark against which gene trees and other phylogenetic trees were compared and evaluated (http://treesoft.svn.sourceforge.net/viewrc/treesoft/) [41].
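To illustrate how the three information criteria named above rank candidate substitution models, the following minimal Python sketch computes AIC, AICc, and BIC from a log-likelihood, the number of free parameters, and the number of alignment sites. The function names, model labels, and all numerical values are hypothetical and are not taken from the PartitionFinder output of this study.

```python
import math


def information_criteria(log_likelihood, n_params, n_sites):
    """Return AIC, AICc and BIC for one fitted model."""
    if n_sites - n_params - 1 <= 0:
        raise ValueError("AICc is undefined when n_sites <= n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_sites - n_params - 1)
    bic = n_params * math.log(n_sites) - 2 * log_likelihood
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


def best_model(candidates, n_sites, criterion="AICc"):
    """Pick the candidate with the lowest value of the chosen criterion.

    candidates maps a model label to (log_likelihood, n_params).
    """
    scores = {
        name: information_criteria(lnl, k, n_sites)[criterion]
        for name, (lnl, k) in candidates.items()
    }
    return min(scores, key=scores.get)


# Made-up log-likelihoods and parameter counts for two candidate models.
candidates = {
    "GTR+G+I": (-10230.5, 10),
    "HKY+G": (-10290.2, 5),
}
print(best_model(candidates, n_sites=1500))
```

Lower values indicate a better trade-off between fit and complexity; AICc and BIC penalize additional parameters more heavily than AIC, which matters most for short alignments.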

Recombination analysis

Recombination analysis was performed using the GARD (Genetic Algorithm for Recombination Detection) method [42] to identify potential recombination breakpoints that could confound selection analyses. The AICc score was used for model selection. A finding of significant recombination was supported if the multi-tree model (allowing different topologies between segments) was favored over the single-tree model by an evidence ratio of 100:1 or greater (ΔAICc ≥ 10), indicating genuine topological incongruence rather than mere rate variation [43]. Likelihood ratio tests (LRTs) were used to compare competing models and identify the best-fitting model in the M8 vs. M8a and M2 vs. M2a comparisons. The degrees of freedom (df) were calculated as the difference in the number of free parameters between the compared models [44]. Positive selection was inferred for codons with a dN/dS ratio (ω) greater than one.
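The link between the ΔAICc threshold and the evidence ratio quoted above can be made explicit. Assuming the standard relation, evidence ratio = exp(ΔAICc / 2), a ΔAICc of 10 corresponds to a ratio of about 148:1, which exceeds 100:1. The sketch below is illustrative only; the function names and AICc values are hypothetical and do not reproduce GARD output.

```python
import math


def evidence_ratio(aicc_single_tree, aicc_multi_tree):
    """Evidence ratio in favor of the multi-tree model, exp(delta_AICc / 2)."""
    delta = aicc_single_tree - aicc_multi_tree
    return math.exp(delta / 2.0)


def recombination_supported(aicc_single_tree, aicc_multi_tree, min_delta=10.0):
    """Apply the delta-AICc >= 10 rule stated in the text."""
    return (aicc_single_tree - aicc_multi_tree) >= min_delta


# Made-up AICc scores for one gene alignment.
single, multi = 25410.0, 25400.0
print(round(evidence_ratio(single, multi), 1))
print(recombination_supported(single, multi))
```

Under this relation, a ratio of exactly 100:1 corresponds to ΔAICc = 2 ln(100) ≈ 9.2, so the ΔAICc ≥ 10 rule is slightly stricter than the 100:1 criterion.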

Codon-based positive selection analyses

We used site models (M1, M2, M8a, and M8), which allow ω to vary among sites, to estimate the probability that each site in each gene is under positive selection [45]. These models detect signs of positive selection acting on a few specific sites within a gene during brief periods of evolutionary time. In the branch-site test, both the alternative model of positive selection (ω > 1) and the null model of neutral evolution (ω = 1) were fitted to assess selective pressure on each branch. The alternative model allows a class of sites on the tested branch to evolve with ω > 1, whereas the null model constrains that class to ω = 1. We used this methodology to identify instances of positive selection at sites across all mammalian lineages. We employed the likelihood ratio test (LRT) to compare each pair of models and chose the one that best fit our data.

The optimal value of the codon frequency parameter (CodonFreq) was determined in the M1 model, with Hominidae as the foreground clade, based on the AIC, AICc, and BIC criteria. This parameter specifies the equilibrium codon frequencies in the codon substitution model. The branch lengths of the phylogenetic tree for the codon-based analysis of positive selection were estimated using model M0 with fix_blength = 0. The branch lengths were then fixed for all subsequent analyses with fix_blength = 2. Branch-site tests were conducted on 35 specific branches and clades of the mammalian phylogenetic tree using strict (χ² distribution of LRT statistics, P < 0.01) and relaxed conditions (50:50 mixture of the χ² distribution and a point mass at zero, P < 0.05). The lenient settings were used to reduce the chance of a false-negative error, as the test is conservative under strict conditions [46]. The Bonferroni correction and the Benjamini-Hochberg procedure were used to minimize the chance of false-positive errors caused by multiple testing [47]. When the LRT was significant, the Bayes empirical Bayes (BEB) method was used to identify codons likely to be under positive selection, with posterior probability (PP) thresholds of 0.9 and 0.95 [48]. The IBS program version 1.0 was used to display the location of regions under positive selection pressure in the protein primary structures [49]. We conducted further positive selection tests using the MEME program within the HyPhy software package v.2.2.4 [50] to confirm the reliability of our results.
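The two multiple-testing procedures named above can be compared on a small example. The following Python sketch implements the Bonferroni correction and the Benjamini-Hochberg step-up procedure; the function names and the p-values are hypothetical and do not correspond to the branch-site results of this study.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject a hypothesis when its p-value is at most alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Made-up p-values for five tested branches.
pvals = [0.30, 0.001, 0.038, 0.012, 0.025]
print(bonferroni(pvals))
print(benjamini_hochberg(pvals))
```

In this toy example Bonferroni retains only the smallest p-value, whereas Benjamini-Hochberg retains four of the five tests, which illustrates why reporting both gives a conservative and a less conservative view of the same set of branches.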

We compared the ratio of non-synonymous to synonymous substitution rates (dN/dS, ω) to determine whether selective pressure acted on the homologous genes. ω was calculated using CODEML, the codon-based maximum likelihood program in PAML (version 4.9j) [51]. We used two different PAML models to determine whether the selective pressures differed among the mammalian lineages. In this analysis, we focused on the ω values of the terminal branches, that is, on the rate at which substitutions have accumulated between extant species and their closest reconstructed ancestors. Under the free-ratio model [52], ω is allowed to take an independent value on each branch. Positive selection was initially detected using the branch-site model in PAML [53], with ω = 1 as the null hypothesis. Statistical significance was determined from a chi-square distribution, with the test statistic equal to twice the difference in log-likelihood between the two models and the degrees of freedom equal to the difference in their number of parameters. The identification of positive selection is frequently inconsistent across studies because of differences in time scales, assumptions, methodologies, and gene conversion bias [54]. The PAML branch-site tests were adjusted for multiple testing using Bonferroni’s correction.
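The likelihood ratio test described above can be written out explicitly. The sketch below computes the statistic 2(lnL_alt − lnL_null) and its p-value under a χ² distribution with 1 or 2 degrees of freedom, using closed-form survival functions so that no external library is needed; it also includes the 50:50 mixture of a point mass at zero and χ² with one degree of freedom used for the relaxed branch-site condition. Function names and log-likelihood values are hypothetical and are not CODEML output from this study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference, floored at zero."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def lrt_pvalue(lnl_null, lnl_alt, df=1, mixture=False):
    """P-value of the LRT; mixture=True halves the chi-square tail
    (50:50 mixture of a point mass at zero and chi-square)."""
    stat = lrt_statistic(lnl_null, lnl_alt)
    p = chi2_sf(stat, df)
    if mixture and stat > 0:
        return 0.5 * p
    return p


# Made-up log-likelihoods for a null and an alternative branch-site model.
lnl_null, lnl_alt = -8450.2, -8446.9
print(lrt_statistic(lnl_null, lnl_alt))
print(lrt_pvalue(lnl_null, lnl_alt, df=1))
print(lrt_pvalue(lnl_null, lnl_alt, df=1, mixture=True))
```

Because the mixture p-value is half the χ² p-value for any positive statistic, a branch can pass the relaxed condition while failing the strict one, which is the trade-off between false negatives and false positives discussed in the text.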

Furthermore, we validated these findings using various independent tools, including the HyPhy package (version 2.5.31) [55]. The site models used here detect signs of positive selection acting on a few specific sites within a gene during brief periods of evolutionary time [56].

Statistical tests were conducted using the CODEML algorithm in the PAML software package v.4.8 [42] to assess adaptive evolution... Site models (M8, M8a) and branch-site models (Test 1: Model M1a vs. A; Test 2: Model A1 vs. A) were employed [43]. Likelihood ratio tests (LRTs) were used to compare these nested models. The significance of the LRT statistic was assessed using a χ² distribution with the appropriate degrees of freedom. To account for multiple testing, we applied both the Bonferroni correction and the Benjamini-Hochberg FDR procedure [46]. For branch-site tests, we used both a strict significance threshold (p < 0.01) and a more lenient threshold (p < 0.05) to reduce the chance of false negatives [45]. Positively selected sites were identified using the Bayes Empirical Bayes (BEB) method [47], with posterior probabilities PP > 0.90 considered significant. To confirm the reliability of our PAML results, we conducted additional tests using the aBSREL method within the HyPhy software package v.2.2.4 [49], which tests for episodic diversifying selection on a per-branch basis. For aBSREL, significance was assessed at p < 0.05 after correction for multiple comparisons (FDR).

Protein domain and structure analysis

The positively selected sites identified in the previous stage were used for subsequent structural analysis. We used the protein secondary structure prediction program PSIPRED (version 4.0) [57] and the AlphaFold2 protein structure database [58] to estimate the degree of similarity between the predicted secondary and tertiary structures of the mammalian proteins. SCANSITE 4.0 was used to predict specific sites of kinase phosphorylation and binding domains against a database of 81 mammalian kinases/domains [59]. The output was then filtered with the stringency level set to “high,” after which the linker regions and the domains were examined manually. To further understand the functional significance of the putatively selected sites, we mapped them onto the 3D structures of the proteins. We predicted the 3D protein structures using the homology modeling software MODELLER (version 9.23) and the I-TASSER server (version 5.1) [60]. The protein sequences of the positively selected genes were deduced from mammalian genomes obtained from GenBank. From UniProt [61], we collected functional information on the genes putatively identified as positively selected.

Functional analysis

The protein sequences were evaluated using two freely available online tools: Clustal W was used for sequence alignment, and the LPIcom server was used to annotate amino acid similarities [62]. We classified the detected proteins based on their projection at a particular gene ontology (GO) hierarchy level, emphasizing the GO ‘Biological Process’ (GOBP) class, using the ‘groupGO’ function. The ‘enrichGO’ function was then used to perform enrichment tests for GOBP terms based on a protein kinase distribution against a background list of all proteins in the relevant annotation database. To visualize all GO terms related to nutrient metabolism, we used g:GOSt, a web tool in the g:Profiler suite (http://biit.cs.ut.ee/gprofiler/) [63], in conjunction with Cytoscape’s Enrichment Map program (http://www.baderlab.org/Software/EnrichmentMap) [64]. We integrated information from these large-scale transcriptome investigations with that from the Genotype-Tissue Expression (GTEx) database Release V8 (dbGaP Accession phs000424.v8.p2) [65]. This database offers information on gene-level associations that explain how gene expression levels mediate effects on phenotypes [66].
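Over-representation tests of the kind described above are commonly based on the hypergeometric distribution. Assuming that formulation, the following Python sketch computes the probability of observing at least k annotated genes in a gene list of size n drawn from a background of N genes, K of which carry the GO term. The function name and the counts are hypothetical and are not results from this study.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    k: annotated genes in the study list
    n: size of the study list
    K: annotated genes in the background
    N: size of the background
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the upper tail of the hypergeometric probability mass function.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Made-up counts: 3 of 8 study genes carry a term that annotates
# 200 of 20000 background genes.
print(hypergeom_enrichment_p(k=3, n=8, K=200, N=20000))
```

In practice the resulting p-values would still need correction for the number of GO terms tested, for example with the Benjamini-Hochberg procedure mentioned in the methods.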

Results

We employed the MirrorTree method (Ochoa and Pazos, 2010) to confirm the co-evolution of the GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 genes. We computed Pearson correlation coefficients between the evolutionary distance matrices of the phylogenetic trees obtained from multiple sequence alignments of orthologous genes from several mammalian species. In pairwise comparisons of the trees, the Pearson correlation coefficient ranged from 0.84 to 0.95, at a significance level of 0.001, which supports the co-evolution of these genes. We then built a phylogenetic tree from aligned, concatenated gene sequences. Using concatenated rather than individual gene sequences increased the statistical power of the molecular evolution analysis and improved the accuracy of the resulting phylogenetic tree because a larger number of substitutions was analyzed. We generated an unrooted phylogenetic tree from the concatenated coding sequences of the GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 genes (Fig 1). The terminal nodes of the phylogenetic tree were strongly supported by bootstrap values and closely matched known mammalian evolutionary relationships, with minor discrepancies.
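The core calculation behind the correlation reported above can be sketched as follows: the pairwise evolutionary distances for two genes, taken over the same set of species in the same order, are flattened to their upper triangles and compared with the Pearson correlation coefficient. This is a simplified illustration rather than the MirrorTree implementation itself; the function names and the toy distance matrices are hypothetical.

```python
import math


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(vx * vy)


def mirrortree_correlation(dist_a, dist_b):
    """Correlate two distance matrices defined over the same species order."""
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Made-up pairwise distances for two genes across four species.
dist_a = [
    [0.00, 0.10, 0.25, 0.40],
    [0.10, 0.00, 0.20, 0.35],
    [0.25, 0.20, 0.00, 0.30],
    [0.40, 0.35, 0.30, 0.00],
]
dist_b = [
    [0.00, 0.12, 0.22, 0.45],
    [0.12, 0.00, 0.18, 0.38],
    [0.22, 0.18, 0.00, 0.27],
    [0.45, 0.38, 0.27, 0.00],
]
print(mirrortree_correlation(dist_a, dist_b))
```

A high correlation between two genes partly reflects their shared species phylogeny, so values such as those reported here are usually interpreted relative to that common background signal.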

Fig 1. Analysis of the domain structure and selection of the proteins Trat1, Gbp5, Ifng, Irf7, Klrd1, Rtp4, and Tnfsf4.

This diagram, generated using the DOG 1.0 illustrator, displays the structural organization of the GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 proteins, with a focus on the examination of their conserved domains. Emphasis is given to the protein domains, specifically identifying regions subject to positive selection. These sites are correlated with the three-dimensional configuration of proteins, exposing the adaptive evolution occurring at the molecular scale.

https://doi.org/10.1371/journal.pone.0332734.g001

Codon-based positive selection analyses

Before performing positive selection tests, we examined the sequences for recombination events, since recombination can produce false-positive results. The maximum likelihood method was employed to study molecular evolution. Protein-coding nucleotide sequences can help identify evolutionary events involving episodic or persistent positive selection. We tested for positive selection during the molecular evolution of the GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 genes. We used the CODEML tool to obtain log-likelihood values for site models M8 and M8a and conducted a likelihood ratio test (LRT) to identify sites experiencing positive selection pressure (ω > 1) across all branches of the mammalian evolutionary tree. The test was statistically significant, with an LRT value of 175.19 and a p-value of 0.01. In silico analysis of the merged sequences of the GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 genes indicated that every node of the mammalian evolutionary tree contains regions subject to positive selection. We then located the sites using the Bayes empirical Bayes (BEB) method. Sites were deemed to have been subject to positive selection during their evolution if their posterior probability (PP) was greater than 0.9. All branches of the evolutionary tree point to the same amino acid sites as being subject to positive selection, and these sites were assigned posterior probabilities and associated values. Most candidate sites were located within the conserved domain regions of the proteins in the cluster (Fig 1).
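As an illustration of how an LRT statistic such as the one above maps to a p-value, the sketch below uses the closed-form upper tail of a chi-square distribution with one degree of freedom. The choice of one degree of freedom is an assumption (M8 and M8a differ by a single parameter; the text does not state the value used), the optional halving reflects the 50:50 mixture sometimes applied when the null fixes a parameter on its boundary, and the function names are illustrative.

```python
from math import erfc, sqrt


def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference between alternative and null models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if statistic <= 0:
        return 1.0
    return erfc(sqrt(statistic / 2.0))


def lrt_pvalue(statistic, boundary_mixture=False):
    """p-value for a 1-df LRT; optionally halve it for a 50:50 boundary mixture."""
    p = chi2_sf_df1(statistic)
    if boundary_mixture and statistic > 0:
        return p / 2.0
    return p


reported_lrt = 175.19  # LRT value quoted in the text
print(lrt_pvalue(reported_lrt) < 0.01)
print(lrt_pvalue(reported_lrt, boundary_mixture=True) < 0.01)
```

Under these assumptions, any statistic above roughly 6.63 (the 1% critical value for one degree of freedom) is significant at the 0.01 level.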

We observed widespread positive selection events in the molecular history of these genes and explored the potential impact of episodic positive selection on their molecular evolution. Positive selection typically acts on particular sites within specific clades and branches of a phylogenetic tree. After calculating the log-likelihood values for two branch-site models, we conducted likelihood ratio tests (LRTs) on specific clades and branches of the mammalian phylogenetic tree under strict and lenient conditions. To investigate whether certain sites are under positive selection (ω > 1) or relaxed negative selection in specific branches (foreground branches) of the mammalian phylogeny compared to other branches, we initially used branch-site test 1 (Zhang et al., 2005). Multiple tested phylogenetic branches and clades exhibited statistically significant LRT values. Under stringent criteria, selection events were identified in 20 test branches, whereas under lenient conditions they were identified in 27 test branches (Table 1). For the remaining branches and clades of the phylogenetic tree, the LRT scores were not statistically significant even under lenient conditions, and we found no evidence of relaxed negative selection or positive selection for these branches.

Table 1. Detailed site-by-site results from the FEL analysis.

https://doi.org/10.1371/journal.pone.0332734.t001

Because test 1 cannot distinguish between positive selection and relaxation of selective constraint, we used test 2, developed by the same authors, to directly assess the presence of positive selection in the lineages of interest. For the branches and clades that passed test 1, we tested the hypothesis that certain branches or groups of branches in the phylogenetic tree are under positive selection pressure (ω > 1) compared to other branches (A1 vs. A) (Zhang et al., 2005). We identified positive selection events in 15 of 20 branches under strict criteria and in 26 of 27 branches under lenient conditions (Table 1). An in silico investigation of the molecular evolution of the GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 genes (Fig 2) revealed independent positive selection events in most branches of the mammalian phylogenetic tree.

Fig 2. Results for synonymous (α) and non-synonymous (β) rates at each site are displayed as bars, representing maximum probability estimates.

The line shows the estimates under the null model (α = β). Estimates greater than 10 are censored at this value.

https://doi.org/10.1371/journal.pone.0332734.g002

The Bayes Empirical Bayes (BEB) method identified several codons under positive selection (PP > 0.95) for each gene. To interpret the functional potential of these evolutionary signatures, we mapped the positively selected sites onto known functional protein domains (S2 Table). Notably, many selected sites were located within critical functional domains: for example, sites in GBP5 were found within its GTPase domain, suggesting adaptation in its core enzymatic and oligomerization functions, while selected sites in GZMB clustered in the serine protease active site region, potentially influencing its substrate specificity and catalytic efficiency during cytotoxic immune responses.

The elevated positive selection rates on these sequences may be due to dS saturation or inadequate taxon sampling, which affect the reconstruction of ancestral sequences and the estimation of several model parameters. This problem is widely known to yield false positives in some cases. The positions of these sites in the primary protein structure of GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 were identified using the corresponding figures and table (Fig 3). Closely related populations and species of animals appear to have experienced rapid evolutionary effects triggered by their microbiota, and these effects may even have influenced the recent evolution of humans. The gut microbiota has contributed to the adaptive evolution of mammalian gut morphologies that accommodate beneficial microbes, and it probably contributed to the development of both innate and adaptive immune systems in mammals.

Fig 3. Three-hit replacements frequently occur in non-synonymous substitutions.

Three-hit replacements with 3H+ support are substitutions at sites with an ER (3H + :2H) configuration. Three-hit substitutions with a 2H but not 3H+ support are replacements that happen at locations where the ratio of ER (3H + :2H) is less than 1 and the ratio of ER (2H: 1H) is not specified. The histogram displays the branch lengths where the two substitutions are estimated to occur.

https://doi.org/10.1371/journal.pone.0332734.g003

Adaptation selection analysis

aBSREL detected evidence of episodic diversifying selection on 2 of the 35 branches in the GBP5 phylogeny; all 35 branches were formally tested for diversifying selection. Significance was evaluated using the likelihood ratio test at a threshold of p < 0.05 after correction for multiple testing (Fig 4). The full results table gives the significance and number of rate categories inferred at each branch (Table 1). aBSREL detected episodic diversifying selection on 9 of the 29 branches in the GZMB phylogeny (Fig 4); all 29 branches were formally tested, with significance assessed in the same way, and the full results table again reports the significance and number of rate categories inferred at each branch.
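The multiple-testing adjustment mentioned above can be illustrated with a short sketch. The text does not name the correction used, so the Holm-Bonferroni procedure shown here is an assumption, and the per-branch p-values are invented purely for demonstration.

```python
def holm_bonferroni(pvalues):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def significant_branches(pvalues, alpha=0.05):
    """Indices of tests whose adjusted p-value is at or below alpha."""
    return [i for i, p in enumerate(holm_bonferroni(pvalues)) if p <= alpha]


# Hypothetical uncorrected p-values for four tested branches.
branch_p = [0.01, 0.04, 0.03, 0.005]
print(holm_bonferroni(branch_p))
print(significant_branches(branch_p))
```

With many branches tested per gene, a step-down correction of this kind keeps the family-wise error rate at the chosen threshold while being less conservative than a plain Bonferroni adjustment.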

Fig 4. An aBSREL adaptive model tree was applied to analyze the full-length GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 genes across mammalian species.

The inferred ω distribution determines the shade of each branch. Branches inferred to be under positive selection, with a corrected p-value below 0.05, are drawn as thick black branches.

https://doi.org/10.1371/journal.pone.0332734.g004

aBSREL detected episodic diversifying selection on 2 of the 35 branches in the IFNG phylogeny, 7 of the 24 branches in the IRF7 phylogeny, 4 of the 35 branches in the KLRD1 phylogeny, 8 of the 29 branches in the RTP4 phylogeny, 2 of the 58 branches in the TNFSF4 phylogeny, and 2 of the 41 branches in the TRAT1 phylogeny (Fig 4). In each case, every branch of the phylogeny was formally tested for diversifying selection, and significance was evaluated using the likelihood ratio test at a threshold of p < 0.05 after correction for multiple testing (Fig 5). The full results table gives the significance and number of rate categories inferred at each branch for each gene (Table 1).

Fig 5. The Site Log-Likelihood analyses provide estimated ω rate distributions for the relative rate distribution (mean 1) for site-to-site non-synonymous rate variation that fits MG94 with double and triple instantaneous substitutions.

https://doi.org/10.1371/journal.pone.0332734.g005

Recombination analysis

The GARD analysis detected recombination breakpoints in the GBP5 gene. GARD analyzed a total of 13,556 models at a speed of 21.42 models per second. The alignment consisted of 1183 possible breakpoints, resulting in a search space of 635810244030937500 models with a maximum of 7 breakpoints. However, the genetic algorithm only searched 0.00% of this search space (Fig 6). The AICc score of the best-fitting GARD model, which permits different topologies between segments (29983.2), is compared to that of the model assuming the same tree for all partitions inferred by GARD but allowing different branch lengths between partitions (30120.2). This suggests that the multiple-tree model may be preferred over the single-tree model by an evidence ratio of 100 or greater. This indicates that at least one of the breakpoints represents a genuine topological incongruence. The GARD analysis detected recombination breakpoints within the GZMB gene. GARD analyzed a total of 9905 models at a speed of 50.54 models per second. The alignment consisted of 458 possible breakpoints, resulting in a search space of 166123556333 models with a maximum of 5 breakpoints. However, the genetic algorithm only searched 0.00% of this search space (Fig 6).
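The evidence-ratio comparison used in this section can be made concrete with a small sketch. It assumes the standard Akaike form, in which the relative support for the lower-scoring model is exp(ΔAICc / 2); whether GARD computes its evidence ratio in exactly this way is an assumption, and the function names are illustrative. The GBP5 scores are the ones quoted in the text.

```python
from math import exp, log


def evidence_ratio(aicc_single_tree, aicc_multi_tree):
    """Relative support for the multiple-tree model over the single-tree model."""
    return exp((aicc_single_tree - aicc_multi_tree) / 2.0)


def multi_tree_preferred(aicc_single_tree, aicc_multi_tree, threshold=100.0):
    """True when the evidence ratio reaches the chosen threshold."""
    # Compare on the log scale to avoid overflow for very large differences.
    return (aicc_single_tree - aicc_multi_tree) / 2.0 >= log(threshold)


# GBP5 scores from the text: multiple-tree 29983.2, single-tree 30120.2.
print(multi_tree_preferred(30120.2, 29983.2))
# Identical scores give an evidence ratio of exactly 1, so no preference.
print(evidence_ratio(100.0, 100.0))
```

Under this form, an evidence ratio of 100 corresponds to an AICc difference of 2 ln 100, or about 9.21 units, so a difference of the size reported for GBP5 clears the threshold by a wide margin.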

Fig 6. Left: the algorithm’s best estimate of where to put breakpoints for each number of breakpoints considered.

Right: the improvement in the c-AIC score (log scale) as the number of breakpoints increases.

https://doi.org/10.1371/journal.pone.0332734.g006

For GZMB, the AICc score of the best-fitting GARD model, which permits different topologies between segments (10967.6), was compared to that of the model assuming the same tree for all partitions inferred by GARD but allowing different branch lengths between partitions (11723.4). This suggests that the multiple-tree model may be preferred over the single-tree model by an evidence ratio of 100 or greater, indicating that at least one of the breakpoints represents a genuine topological incongruence.

GARD did not detect any signs of recombination in IFNG. It analyzed a total of 2630 models at a speed of 49.62 models per second. The alignment consisted of 409 possible breakpoints, resulting in a search space of 409 models with a maximum of 1 breakpoint (Fig 6). The genetic algorithm examined 643.03% of this search space, that is, it evaluated more models than the space contains. The best-fitting GARD model, which permits different topologies between segments, and the model that assumes the same tree for all partitions inferred by GARD but allows different branch lengths between partitions had the same AICc score (9309.1), so the multiple-tree model cannot be favored over the single-tree model by an evidence ratio of 100 or more. This suggests that some or all of the breakpoints may indicate rate variation rather than topological incongruence.

Notably, GARD detected evidence of recombination breakpoints in the IRF7 and KLRD1 genes. For IRF7, GARD analyzed a total of 12,831 models at a speed of 41.52 models per second. The alignment consisted of 1253 possible breakpoints, resulting in a search space of 25635663809007 models with a maximum of 5 breakpoints (Fig 6), of which the genetic algorithm searched only 0.00%. The AICc score of the best-fitting GARD model, which permits different topologies between segments (23104.9), was compared to that of the model assuming the same tree for all partitions inferred by GARD but allowing different branch lengths between partitions (23164.6). This suggests that the multiple-tree model may be preferred over the single-tree model by an evidence ratio of 100 or more, indicating that at least one of the breakpoints represents a genuine topological incongruence. For KLRD1, GARD analyzed a total of 1419 models at a speed of 43.00 models per second. The alignment consisted of 717 possible breakpoints, resulting in a search space of 257,403 models with up to 2 breakpoints (Fig 6), of which the genetic algorithm examined only 0.55%. Comparing the AICc score of the best-fitting GARD model (16444.6) with that of the model that assumes the same tree for all partitions inferred by GARD but allows different branch lengths between partitions (16511.0) favors the multiple-tree model over the single-tree model by an evidence ratio of 100 or more. This suggests that at least one of the breakpoints represents a genuine topological incongruence.

The Genetic Algorithm for Recombination Detection (GARD) also detected recombination breakpoints in the RTP4 gene. GARD processed 11,981 models at a rate of 19.45 models per second. The alignment consisted of 1041 possible breakpoints, resulting in a search space of 10,138,915,336,889 models with a maximum of 5 breakpoints. However, the genetic algorithm examined only 0.00% of this search space.

GARD did not detect any evidence of recombination in the TNFSF4 gene. It analyzed 1793 models at a speed of 23.29 models per second. The alignment consisted of 559 possible breakpoints, resulting in a search space of 559 models with a maximum of 1 breakpoint (Fig 6). The genetic algorithm examined 320.75% of this search space.

The GARD analysis detected recombination breakpoints in the TRAT1 gene. GARD analyzed 2298 models at a speed of 69.64 models per second. The alignment consisted of 442 possible breakpoints, resulting in a search space of 97903 models with a maximum of 2 breakpoints. The genetic algorithm searched 2.35% of this search space.
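The search-space sizes and coverage percentages quoted in this section are consistent with counting every way of placing between one and the maximum number of breakpoints among the candidate positions. The sketch below reproduces that arithmetic; the counting rule is inferred from the reported figures rather than taken from the GARD documentation, and the function names are illustrative.

```python
from math import comb


def gard_search_space(possible_breakpoints, max_breakpoints):
    """Number of models with 1..max_breakpoints breakpoints among the candidates."""
    return sum(comb(possible_breakpoints, k) for k in range(1, max_breakpoints + 1))


def percent_searched(models_evaluated, possible_breakpoints, max_breakpoints):
    """Models evaluated as a percentage of the search space."""
    space = gard_search_space(possible_breakpoints, max_breakpoints)
    return 100.0 * models_evaluated / space


# (gene, models evaluated, possible breakpoints, maximum breakpoints) from the text.
runs = [
    ("GZMB", 9905, 458, 5),
    ("IFNG", 2630, 409, 1),
    ("KLRD1", 1419, 717, 2),
    ("TNFSF4", 1793, 559, 1),
    ("TRAT1", 2298, 442, 2),
]
for gene, evaluated, sites, max_bp in runs:
    space = gard_search_space(sites, max_bp)
    print(gene, space, f"{percent_searched(evaluated, sites, max_bp):.2f}%")
```

This also explains the two figures above 100%: with a single permitted breakpoint the space is no larger than the number of candidate positions, so the count of evaluated models exceeds the count of distinct models.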

Functional analysis

We first identified all statistically enriched terms, such as GO/KEGG terms, canonical pathways, and hallmark gene sets, based on either the default choices under Express Analysis or those made during Custom Analysis. We then calculated accumulative hypergeometric p-values and enrichment factors, which were used for filtering. The remaining significant terms were organized into a hierarchical tree based on Kappa-statistical similarities among their gene memberships, similar to the approach used on the NCI DAVID site. A kappa score threshold of 0.3 was used to cut the tree into term clusters. The terms contained in each cluster are exported in the Excel worksheet titled “Enrichment Analysis.” We selected a subset of representative terms from the full clusters and converted them into a network layout. Each term is depicted as a circular node whose size reflects the number of input genes associated with that term. The node’s color indicates its cluster identity, so nodes of the same color belong to the same cluster. Terms with a similarity score greater than 0.3 are connected by an edge, with the thickness of the edge representing the similarity score.
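The two quantities used in this filtering and clustering step can be sketched in a few lines. This is an illustrative calculation only, not the enrichment tool's implementation: the gene labels and term memberships are hypothetical, the kappa score is computed as Cohen's kappa on binary gene membership, and the function names are invented for the example.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, drawn, observed):
    """P(X >= observed) when `drawn` genes are sampled from `population`,
    of which `annotated` carry the term."""
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    hits = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(observed, upper + 1)
    )
    return hits / total


def kappa_similarity(term_a, term_b, universe):
    """Cohen's kappa between two terms' gene memberships over a gene universe."""
    n = len(universe)
    both = len(term_a & term_b)
    neither = n - len(term_a | term_b)
    observed = (both + neither) / n
    frac_a, frac_b = len(term_a) / n, len(term_b) / n
    expected = frac_a * frac_b + (1 - frac_a) * (1 - frac_b)
    if expected == 1.0:
        return 1.0  # Both terms cover all genes or none, so agreement is trivial.
    return (observed - expected) / (1 - expected)


# Hypothetical universe of eight genes and two overlapping terms.
universe = {f"g{i}" for i in range(1, 9)}
term_a = {"g1", "g2", "g3", "g4"}
term_b = {"g2", "g3", "g4", "g5"}

kappa = kappa_similarity(term_a, term_b, universe)
print(kappa, kappa > 0.3)  # Terms above the 0.3 threshold would share an edge.
print(hypergeom_upper_tail(10, 5, 5, 5))
```

Kappa corrects the raw overlap for the agreement expected by chance, which is why two large terms that share many genes by size alone do not automatically cluster together.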

The network is visualized in Cytoscape with a “force-directed” layout and edge bundling for clarity. Positively selected sites were found in the GBP5 protein, which contains the Guanylate-binding protein (GBP) N-terminal domain of an interferon (IFN)-inducible GTPase. GBP5 plays significant roles in innate immunity against a wide variety of bacterial, viral, and protozoan pathogens. After infection, it is recruited to pathogen-containing vacuoles or to bacteria that have escaped from vacuoles, and it functions as a positive regulator of inflammasome assembly by promoting the release of ligands from the bacteria. These ligands are recognized by inflammasomes and include double-stranded DNA (dsDNA), which activates the AIM2 inflammasome, and lipopolysaccharide (LPS), which activates the non-canonical CASP4/CASP11 inflammasome.

The GZMB protein has a trypsin-like serine protease domain that contains the active site and is found in members of the trypsin family. Serine proteases of the trypsin family exert their catalytic activity through a charge relay system, in which an aspartic acid residue is hydrogen-bonded to a histidine, which in turn is hydrogen-bonded to a serine. The IFNG protein, which possesses the IFN-gamma domain, has antiviral activity and regulates the immune system. It is a potent activator of macrophages and can inhibit the growth of transformed cells. It can also enhance the antiviral and anticancer effects of type I interferons. Interferon-regulatory factor 7 includes the truncated CREB-binding protein domain. Together, these two subunits make up DRAF1 (double-stranded RNA-activated factor 1) (Fig 8). The production of viral double-stranded RNA (dsRNA) during viral transcription or replication activates DRAF1. The DNA-binding specificity of DRAF1 correlates directly with the transcriptional induction of ISGs (interferon-alpha, beta-stimulated genes). The protein IRF-3 is present in the cytoplasm of uninfected cells but moves to the nucleus after viral infection.

Fig 7. Tissue-specific expression of GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 genes in humans used as a reference genome.

The expression data of these genes were revealed across various tissues courtesy of the GTEx consortium.

https://doi.org/10.1371/journal.pone.0332734.g007

The translocation of IRF-3 is accompanied by increased phosphorylation of serine and threonine residues, and its interaction with the CREB co-activator occurs only after infection. The carbohydrate-recognition domain (CRD), often called the C-type lectin domain (CTL), comprises around 110–130 amino acid residues. It contains four fully conserved cysteines that form two disulfide bonds. Lectins have a very diverse range of structural characteristics and functional attributes. The capacity to bind carbohydrates may have arisen independently and sporadically in several unrelated families, each producing a conserved structure for a distinct purpose. Animal lectins function as immune system recognition molecules and are involved in cell migration, pathogen protection, immunological regulation, and the avoidance of autoimmunity (Fig 8). The TNF domain is found in proteins of the tumor necrosis factor family. These cytokines can assemble into complexes of three identical or distinct subunits. Apoptosis induced through the mature T-cell receptor involves TNF and is mediated by the p75 TNF receptor. An essential tool in the GTEx database is the expression quantitative trait locus (eQTL) browser, which stores and graphically displays data from a national research initiative that sought to discover links between genetic variants and high-throughput molecular-level expression phenotypes (Fig 9). A considerable number of genes show associations with multiple tissues. GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, and TRAT1 are strongly expressed in whole blood, whereas KLRD1, RTP4, and TRAT1 are also expressed in the spleen (Fig 7). TNFSF4 and RTP4 are expressed in lymphocytes. Nevertheless, our examination of the mean expression levels of all (significant) genes using various enrichment techniques yielded inconclusive results about the tissues expected to have a higher prevalence of diseases and well-established biological processes (Table 2).

Table 2. GO enrichment analysis applied to the network to identify the underlying “biological meanings”.

https://doi.org/10.1371/journal.pone.0332734.t002

Fig 8. Significantly enriched terms, including GO/KEGG terms, canonical pathways, and gene sets, for GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1.

Key terms were selected from the full clusters and rendered as a network. A protein-protein interaction network was built by extracting connections among the genes GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 from a data source of protein-protein interactions.

https://doi.org/10.1371/journal.pone.0332734.g008

Fig 9. Expression analysis of GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1 proteins in tissues.

https://doi.org/10.1371/journal.pone.0332734.g009

Discussion

The gut mucosal immune system forms the interface between the internal body and the external environment. Microorganisms in the gut continuously influence the immune system, and in return the host’s immune system shapes the composition of the microbiome [67]. The widespread and persistent positive selection observed across the eight immune genes (GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1) strongly suggests an ongoing evolutionary arms race with a diverse array of pathogens. The specific functional roles of these genes allow us to hypothesize the classes of pathogens that have likely exerted the strongest selective pressures [68]. We analyzed a group of 8 protein-coding orthologs (GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1) that are present in the genomes of humans, monkeys, dogs, cats, cows, mice, and domestic yaks, with the goal of identifying signs of positive selection in these genes. For the genes exhibiting statistically significant signals (P < 0.05 corrected), further branch-site analysis indicated the presence of positive selection, specifically along the mammalian lineages. The M8 model, which allows for positive selection, was used to detect variation at the codon level. A Markov Chain Monte Carlo (MCMC) model, implemented in MrBayes on the Selecton server, was also used to assess variation at the codon level. Values were calculated for each site in both cases. Our results show that the coding sequences of the eight genes exhibit domain conservation when analyzed using MAFFT protein alignments (Fig 1). These findings indicate that nonsynonymous substitutions in regions under purifying selection are deleterious and therefore unlikely to become fixed during evolution.
The significant positive selection in GBP5, a key activator of inflammasomes against cytosolic bacteria (e.g., Listeria, Salmonella) and a regulator of antiviral responses, points to intense pressure from intracellular bacteria and viruses [68]. Pathogens that escape phagosomes or inject effectors to manipulate host cell machinery would create a strong selective advantage for mutations in GBP5 that enhance detection or circumvent pathogen inhibition [69]. Similarly, the selection in IRF7 and IFNG, central orchestrators of the antiviral interferon response, implicates a history of conflict with viruses. RNA viruses, with their high mutation rates and rapid evolution, are particularly potent drivers of such adaptation, constantly presenting new ligands and decoys that select for counter-adaptations in the host’s interferon signaling pathway.

Horizontal gene transfers and duplications have been fundamental to the development of all major adaptive immune molecular systems [69]. Prior research on the evolution of immune genes in birds mostly examined the co-evolution of disease hotspots, such as MAVS, in the context of influenza virus infection. These genes play a role in activating lymphocytes, regulating the immune system, stimulating T regulatory cells, and influencing the development and tolerance of autoimmunity [53]. The impact of selection on host organisms in regulating gut microbiota during the adaptive evolution of mammalian species remains inadequately understood, despite the intricate mechanisms that evolved in ancestral lineages. Immune genes in mammalian genomes are evolving rapidly, suggesting pathogen-driven co-evolutionary dynamics known as Red Queen dynamics [70]. The extent to which genetic variation in immune genes is affected by differences in gut microbiota among mammalian species remains uncertain. Bacteria can change rapidly during an evolutionary arms race, making it difficult for mammalian hosts to adapt continually to control the microbiota, whose members also evolve in competition with one another [71]. Multiple tested phylogenetic branches and clades exhibited statistically significant likelihood ratio test (LRT) values. Under stringent criteria, selection events were identified in 20 test branches, whereas under lenient conditions they were identified in 27 test branches (Table 1). For the remaining branches and clades of the phylogenetic tree, the LRT scores were not statistically significant even under lenient conditions, and we found no evidence of relaxed negative selection or positive selection for these branches. Because test 1 cannot distinguish between positive selection and relaxation of selective constraint, we used test 2, developed by the same authors, to directly assess the presence of positive selection in the lineages of interest. For the branches and clades that passed test 1, we tested the hypothesis that certain branches or groups of branches in the phylogenetic tree are under positive selection pressure (ω > 1) compared to other branches (M2a vs. M2) [72]. The adaptive evolution in TNFSF4 (OX40L) and TRAT1, both involved in fine-tuning T-cell receptor signaling and co-stimulation, likely reflects a struggle to optimize the adaptive immune response against a broad spectrum of challenges. This could include pressure from persistent pathogens that exhaust T cells (e.g., chronic viruses like HIV or HCV) or from pathogens that manipulate co-stimulatory pathways to evade immunity [10,11]. Selection on these genes would favor variants that strengthen effective T-cell responses while potentially limiting immunopathology or countering pathogen-derived immunosuppressive factors.

The functional repertoires of mammalian gut microbiotas have likely supported the evolution and diversification of chitin-eating and herbivory, the specialization of mammalian species and communities on toxic diets, and potentially even recent dietary changes in human evolution [73]. Furthermore, a growing body of research indicates that mammals have evolved to depend on signals from their own species-specific gut bacteria during postnatal development and functioning. House mice colonized with the gut microbiota of rats or humans did not exhibit fully differentiated T cell repertoires, unlike those colonized with the gut microbiota of other house mice [74]. These findings suggest that the immunological development of house mice has evolved to depend on components of their particular gut flora since the divergence of mice and rats [75]. Adaptation to a new diet is a significant driving force in the evolution of a species. The dietary changes that occurred during the evolution of several primate species, including humans, have been extensively documented [73]. The investigation also covers the extent of taxonomic conservation of specific immune-modulatory mechanisms, including the synthesis of antimicrobial peptides and the function of microbial metabolites. We have emphasized data indicating that, whereas certain systems are conserved, others may be more specialized and represent responses to particular ecological niches or evolutionary forces. We have found both distinct and possibly universal patterns in the interactions between the immune system and gut bacteria in several species.
For example, the need for gut microbiota to support immune system maturation seems to be a universal feature across taxa, even though the precise microbial species and immunological pathways implicated may vary.

Additionally, some genes have been identified as being under positive selection driven by diet. An extensively studied example is pancreatic ribonuclease in Old World monkey species [41]. Following dietary changes in these monkeys, the protein evolved an enhanced capacity to break down bacterial RNA. Another example is lysozyme, which facilitates the breakdown of intestinal microorganisms. This protein shows positive selection in various primate groups, including humans [76]. The langur monkey, which has evolved a foregut fermentation mode of digestion comparable to that of ruminants, provides the clearest picture of the nature of this selection [77]. However, it has been hypothesized that the transition to a diet consisting mostly of meat, which would have required adaptations in bacterial digestion, may have been a contributing factor [73]. Studies of alanine-glyoxylate aminotransferase (AGT), a gene that functions differently in herbivores and carnivores, support this view. Additionally, there is evidence of positive selection on this gene among simian primates [78]. aBSREL detected evidence of episodic diversifying selection on 2 of the 35 branches in the GBP5 phylogeny, with all 35 branches formally tested and significance evaluated using the likelihood ratio test at p < 0.05 after correction for multiple comparisons (Fig 4). Episodic diversifying selection was also detected on 9 of the 29 branches in the GZMB phylogeny, 2 of the 35 branches in the IFNG phylogeny, 7 of the 24 branches in the IRF7 phylogeny, 4 of the 35 branches in the KLRD1 phylogeny, and 8 of the 29 branches in the RTP4 phylogeny (Fig 4).

Furthermore, the mammalian innate and adaptive immune systems are evolutionary adaptations that were most likely driven, at least in part, by the need to control the composition of the gut microbiota in ways that enhance host fitness [79]. The immune system provides mechanisms for recognizing and eliminating harmful microbes while allowing beneficial or commensal microbes to persist [69]. Studies have shown that the absence of immune components can disrupt gut microbiota composition in the host. For instance, deleting Toll-like receptors from the host genome in mice disrupts the composition of the gut microbiota, which alters the host's energy harvesting and metabolism in likely maladaptive ways [80]. Similarly, gut microbiota composition differs in RAG1−/− mice, which lack an adaptive immune system.

GARD detected recombination breakpoints in the GBP5, GZMB, IRF7, KLRD1, RTP4, and TRAT1 genes. GARD evaluated 13,556 models at 21.42 models per second. The alignment contained 1183 potential breakpoints, giving a search space of 6358 models with a maximum of 7 breakpoints; the genetic algorithm searched only 0.00% of this space (Fig 6). The AICc score of the best-fitting GARD model, which permits different topologies between segments (29983.2), was compared with that of the model that assumes the same tree for all GARD-inferred partitions but allows different branch lengths between partitions (30120.2). The comparison indicates that the multiple-tree model is preferred over the single-tree model by an evidence ratio of 100 or greater, suggesting that at least one breakpoint reflects a genuine topological incongruence. GARD did not detect signs of recombination in the IFNG and TNFSF4 genes. Here GARD evaluated 2630 models at 49.62 models per second. The alignment contained 409 potential breakpoints, giving a search space of 409 models with a maximum of 1 breakpoint (Fig 6). The genetic algorithm examined 643.03% of this search space. The AICc scores of the best-fitting multiple-topology model (9309.1) and of the single-tree model with partition-specific branch lengths (9309.1) indicate that the multiple-tree model cannot be favored over the single-tree model by an evidence ratio of 100 or more. This suggests that some or all breakpoints may reflect rate variation rather than topological incongruence.

While all mammalian species have gut microbiota, the evolutionary consequences of interacting with them probably vary among mammalian taxa [81]. According to recent investigations, mammalian orders differ in the degree of concordance between gut microbiota composition and host phylogenetic history [82]. The gut microbiotas of certain orders, such as bats, show relatively weak associations with host phylogeny, whereas those of most other orders show robust phylogenetic signal. Mammalian species that lack a species-specific gut microbiome may be spared the risk of evolutionary dependence on a particular gut microbiota [83]. In these circumstances, hosts might evolve to incorporate into their development only signals from ambient or non-specific microorganisms, rather than signals from particular microbes. Alternatively, hosts might stop depending on microbes for development altogether [84]. These hypotheses highlight the need for manipulation studies that include a greater variety of mammalian species whose gut microbiotas differ in phylogenetic signal.
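The GARD model comparisons reported above can be checked with a few lines of arithmetic. The sketch below assumes the conventional Akaike evidence ratio, exp(ΔAICc / 2), which the text does not state explicitly; the function names are hypothetical, and only the AICc scores and model counts are taken from the text.

```python
import math


def log_evidence_ratio(aicc_multi_tree, aicc_single_tree):
    """Natural log of the evidence ratio favoring the multiple-tree model,
    assuming the conventional form exp(delta_AICc / 2)."""
    return (aicc_single_tree - aicc_multi_tree) / 2.0


def multi_tree_preferred(aicc_multi_tree, aicc_single_tree, min_ratio=100.0):
    """True when the multiple-tree model is favored by at least min_ratio.
    The comparison is done in log space to avoid very large numbers."""
    return log_evidence_ratio(aicc_multi_tree, aicc_single_tree) >= math.log(min_ratio)


def percent_of_search_space(models_evaluated, search_space_size):
    """Models evaluated as a percentage of the stated search space."""
    return 100.0 * models_evaluated / search_space_size


# First reported comparison: AICc 29983.2 (multiple trees) vs 30120.2 (single tree).
first = multi_tree_preferred(29983.2, 30120.2)
# Second reported comparison: AICc 9309.1 for both models.
second = multi_tree_preferred(9309.1, 9309.1)
# 2630 models evaluated against a stated search space of 409 models.
coverage = percent_of_search_space(2630, 409)

print(first, second, round(coverage, 2))
```

Under this assumption, an AICc difference of 137 corresponds to a log evidence ratio of 68.5, far above the ln(100) ≈ 4.6 threshold, whereas identical AICc scores give an evidence ratio of 1. The coverage figure exceeds 100% because the number of models evaluated is larger than the stated search space.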
While this study has highlighted key interactions between gut microbiota and the adaptive immune system in various mammalian species, vast unexplored territory remains concerning how these interactions vary across ecological niches and evolutionary histories. Future research could focus on a broader array of species, particularly those from understudied environments, such as deep-sea mammals or high-altitude species. Investigating how extreme environmental pressures influence microbiota-immune system co-evolution could provide new insights into adaptive strategies. Another promising direction is the longitudinal study of immune system development from infancy to adulthood in various mammalian species. This would allow researchers to observe how the immune system evolves and adapts in response to environmental changes, infections, and diet across different life stages. Such studies could shed light on the timing and triggers of key immune system adaptations and how these might differ between species with varying life histories.

The specificity of interactions between the immune system and gut bacteria represents a major knowledge gap. Although some species, such as Bacteroides fragilis and segmented filamentous bacteria (SFB), have been identified as important participants, the full range of microbial species and their distinct roles in immune regulation remain unknown. More thorough research is required to investigate the function of lesser-known microbial species and their interactions with various immune system components. Furthermore, it is still unclear how particular microbial metabolites affect the development and function of immune cells.

Conclusion

This study provides a comprehensive evolutionary and functional analysis of eight critical immune-related genes—GBP5, GZMB, IFNG, IRF7, KLRD1, RTP4, TNFSF4, and TRAT1—revealing their adaptive evolution across mammalian species. Our findings demonstrate that these genes have undergone significant positive selection, particularly in functional domains essential for pathogen recognition and immune regulation. The high correlation in their phylogenetic trees (Pearson r = 0.84–0.95) supports their co-evolution, likely driven by shared selective pressures from pathogens. The identification of positively selected sites within conserved domains, such as GBP5’s GTPase region and GZMB’s serine protease active site, suggests that these genes have fine-tuned their functions to enhance host defense while maintaining core immune activities. A key discovery was the variability in recombination rates among these genes. While GBP5, GZMB, and IRF7 exhibited significant recombination breakpoints—potentially facilitating rapid adaptation—IFNG and TNFSF4 remained stable, possibly due to stronger functional constraints. This dichotomy highlights the balance between evolutionary flexibility and conserved immune mechanisms. The branch-site selection analysis further revealed episodic diversifying selection, with 15–26 branches under significant positive selection, indicating that these genes have repeatedly adapted to emerging pathogenic threats throughout mammalian evolution. Functional annotation linked the selected sites to critical immune processes, including inflammasome activation (GBP5), cytotoxic T-cell function (GZMB), and interferon-mediated antiviral responses (IFNG, IRF7). Tissue-specific expression analysis reinforced their roles in immune-active tissues, with notable enrichment in whole blood, spleen, and lymphocytes. Pathway analysis further connected these genes to Th1/Th2 differentiation and cytokine regulation, underscoring their importance in adaptive immunity. 
These findings have important implications for biomedical research. The positively selected sites identified in this study may serve as targets for immunotherapy or vaccine development, particularly in diseases where immune evasion is a challenge. Additionally, the observed co-evolutionary patterns suggest that these genes function as an integrated network, warranting further investigation into their synergistic roles in host defense.

Supporting information

S1 Table. Mapping of Positively Selected Sites to Functional Protein Domains.

This table maps the amino acid sites identified under positive selection (Posterior Probability > 0.95 from PAML BEB analysis) to known functional domains and discusses their potential functional implications in the context of host-pathogen co-evolution.

https://doi.org/10.1371/journal.pone.0332734.s001

(DOCX)

S2 Table. Data Source Accession Numbers.

This table provides the NCBI Gene ID and Ensembl Gene ID for the eight immune-related genes analyzed across the 42 mammalian species in this study. These accession numbers serve as unique identifiers for the gene sequences retrieved from the NCBI and Ensembl databases. Blank cells indicate that a standard, curated gene record for that species was not available in the respective database at the time of data collection and an alternative genomic scaffold was used for analysis.

https://doi.org/10.1371/journal.pone.0332734.s002

(DOCX)

References

  1. Moran NA, Ochman H, Hammer TJ. Evolutionary and ecological consequences of gut microbial communities. Annu Rev Ecol Evol Syst. 2019;50(1):451–75. pmid:32733173
  2. Thomas F, Hehemann J-H, Rebuffet E, Czjzek M, Michel G. Environmental and gut bacteroidetes: the food connection. Front Microbiol. 2011;2:93. pmid:21747801
  3. Sonnenburg JL, Xu J, Leip DD, Chen C-H, Westover BP, Weatherford J, et al. Glycan foraging in vivo by an intestine-adapted bacterial symbiont. Science. 2005;307(5717):1955–9. pmid:15790854
  4. Hornung B, Martins Dos Santos VAP, Smidt H, Schaap PJ. Studying microbial functionality within the gut ecosystem by systems biology. Genes Nutr. 2018;13:5. pmid:29556373
  5. Mamantopoulos M, Ronchi F, Van Hauwermeiren F, Vieira-Silva S, Yilmaz B, Martens L, et al. Nlrp6- and ASC-Dependent Inflammasomes Do Not Shape the Commensal Gut Microbiota Composition. Immunity. 2017;47(2):339-348.e4. pmid:28801232
  6. Mitri S, Foster KR. The genotypic view of social interactions in microbial communities. Annu Rev Genet. 2013;47:247–73. pmid:24016192
  7. Ganobis C. Characterizing the mouse gut microbiome and improving mouse gut-derived microbial communities for mouse model studies. University of Guelph; 2022.
  8. Sousa A, Ramiro RS, Barroso-Batista J, Güleresi D, Lourenço M, Gordo I. Recurrent Reverse Evolution Maintains Polymorphism after Strong Bottlenecks in Commensal Gut Bacteria. Mol Biol Evol. 2017;34(11):2879–92. pmid:28961745
  9. Ramiro RS, Durão P, Bank C, Gordo I. Low mutational load and high mutation rate variation in gut commensal bacteria. PLoS Biol. 2020;18(3):e3000617. pmid:32155146
  10. Pinto C, Melo-Miranda R, Gordo I, Sousa A. The Selective Advantage of the lac Operon for Escherichia coli Is Conditional on Diet and Microbiota Composition. Front Microbiol. 2021;12:709259. pmid:34367115
  11. Ivanov II, Atarashi K, Manel N, Brodie EL, Shima T, Karaoz U, et al. Induction of intestinal Th17 cells by segmented filamentous bacteria. Cell. 2009;139(3):485–98. pmid:19836068
  12. Round JL, Mazmanian SK. Inducible Foxp3+ regulatory T-cell development by a commensal bacterium of the intestinal microbiota. Proc Natl Acad Sci U S A. 2010;107(27):12204–9. pmid:20566854
  13. Brown EM, Kenny DJ, Xavier RJ. Gut Microbiota Regulation of T Cells During Inflammation and Autoimmunity. Annu Rev Immunol. 2019;37:599–624. pmid:31026411
  14. Ahmed A, Saha B, Patwardhan A, Shivprasad S, Nandi D. The major players in adaptive immunity. Reson. 2009;14(5):455–71.
  15. Maritan E, Quagliariello A, Frago E, Patarnello T, Martino ME. The role of animal hosts in shaping gut microbiome variation. Philos Trans R Soc Lond B Biol Sci. 2024;379(1901):20230071. pmid:38497257
  16. Ecklu-Mensah G, Gilbert J, Devkota S. Dietary Selection Pressures and Their Impact on the Gut Microbiome. Cell Mol Gastroenterol Hepatol. 2022;13(1):7–18. pmid:34329765
  17. Li Y, Lin X, Wang W, Wang W, Cheng S, Huang Y, et al. The Proinflammatory Role of Guanylate-Binding Protein 5 in Inflammatory Bowel Diseases. Front Microbiol. 2022;13:926915. pmid:35722277
  18. Heutinck KM, ten Berge IJ, Hack CE, Hamann J, Rowshani AT. Serine proteases of the human immune system in health and disease. Molecular Immunology. 2010;47:1943–55.
  19. Amoroso C, Perillo F, Strati F, Fantini MC, Caprioli F, Facciotti F. The Role of Gut Microbiota Biomodulators on Mucosal Immunity and Intestinal Inflammation. Cells. 2020;9(5):1234. pmid:32429359
  20. Zhao G-N, Jiang D-S, Li H. Interferon regulatory factors: at the crossroads of immunity, metabolism, and disease. Biochim Biophys Acta. 2015;1852(2):365–78. pmid:24807060
  21. Berntman E. Immuno-modulatory functions of CD1d-restricted natural killer T cells. 2006.
  22. Dostert C, Grusdat M, Letellier E, Brenner D. The TNF Family of Ligands and Receptors: Communication Modules in the Immune System and Beyond. Physiol Rev. 2019;99(1):115–60. pmid:30354964
  23. Choudhury R, Gu Y, Bolhuis JE, Kleerebezem M. Early feeding leads to molecular maturation of the gut mucosal immune system in suckling piglets. Front Immunol. 2023;14:1208891. pmid:37304274
  24. Montalban-Arques A, De Schryver P, Bossier P, Gorkiewicz G, Mulero V, Gatlin DM 3rd, et al. Selective Manipulation of the Gut Microbiota Improves Immune Status in Vertebrates. Front Immunol. 2015;6:512. pmid:26500650
  25. Garud NR, Good BH, Hallatschek O, Pollard KS. Evolutionary dynamics of bacteria in the gut microbiome within and across hosts. PLoS Biol. 2019;17(1):e3000102. pmid:30673701
  26. Kogut MH, Lee A, Santin E. Microbiome and pathogen interaction with the immune system. Poult Sci. 2020;99(4):1906–13. pmid:32241470
  27. Christley S, Cockrell C, An G. Computational Studies of the Intestinal Host-Microbiota Interactome. Computation (Basel). 2015;3(1):2–28. pmid:34765258
  28. Blanco-Míguez A, Fdez-Riverola F, Sánchez B, Lourenço A. Resources and tools for the high-throughput, multi-omic study of intestinal microbiota. Brief Bioinform. 2019;20(3):1032–56. pmid:29186315
  29. Karaduta O, Dvanajscak Z, Zybailov B. Metaproteomics-An Advantageous Option in Studies of Host-Microbiota Interaction. Microorganisms. 2021;9(5):980. pmid:33946610
  30. Kikuta H, Laplante M, Navratilova P, Komisarczuk AZ, Engström PG, Fredman D, et al. Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates. Genome Res. 2007;17(5):545–55. pmid:17387144
  31. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10. pmid:2231712
  32. Bernt M, Donath A, Jühling F, Externbrink F, Florentz C, Fritzsch G, et al. MITOS: improved de novo metazoan mitochondrial genome annotation. Mol Phylogenet Evol. 2013;69(2):313–9. pmid:22982435
  33. Ranwez V, Harispe S, Delsuc F, Douzery EJP. MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One. 2011;6(9):e22594. pmid:21949676
  34. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, et al. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23(22):2947–8.
  35. Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol Biol Evol. 2018;35(6):1547–9. pmid:29722887
  36. Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17(4):540–52. pmid:10742046
  37. Lanfear R, Frandsen PB, Wright AM, Senfeld T, Calcott B. PartitionFinder 2: New Methods for Selecting Partitioned Models of Evolution for Molecular and Morphological Phylogenetic Analyses. Mol Biol Evol. 2017;34(3):772–3. pmid:28013191
  38. Posada D, Crandall KA. Selecting the best-fit model of nucleotide substitution. Syst Biol. 2001;50(4):580–601. pmid:12116655
  39. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30(9):1312–3. pmid:24451623
  40. Le SQ, Dang CC, Gascuel O. Modeling protein evolution with several amino acid replacement matrices depending on site rates. Mol Biol Evol. 2012;29(10):2921–36. pmid:22491036
  41. Ahmad HI, Mahmood S, Hassan M, Sajid M, Ahmed I, Shokrollahi B, et al. Genomic insights into Yak (Bos grunniens) adaptations for nutrient assimilation in high-altitudes. Sci Rep. 2024;14(1):5650. pmid:38453987
  42. Kosakovsky Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SDW. GARD: a genetic algorithm for recombination detection. Bioinformatics. 2006;22(24):3096–8. pmid:17110367
  43. Yang Z, Nielsen R. Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Mol Biol Evol. 2002;19(6):908–17. pmid:12032247
  44. Gutierrez J, Platt R, Opazo JC, Ray DA, Hoffmann F, Vandewege M. Evolutionary history of the vertebrate Piwi gene family. PeerJ. 2021;9:e12451. pmid:34760405
  45. Zheng C, Ferrari D, Yang Y. Model Selection confidence sets by likelihood ratio testing. STAT SINICA. 2019.
  46. Zhang J, Nielsen R, Yang Z. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005;22(12):2472–9. pmid:16107592
  47. White T, van der Ende J, Nichols TE. Beyond Bonferroni revisited: concerns over inflated false positive research findings in the fields of conservation genetics, biology, and medicine. Conserv Genet. 2019;20(4):927–37.
  48. Yang Z, Wong WSW, Nielsen R. Bayes empirical bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005;22(4):1107–18. pmid:15689528
  49. Dyachkova MS, Chekalin EV, Danilenko VN. Positive Selection in Bifidobacterium Genes Drives Species-Specific Host-Bacteria Communication. Front Microbiol. 2019;10:2374. pmid:31681231
  50. Wu Y, Wang H. Convergent evolution of bird-mammal shared characteristics for adapting to nocturnality. Proc Biol Sci. 2019;286(1897):20182185. pmid:30963837
  51. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24(8):1586–91. pmid:17483113
  52. Yang Z. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol. 1998;15(5):568–73. pmid:9580986
  53. Ahmad HI, Afzal G, Iqbal MN, Iqbal MA, Shokrollahi B, Mansoor MK, et al. Positive Selection Drives the Adaptive Evolution of Mitochondrial Antiviral Signaling (MAVS) Proteins-Mediating Innate Immunity in Mammals. Front Vet Sci. 2022;8:814765. pmid:35174241
  54. Ahmad HI, Khan FA, Khan MA, Imran S, Akhtar RW, Pandupuspitasari NS, et al. Molecular Evolution of the Bactericidal/Permeability-Increasing Protein (BPIFA1) Regulating the Innate Immune Responses in Mammals. Genes (Basel). 2022;14(1):15. pmid:36672756
  55. Pond SLK, Frost SDW, Muse SV. HyPhy: hypothesis testing using phylogenies. Bioinformatics. 2005;21(5):676–9. pmid:15509596
  56. Morgan CC, Loughran NB, Walsh TA, Harrison AJ, O’Connell MJ. Positive selection neighboring functionally essential sites and disease-implicated regions of mammalian reproductive proteins. BMC Evol Biol. 2010;10:39. pmid:20149245
  57. Buchan DWA, Jones DT. The PSIPRED Protein Analysis Workbench: 20 years on. Nucleic Acids Res. 2019;47(W1):W402–7. pmid:31251384
  58. Bagdonas H, Fogarty CA, Fadda E, Agirre J. The case for post-predictional modifications in the AlphaFold Protein Structure Database. Nat Struct Mol Biol. 2021;28(11):869–70. pmid:34716446
  59. Obenauer JC, Cantley LC, Yaffe MB. Scansite 2.0: Proteome-wide prediction of cell signaling interactions using short sequence motifs. Nucleic Acids Res. 2003;31(13):3635–41. pmid:12824383
  60. Zhang Y. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics. 2008;9:40. pmid:18215316
  61. Bairoch A, Apweiler R, Wu CH, Barker WC, Boeckmann B, Ferro S, et al. The Universal Protein Resource (UniProt). Nucleic Acids Res. 2005;33(Database issue):D154-9. pmid:15608167
  62. Singh H, Srivastava HK, Raghava GPS. A web server for analysis, comparison and prediction of protein ligand binding sites. Biol Direct. 2016;11(1):14. pmid:27016210
  63. Zafeiropoulos H, Paragkamian S, Ninidakis S, Pavlopoulos GA, Jensen LJ, Pafilis E. PREGO: A Literature and Data-Mining Resource to Associate Microorganisms, Biological Processes, and Environment Types. Microorganisms. 2022;10(2):293. pmid:35208748
  64. Isserlin R, Merico D, Voisin V, Bader GD. Enrichment Map - a Cytoscape app to visualize and explore OMICs pathway enrichment results. F1000Res. 2014;3:141. pmid:25075306
  65. Consortium G. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science. 2020;369(6509):1318–30. pmid:32913098
  66. GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science. 2020;369(6509):1318–30. pmid:32913098
  67. Tlaskalova-Hogenova H, Tuckova L, Mestecky J, Kolinska J, Rossmann P, Stepankova R, et al. Interaction of mucosal microbiota with the innate immune system. Scand J Immunol. 2005;62 Suppl 1:106–13. pmid:15953193
  68. Goto Y, Kiyono H. Epithelial barrier: an interface for the cross-communication between gut flora and immune system. Immunol Rev. 2012;245(1):147–63. pmid:22168418
  69. Müller V, de Boer RJ, Bonhoeffer S, Szathmáry E. An evolutionary perspective on the systems of adaptive immunity. Biol Rev Camb Philos Soc. 2018;93(1):505–28. pmid:28745003
  70. Lighten J, Papadopulos AST, Mohammed RS, Ward BJ, G Paterson I, Baillie L, et al. Evolutionary genetics of immunological supertypes reveals two faces of the Red Queen. Nat Commun. 2017;8(1):1294. pmid:29101318
  71. Moeller AH, Sanders JG. Roles of the gut microbiota in the adaptive evolution of mammalian species. Philos Trans R Soc Lond B Biol Sci. 2020;375(1808):20190597. pmid:32772670
  72. Jiang H, Li J, Li L, Zhang X, Yuan L, Chen J. Selective evolution of Toll-like receptors 3, 7, 8, and 9 in bats. Immunogenetics. 2017;69(4):271–85. pmid:28013457
  73. Luca F, Perry GH, Di Rienzo A. Evolutionary adaptations to dietary changes. Annu Rev Nutr. 2010;30:291–314. pmid:20420525
  74. Sprouse ML, Bates NA, Felix KM, Wu H-JJ. Impact of gut microbiota on gut-distal autoimmunity: a focus on T cells. Immunology. 2019;156(4):305–18. pmid:30560993
  75. Rosshart SP, Herz J, Vassallo BG, Hunter A, Wall MK, Badger JH, et al. Laboratory mice born to wild mice have natural microbiota and model human immune responses. Science. 2019;365(6452):eaaw4361. pmid:31371577
  76. Zhang J, Zhang Y, Rosenberg HF. Adaptive evolution of a duplicated pancreatic ribonuclease gene in a leaf-eating monkey. Nat Genet. 2002;30(4):411–5. pmid:11925567
  77. Amato KR, Yeoman CJ, Kent A, Righini N, Carbonero F, Estrada A, et al. Habitat degradation impacts black howler monkey (Alouatta pigra) gastrointestinal microbiomes. ISME J. 2013;7(7):1344–53. pmid:23486247
  78. Birdsey GM, Lewin J, Holbrook JD, Simpson VR, Cunningham AA, Danpure CJ. A comparative analysis of the evolutionary relationship between diet and enzyme targeting in bats, marsupials and other mammals. Proc Biol Sci. 2005;272(1565):833–40. pmid:15888416
  79. Ward AE, Rosenthal BM. Evolutionary responses of innate immunity to adaptive immunity. Infect Genet Evol. 2014;21:492–6. pmid:24412725
  80. Peterson DA, Cardona RAJ. Specificity of the adaptive immune response to the gut microbiota. Adv Immunol. 2010;107:71–107. pmid:21034971
  81. Barroso-Batista J, Demengeot J, Gordo I. Adaptive immunity increases the pace and predictability of evolutionary change in commensal gut bacteria. Nat Commun. 2015;6:8945. pmid:26615893
  82. Muegge BD, Kuczynski J, Knights D, Clemente JC, González A, Fontana L, et al. Diet drives convergence in gut microbiome functions across mammalian phylogeny and within humans. Science. 2011;332(6032):970–4. pmid:21596990
  83. Brown BRP, Goheen JR, Newsome SD, Pringle RM, Palmer TM, Khasoha LM, et al. Host phylogeny and functional traits differentiate gut microbiomes in a diverse natural community of small mammals. Mol Ecol. 2023;32(9):2320–34. pmid:36740909
  84. Lerouge I, Vanderleyden J. O-antigen structural variation: mechanisms and possible roles in animal/plant-microbe interactions. FEMS Microbiol Rev. 2002;26(1):17–47. pmid:12007641