The evolution of UDP-glycosyl/glucuronosyltransferase 1E (UGT1E) genes in bird lineages is linked to feeding habits, but that of UGT2 genes is not

UDP-glycosyltransferase (UGT) catalyzes the transfer of glycosyl groups (e.g., glucuronic acid) to exogenous or endogenous chemicals and plays an important role in conjugation reactions. In vertebrates, UGT genes are divided into 5 families: UGT1, UGT2, UGT3, UGT5, and UGT8. Among these UGT enzymes, UGT1 and UGT2 enzymes are known to be important xenobiotic metabolizing enzymes in mammals. However, little is known about UGT1 and UGT2 genes in avian species. In this study, we therefore aimed to classify avian UGT1 and UGT2 genes based on their evolutionary relationships. We also investigated the association between UGT molecular evolution and ecological factors, specifically feeding habits, habitat, and migration. By examining the genomes of 43 avian species with differing ecology, we showed that avian UGT1E genes are divided into 6 groups and UGT2 genes into 3 groups. Correlations between UGT gene count and ecological factors suggested that the number of UGT1E genes is decreasing in carnivorous species. Estimates of selection pressure also support the hypothesis that diet influenced avian UGT1E gene evolution, similar to mammalian UGT1A and UGT2B genes.


Introduction
UDP-glycosyltransferase (UGT) catalyzes the transfer of glycosyl groups (including glucuronic acid, glucose, glycoside, and galactose) to exogenous or endogenous chemicals [1]. Vertebrate UGT genes are classified into 5 families: UGT1, UGT2, UGT3, UGT5, and UGT8 [2][3][4]. Within each UGT family, genes were amplified by tandem duplication, and some were specifically amplified in, or are even absent from, particular lineages. For instance, UGT5 family genes are found only in teleost fishes [3], while the UGT3 family is absent in chicken, turkey, and zebra finch [4], suggesting that avian species have 3 UGT families (UGT1, UGT2, and UGT8). Among these UGT enzymes, UGT1, UGT2, and UGT5 family enzymes were reported to catalyze the conjugation of exogenous chemicals in zebrafish [5]. UGT1 and UGT2 family enzymes are known to be related to xenobiotic metabolism in mammals as well [6]. Therefore, UGT1 and UGT2 enzymes are likely to be involved in xenobiotic metabolism in avian species as well.
UGT1 enzymes use UDP-glucuronic acid (UDPGA) as the donor for glucuronic acid transfer [1,6]. Functional differences among UGT1As derive from a variable first exon among the 5 exons that constitute these genes (exons 2-5 are conserved), an arrangement comparable to the immunoglobulin variation that generates a robust immune defense [7]. In humans, UGT1A genes are divided into two functional groups, although this division is imperfect because of the genes' complex roles. Bilirubin-like-associated enzymes (UGT1A1, UGT1A3, UGT1A4, UGT1A5) comprise the first group, whereas phenol-like-associated enzymes (UGT1A6, UGT1A7, UGT1A8, UGT1A9, UGT1A10) comprise the second [6]. Although nearly all mammals possess UGT1A1 for conjugating bilirubin, mammalian UGT1A6 (important for xenobiotic metabolism) has become a pseudogene in carnivorous mammals, including cats, brown hyenas, and northern fur seals [8,9], likely because their diet does not contain harmful plant compounds. Moreover, our previous study indicated that the number of UGT1A genes is reduced in carnivorous mammals [9].
Similar to UGT1, UGT2 enzymes use UDP-glucuronic acid (UDPGA) [1]. In humans, 6 exons encode UGT2A1, 2A2, and 2A3; exons 2-6 are shared in UGT2A1 and 2A2, whereas UGT2A3 has only unique exons [2]. Although UGT2A1 and UGT2A2 are highly active in bile acid glucuronidation [10], UGT2A genes are mainly expressed in the nasal epithelium [11,12] and are also known to metabolize steroids [13]. Mammalian UGT2Bs are composed of 6 separately coded exons [2] and are abundantly expressed in the liver [1]. Human UGT2B enzymes conjugate endogenous compounds such as steroid hormones, retinoids, and fatty acids, as well as exogenous compounds including morphine, zidovudine, and nonsteroidal anti-inflammatory drugs [6]. Previously, we reported that a UGT2B31-like gene in Felidae has become a pseudogene and that, similar to UGT1A genes, UGT2B genes are fewer in carnivorous mammals [14].
Several reports have examined the relationship of UGT1 and UGT2 genes among vertebrates [3,4] to obtain a better understanding of their molecular evolution. For instance, zebra finch UGT genes were evaluated to determine the evolutionary relationships of vertebrate UGT1 and UGT2 [3]. Similarly, the genomic structures of vertebrate UGT1 and UGT2 genes [4] were uncovered using data from chicken, turkey, zebra finch, and other vertebrate genomes. However, no report has comprehensively classified avian UGT genes based on their evolutionary relationships to other vertebrate UGTs.
In this study, we performed phylogenetic and synteny analyses to classify avian UGT1 and UGT2 genes, using data from 43 avian species representing 32 orders. Moreover, we aimed to clarify UGT evolution in birds by investigating the influence of key ecological factors (feeding habit, habitat, and migration). Our analyses yielded the first comprehensive classification of UGT1 and UGT2 genes in birds and confirmed that feeding habit (specifically carnivory) influenced the evolution of this gene family.

Synteny analysis
The chromosomal locations of annotated genes were determined using genomic data from 43 bird species in GenBank (accession numbers in S2 Table). Human, mouse, green anole, western clawed frog, and zebrafish Ensembl gene locations were also used. Graphical representations of gene location were generated with the genoPlotR package [17] in R version 3.3.2 (R Core Team 2016).

Phylogenetic analysis
Gene location, maximum likelihood (ML) phylogenetic analysis, and BLASTn searches were used to classify UGT families and to select UGT1 and UGT2 genes for Bayesian phylogenetic analysis. UGT1 and UGT2 genes were divided into exon1 and the remaining exons, which were analyzed separately for phylogeny construction. Amino acid sequences were aligned in MAFFT version 7.2 [18] with the auto option and trimmed in trimAl [19] with the automated1 option. For model selection and phylogenetic analysis of UGT1 exon1, sequences longer than 200 bp and without gaps longer than 15 bp were chosen (see supplementary information: S1-S4 Files). The best-fit model was selected using the Bayesian information criterion (BIC) calculated by CodeML on Aminosan [20,21]. Phylogenetic analysis of each UGT family (UGT1 and UGT2) was performed in MrBayes5D [22][23][24][25] using 4 chains (3 heated, 1 cold). Models, MCMC generations, and burn-in generations are shown in S3 Table. Tracer 1.6 [26] was used to check for stabilization and convergence between runs.
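The exon1 inclusion rule described above (longer than 200 bp, no gap longer than 15 bp) can be sketched in Python as follows. This is only an illustration and not the authors' pipeline: it assumes that gaps are written as "-" in an aligned sequence, that length is counted on ungapped residues, and that the function names and toy sequences are hypothetical.

```python
import re


def passes_exon1_filter(aligned_seq, min_length=200, max_gap_run=15):
    """Return True if a sequence is longer than min_length and has no long gap.

    Assumptions (not stated in the source): gaps are '-' characters and
    length is measured after removing them.
    """
    ungapped_length = len(aligned_seq.replace("-", ""))
    gap_runs = re.findall(r"-+", aligned_seq)
    longest_gap = max((len(run) for run in gap_runs), default=0)
    return ungapped_length > min_length and longest_gap <= max_gap_run


def filter_alignment(records):
    """Keep only the named sequences that satisfy the inclusion rule."""
    return {name: seq for name, seq in records.items() if passes_exon1_filter(seq)}


# Hypothetical toy sequences, used only to exercise the rule.
toy_records = {
    "toy_long": "ATG" * 70,                               # 210 bp, no gaps
    "toy_short": "ATG" * 60,                              # 180 bp, too short
    "toy_gap16": "ATG" * 40 + "-" * 16 + "ATG" * 40,      # gap of 16, rejected
    "toy_gap15": "ATG" * 40 + "-" * 15 + "ATG" * 40,      # gap of 15, kept
}
kept = filter_alignment(toy_records)
print(sorted(kept))
```

Whether terminal gaps or assembly gaps (runs of N) should count toward the 15 bp limit is not specified in the text, so that choice would need to be confirmed against the supplementary files.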

Phylogenetic generalized least squares analysis
To determine whether UGT gene count was correlated with ecology, the feeding habit, habitat, and migration status of each bird species were first classified based on Almeida et al. (S1 Table) [27]. To correct for phylogenetic autocorrelation, phylogenetic generalized least squares (PGLS) regression was performed [28,29] in R, with the gls function under a Brownian motion correlation structure (corBrownian). The ape [30], phytools [31], and geiger [32] packages were employed. The avian phylogenetic tree constructed by Prum et al. [33] was modified and used for PGLS analysis (S1 Fig). Model selection was performed with the Akaike information criterion (AIC).
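To make the PGLS step concrete, the following Python sketch computes the generalized least squares estimate under a Brownian motion covariance, in which the covariance between two species equals the branch length they share from the root. It is not the R code used in the study: the four-tip tree, the gene counts, and all function names are hypothetical, and only one binary predictor (carnivory) is modeled.

```python
def brownian_covariance(paths):
    """Covariance under Brownian motion: shared root-to-tip branch length.

    paths maps each tip to the (branch_name, length) pairs on its path
    from the root.
    """
    tips = list(paths)
    cov = [[sum(length for _, length in set(paths[a]) & set(paths[b]))
            for b in tips] for a in tips]
    return tips, cov


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def pgls(cov, x, y):
    """GLS estimate beta = (X' V^-1 X)^-1 X' V^-1 y for y ~ intercept + x."""
    n = len(y)
    design = [[1.0, xi] for xi in x]
    vinv_x = solve(cov, design)                 # V^-1 X
    vinv_y = solve(cov, [[yi] for yi in y])     # V^-1 y
    xtvx = [[sum(design[k][i] * vinv_x[k][j] for k in range(n))
             for j in range(2)] for i in range(2)]
    xtvy = [[sum(design[k][i] * vinv_y[k][0] for k in range(n))]
            for i in range(2)]
    beta = solve(xtvx, xtvy)
    return beta[0][0], beta[1][0]


# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1) and hypothetical gene counts.
toy_paths = {
    "A": [("AB", 1.0), ("A", 1.0)],
    "B": [("AB", 1.0), ("B", 1.0)],
    "C": [("CD", 1.0), ("C", 1.0)],
    "D": [("CD", 1.0), ("D", 1.0)],
}
tips, cov = brownian_covariance(toy_paths)
is_carnivore = [1.0, 1.0, 0.0, 0.0]   # A and B treated as carnivores
gene_count = [3.0, 4.0, 6.0, 5.0]
intercept, slope = pgls(cov, is_carnivore, gene_count)
print(intercept, slope)
```

A negative slope in such a model would correspond to carnivores carrying fewer genes after accounting for shared ancestry; competing models (for example, with habitat or migration as predictors) would then be compared by AIC, as described above.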

Estimating selection pressure
Feeding habits may have exerted differing levels of selective pressure on UGT1 genes. To examine these potential differences, the omega ratios (ratio of nonsynonymous to synonymous substitution rates; dN/dS) of phylogenetic branches were estimated with CodeML in PAML4.9 [20], using the codon alignment and tree topology from the phylogenetic analysis (S2 Data). An omega ratio greater than, equal to, or less than one indicates positive, neutral, or negative selection pressure, respectively. The F3×4 codon frequency model was applied for estimating omega and kappa (transition/transversion) ratios. Three models were applied for estimating the omega ratio: a homogeneous omega ratio across all feeding habits (carnivory, omnivory, and herbivory); different omega ratios between carnivorous species and other feeding habits; and different omega ratios across all three feeding habits. Likelihood ratio tests were performed to determine model fit.
Omega ratios of UGT1 sites were also estimated, and those under positive selection were predicted using the Bayes-Empirical-Bayes (BEB) test implemented with CodeML in PAML4.9 [20].
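The likelihood ratio tests used to compare the nested branch models can be illustrated with a short Python sketch. The test statistic is twice the difference in log-likelihood, compared against a chi-square distribution whose degrees of freedom equal the number of extra omega parameters. The log-likelihood values below are hypothetical placeholders, not results from this study, and the function names are illustrative.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail chi-square probability, closed forms for df = 1 or 2 only."""
    if stat <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for a nested model comparison."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods for the three branch models.
lnl_one_ratio = -1000.0     # one omega for all feeding habits
lnl_two_ratio = -997.0      # carnivores vs. all other species
lnl_three_ratio = -996.5    # carnivores, omnivores, herbivores

stat_12, p_12 = likelihood_ratio_test(lnl_one_ratio, lnl_two_ratio, df=1)
stat_13, p_13 = likelihood_ratio_test(lnl_one_ratio, lnl_three_ratio, df=2)
print(stat_12, p_12)
print(stat_13, p_13)
```

The degrees of freedom assume that each additional feeding-habit category adds exactly one free omega parameter, which matches the three models described above.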

Classification of UGT families in bird lineages
TBLASTN searches of 43 bird species, using UGT genes from 5 other vertebrates (zebrafish, western clawed frog, green anole, mouse, and human) as queries, yielded 196 UGT1 and 108 UGT2 gene sequences (S2 Table). Nearly every tested bird species possessed UGT1 and UGT2 family genes.

Relationships between the number of UGT genes and ecological factors of birds
Our analyses indicated that habitat and migration did not significantly impact the number of UGT genes, but feeding habits did (Fig 4). The best PGLS model (lowest AIC) indicated that carnivorous species had a lower UGT1E count than omnivorous and herbivorous species (Table 1). However, the number of UGT2 genes had no clear relationship to feeding habits.

Estimating the selection pressure exerted by feeding habit on UGT1E exon1
The homogeneous model estimated the omega ratio to be 0.392. In the model separating carnivory from other feeding habits, omega ratios were estimated as 0.329 (carnivores) and 0.436 (herbivores and omnivores). In the third model separating all three feeding habits, omega ratios were estimated as 0.329 (carnivores), 0.409 (omnivores), and 0.469 (herbivores). The likelihood ratio test indicated that omega ratios were significantly different across feeding habits (p < 0.05).

Detecting positive selection sites on UGT1E exon1
The BEB analysis [34] of estimated omega ratios (dN/dS) for UGT1E exon1 revealed 13 amino acid sites under positive selection (Fig 5). Five of these 13 sites were located in a region related to aglycone variation in human UGT1As [35].

Discussion
Our classification of avian UGT1 and UGT2 family genes clarified the evolutionary relationships of genes in bird lineages. First, UGT2 family genes were classified into 3 groups (Fig 3). Mammalian UGT2B genes all exhibit variable exons 1-6, whereas mammalian UGT2A1 and UGT2A2 genes share exons 2-6 [2]. Similarly to mammalian UGT2A1 and UGT2A2, avian UGT2 genes share exons 2-6. Thus, depending on which exons are used to construct the phylogram, different interpretations of mammalian and avian relationships arise. Phylograms of UGT2 exons 2-6 indicate that mammalian and avian UGT2 genes are distinct (S3 Fig). This suggests that the common ancestor of birds and mammals possessed a single set of UGT2 exons 2-6, which was then duplicated in the mammalian lineage. In contrast, phylograms of UGT2 exon1 implied similarity between bird_UGT2_group_III and mammalian UGT2A1 and 2A2 genes (Fig 3). In humans, UGT2A1 and UGT2A2 are mainly expressed in the nasal epithelium [11,12] in stable amounts. In contrast, UGT2B genes are abundantly expressed in the liver in variable amounts based on feeding habits [14]. Here, we observed relatively stable numbers of avian UGT2 genes, suggesting that in birds, UGT2 enzymes likely conjugate endogenous compounds and are more similar to mammalian UGT2A than UGT2B.

Fig 1. Phylogenetic classification of UGT1s in bird species. The phylogenetic tree of avian UGT1 exon1 was constructed in MrBayes5D, using sequences longer than 200 bp and without gaps longer than 15 bp. Avian UGT1 exon1 sequences were divided into 6 major groups. Groups III and IV were further divided into 6 and 2 subgroups, respectively.
Our phylogram of UGT2 genes also appeared to differ from that of a previous study, which indicated that mammalian UGT2A genes formed one clade and that each human gene has an orthologue in mice and other mammals [2,12]. The reason why UGT2A3 genes did not form one clade in this study could involve differences in the exon region used.

[Fig 2 caption] Species are color-coded based on feeding habits: red, carnivorous; yellow, omnivorous; and green, herbivorous. UGT1 information was retrieved from a single contig. Avian UGT1Es were located between "USP40" and "SH3BP4". Roman numerals on the arrows indicate the UGT1 group number, "Ps" indicates pseudogenes, and "XXX" indicates unclassified (<200 bp) genes. Synteny of UGT1 exon1 was well conserved among bird species.

[Fig 3 caption] a) The phylogenetic tree based on avian UGT2 exon1 was constructed in MrBayes5D. Avian UGT2 exon1s are divided into 3 major groups (indicated with Roman numerals on arrows), with Group_I and Group_II forming a clade distinct from mammalian UGT2. However, bird UGT2 Group_III genes formed a single clade with mammalian UGT2A1 and UGT2A2. b) Avian phylogeny and gene locations of avian UGT2 genes were visualized in genoPlotR. All UGT2 information was retrieved from a single contig. Avian UGT2s were located between "SULT1" and "YTHDC1." Exons2-5 were shared across every avian species. Synteny of UGT2 exon1 was well conserved in birds.

Some controversy exists regarding the nomenclature of avian UGT1 genes. Zebra finch UGT1 genes were named UGT1As based on evolutionary relationships [3]. However, the UGT nomenclature committee named UGT1 genes in chicken as UGT1Es based on their sequence similarity (https://prime.vetmed.wsu.edu/resources/udp-glucuronsyltransferasehomepage/current-nomenclature). In this study, we followed the second nomenclature guidelines and named UGT1 genes as UGT1Es. Our phylogenetic and synteny analyses classified UGT1 family genes into 6 major groups (Figs 1, 2 and 6). The results suggest that the avian common ancestor possessed 6 UGT1 genes that were subsequently duplicated in each lineage. Genomic organization also showed that some UGT1Es became pseudogenes in each lineage (Fig 2). This suggests that UGT1E genes underwent frequent duplication and loss, consistent with birth-and-death evolution. Notably, the number of UGT1E_group_III genes varies among birds and may be important for conjugating different exogenous compounds.
Our analysis found a significant relationship between UGT1E gene count and avian feeding habits, with carnivores possessing fewer UGT1Es than herbivores or omnivores (Figs 4 and 6 and Table 1). When we examined the selection pressure on UGT1 exon1 to determine whether such differences were due to natural selection, we did not find evidence of positive selection (omega ratio > 1) on any phylogenetic branch. However, the higher omega ratio in herbivorous species suggests that UGT1E enzymes play an important role in metabolizing toxic chemicals synthesized by plants.
However, few data on UGT1 structure are available to corroborate this potential function, even in humans. Available reports found a variable region around residues 105-131 in human UGT1As that appears to confer aglycone specificity [35] (Fig 5). This region is similar to the positively selected sites in avian UGT1Es, suggesting a link to aglycone specificity.
Indeed, some reports have described a similar relationship between xenobiotic metabolizing enzymes and feeding habits. Consistent with our study, carnivorous mammals, for example, have a reduced number of UGT1A genes [9]. However, we did not observe a correlation between UGT2 gene count and avian feeding habit, in contrast with mammalian UGT2 genes. Among mammals, the number of UGT2B genes in carnivorous species was lower than that in herbivorous and omnivorous species [14]. This difference suggests that UGT2 gene evolution in birds and mammals experienced different selective pressures.
Here, we did not find any link between other ecological factors (habitat, migration) and UGT gene count. Our findings on the influence of ecology thus partially contradict existing avian studies on the relationship between ecological factors and genes related to xenobiotic metabolism. For example, carnivorous birds or those living in wet habitats tend to possess less sensitive aryl hydrocarbon receptors (AhRs), which regulate the expression of genes encoding xenobiotic metabolizing enzymes (e.g., UGT1s and CYP1s). Avian AhRs are divided into three genetic types: highly sensitive (Ile-324 and Ser-380), moderately sensitive (Ile-324 and Ala-380), and less sensitive to dioxin (Val-324 and Ala-380) [36,37]. Certain species may have more of the third genetic type because they receive elevated levels of naturally occurring dioxins through the food web in specific habitats or under specific diets [38]. The lack of any observed relationship between UGTs and habitat in this study could be because UGTs are not directly involved in dealing with toxicity from dioxin-like compounds [36].
Previous research has also found that the omega ratios of the avian cytochrome P450 (CYP) genes CYP2C23 (classically called CYP2H) and CYP2J_2 differ across feeding habits [27]; these enzymes are important for functionalization reactions in xenobiotic metabolism. Among passerines, insectivores and granivores exhibit differing levels of ethoxyresorufin-O-deethylase (EROD) activity, which itself reflects CYP1A activity. Specifically, EROD activity is higher in insectivores than in granivores, possibly because some insects use defensive compounds from plants [39]. Still other reports have suggested that conjugation enzymes are important in nectar-eating birds for metabolizing nicotine, a process that varies in mechanism across avian species [40]. Together, these studies indicate that increased granularity in feeding classifications may yield a clearer picture of how diet influences UGT evolution. However, given the limited whole-genome data available at this point, we were unable to discriminate between insectivores and other carnivores, granivores and other herbivores, or nectarivores and other omnivores. Therefore, we recommend a focus on increasing whole-genome data for avian species to enhance investigations of the evolution of UGTs and other major genes in birds.
In this study, we found that ecological factors did not cause significant differences in the number of avian UGT2 genes. Given that mammalian UGT2 genes respond to ecological variation, our results suggest that different selective pressures influenced UGT2 evolution in birds versus mammals. However, UGT1 gene counts varied between feeding habits, with carnivores possessing significantly fewer genes than either herbivores or omnivores. Therefore, we conclude that diet exerted a clear effect on the evolution of avian UGT1E genes.