Evolution of Hepatic Glucose Metabolism: Liver-Specific Glucokinase Deficiency Explained by Parallel Loss of the Gene for Glucokinase Regulatory Protein (GCKR)

Background
Glucokinase (GCK) plays an important role in the regulation of carbohydrate metabolism. In the liver, phosphorylation of glucose to glucose-6-phosphate by GCK is the first step for both glycolysis and glycogen synthesis. However, some vertebrate species are deficient in GCK activity in the liver, despite containing GCK genes that appear to be compatible with function in their genomes. Glucokinase regulatory protein (GCKR) is the most important post-transcriptional regulator of GCK in the liver; it participates in the modulation of GCK activity and location depending upon changes in glucose levels. In experimental models, loss of GCKR has been shown to associate with reduced hepatic GCK protein levels and activity.

Methodology/Principal Findings
GCKR genes and GCKR-like sequences were identified in the genomes of all vertebrate species with available genome sequences. The coding sequences of GCKR and GCKR-like genes were identified and aligned; base changes likely to disrupt coding potential or splicing were also identified.

Conclusions/Significance
GCKR genes could not be found in the genomes of 9 vertebrate species, including all birds. In addition, in multiple mammalian genomes, whereas GCKR-like gene sequences could be identified, these genes could not predict a functional protein. Vertebrate species that were previously reported to be deficient in hepatic GCK activity were found to have deleted (birds and lizard) or mutated (mammals) GCKR genes. Our results suggest that mutation of the GCKR gene leads to hepatic GCK deficiency due to the loss of the stabilizing effect of GCKR.


Introduction
Glucose, a major source of energy for all tissues, is obtained from the diet and stored as glycogen in the liver and muscle when in excess. Storage and release of glucose is a tightly regulated process involving numerous enzymes and regulatory proteins, with glucokinase (GCK) being a key regulatory enzyme [1]. GCK is an isozyme of the hexokinases, which catalyze the phosphorylation of six-carbon sugars. GCK differs from the other hexokinases in its affinity for glucose and end-product inhibition [2]. GCK is the major glucose-phosphorylating enzyme in the liver, pancreatic islet beta-cells, and a few other glucose-sensing cells of the gut and brain [1][2][3]. In the liver, GCK catalyzes the first, rate-limiting step in glucose utilization leading to glucose storage as glycogen, while in pancreatic beta-cells GCK acts as a glucose sensor and controls the secretion of insulin [2,3]. GCK also appears to have similar glucose-sensing functions in some cells of the gut and the brain [1,3]. Mutations that prevent GCK expression or function in liver and pancreatic beta-cells are known to result in the maturity-onset diabetes of the young 2 (MODY2) form of diabetes [4].
Conversely, activating mutations of GCK that elevate GCK activity in liver and pancreatic beta-cells are known to result in persistent hyperinsulinemic hypoglycemia of infancy (PHHI), in which high insulin levels induce low blood glucose levels [5].
A pair of tissue-specific promoters drives the expression of the GCK gene: an upstream beta-cell-specific promoter that is also used in gut and neuronal cells, and a downstream liver-specific promoter [3,6]. Liver-specific GCK expression is absolutely dependent upon the presence of insulin and is repressed by glucagon [3,6-8]. Consequently, GCK expression increases after feeding or with insulin treatment, and is repressed during starvation or insulin deficiency [7,8]. Hormonal regulation allows GCK to be active only when there is excess blood glucose, but the insulin effect on GCK is not potentiated by high glucose concentration [3,6-8]; in fact, glucose represses GCK expression, potentially to protect phosphate homeostasis [9]. In contrast to the nutritional regulation of GCK gene expression in the liver, little change in GCK mRNA levels is seen with changes in blood glucose levels or insulin in pancreatic beta-cells, although some change in mRNA stability might occur [3,10]. The differences in the regulation of GCK expression in these two major sites of expression are likely related to the tissue-specific functions: whereas the liver only needs GCK expression when there is excess blood glucose, pancreatic islet beta-cells require constant expression of GCK as a sensor for measuring blood glucose levels [2,3].
In addition to regulation at the transcriptional level, GCK is also regulated in the liver at the post-transcriptional level by glucokinase regulatory protein (GCKR) [11,12]. GCKR is primarily expressed in the liver [13,14], and possibly the brain [15], with little if any expression in pancreatic beta-cells [16,17]. In the liver, GCKR functions as both an inhibitor and a nuclear binding protein for GCK [11,18,19]. Under low glucose conditions, GCKR binds, inactivates and sequesters cytosolic GCK into the nucleus. With an influx of glucose, GCK within the nucleus is re-activated and returns to the cytoplasm [20][21][22]. Phosphorylated forms of fructose, whose levels reflect glucose metabolism, modulate the binding of GCK with GCKR: fructose-1-phosphate weakens the GCK-GCKR interaction, while fructose-6-phosphate promotes GCKR-GCK binding and subsequent GCK inactivation [3,12,23]. GCKR also has a role in stabilizing GCK and preventing its degradation [20][21][22]. Mice deficient in GCKR have normal or raised GCK mRNA levels but decreased liver GCK protein concentration and activity [24,25]. Furthermore, overexpression of GCKR in hepatocytes by adenoviral vectors increases both GCK protein and enzymatic activity [26]. These observations suggest that the abundance of GCKR may be an important regulator of GCK protein levels specifically in the liver.
The livers of some mammalian (e.g., cow and cat) and nonmammalian (e.g., birds and lizards) vertebrate species have been reported to be deficient in GCK activity [27]. Hexokinases have been examined from the livers of numerous vertebrate species and their chromatographic profiles suggest that the livers of some mammals, including ruminants (e.g., cow and sheep), bats, and cats, as well as other vertebrate species such as birds and some reptiles, have little if any GCK activity [13,27-31]. In these surveys, species deficient in GCK were found to possess activities of the three other hexokinases (i.e., hexokinases I, II, and III) at levels similar to those of species that have GCK [27][28][29][30][31], suggesting that the specific depletion of GCK in these species was not compensated by an increase in the activity of another hexokinase. As these species do not demonstrate symptoms of diabetes or poor glucose metabolism, they most likely possess intact glucose-sensing mechanisms; thus, mutations that inactivate GCK function seem unlikely. This conclusion is supported by our recent characterization of GCK genes from diverse species, where intact GCK genes were found in the genomes of most vertebrates examined [32]. An exception was the genome sequences of two bat species (flying fox and little brown bat), where the GCK genes were found to be missing their liver-specific first exon, which might prevent expression of GCK in the liver but not other sites [32].
As indicated above, mice with a targeted disruption of the GCKR gene show a specific deficiency of GCK activity in their liver [24,25]. Low GCK activity is accompanied by an absence of GCKR in the liver of cats [33]. Since GCKR is the most important post-transcriptional regulator of GCK in most species and is expressed almost exclusively in the liver of vertebrates that express GCK, including amphibians and reptiles [13], and not in the pancreatic beta-cells [16,17], the other major site of GCK function, these observations suggest that the loss of GCKR may explain a liver-specific GCK deficiency. GCKR genes, though, have only been identified or characterized in a few species [34][35][36]. A previous evolutionary analysis of GCKR sequences identified GCKR-like genes in mammals, amphibians, and fish, as well as in non-vertebrate species, and concluded that GCKR evolved from an N-acetylmuramate 6-phosphate esterase by changing its binding specificity and losing its esterase activity [36]. To address the hypothesis that liver-specific GCK deficiency is caused by the loss of GCKR, we identified and characterized GCKR genes from the genomes of diverse vertebrate species. Our results show that species deficient in liver GCK activity have mutated or deleted GCKR genes, supporting the hypothesis that the loss of GCKR, and its ability to stabilize GCK, explains the loss of liver-specific GCK activity.

Genome sequence data
Genomic sequences encoding GCKR-like protein sequences were downloaded from release 69 of the Ensembl and PreEnsembl databases (www.ensembl.org and pre.ensembl.org) in August 2012. Genomes from all available vertebrate species maintained in these databases were searched by gene name, gene symbol (GCKR), or by similarity searches with the tblastn algorithm [37] using the human GCKR protein sequence [35]. Searches of the non-redundant and genome databases maintained at the National Center for Biotechnology Information (NCBI, www.ncbi.nlm.nih.gov) were used to complement the genomes gathered from the Ensembl database and aided in the identification of incomplete genes or coding region sequences. Sequences similar to GCKR were not found in several species. For these species, genomic sequences for genes in the predicted conserved genomic neighborhood were included in this analysis. Genomes were searched for orthologs of the genes that flank the GCKR genes in diverse species (see Results for details), and the genomic sequences adjacent to these genes were searched for similarity to GCKR genes.

Alignment of sequences
Long genomic DNA sequences that included the GCKR gene were aligned with MultiPipMaker (pipmaker.bx.psu.edu/pipmaker/) [38,39]. Human and mouse GCKR genes were used as masters for these alignments, and the locations of exons and coding regions for these genes were obtained from the Ensembl or NCBI databases. The identity and locations of repetitive elements in the human and mouse genomic sequences were identified using RepeatMasker (www.repeatmasker.org). Genomic alignments were used to refine the predicted potential coding regions of the genes. Potential pseudogenes were identified as sequences that failed to predict open reading frames due to the presence of base changes that either introduced stop codons, created frame shifts that disrupted the coding sequence, or disrupted splicing consensus sequences. Predicted open reading frames and protein sequences were aligned with the ClustalW algorithm [40] as implemented in MEGA5 [41], with the protein sequences used as guides for the nucleotide sequence alignments.
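The pseudogene criteria described above (premature stop codons, frame shifts, and disrupted splice consensus sequences) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `Exon` record, the `scan_exons` function, and the toy sequences are hypothetical, frame shifts are approximated as exon length differences relative to a reference exon that are not a multiple of three, and only the canonical GT donor and AG acceptor dinucleotides are accepted.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class Exon:
    seq: str       # coding bases of the exon in the query species
    ref_len: int   # length of the orthologous reference exon (e.g. human)
    acceptor: str  # last two intron bases before the exon ("" for first exon)
    donor: str     # first two intron bases after the exon ("" for last exon)


def scan_exons(exons):
    """Return a list of (exon_number, problem) tuples for one gene.

    Exon number 0 is used for problems found in the concatenated
    coding sequence rather than in a single exon.
    """
    problems = []
    for number, exon in enumerate(exons, start=1):
        # A length change that is not a multiple of 3 shifts the reading frame.
        if (len(exon.seq) - exon.ref_len) % 3 != 0:
            problems.append((number, "frame shift"))
        # Assumption: only canonical AG acceptor / GT donor sites are allowed.
        if exon.acceptor and exon.acceptor.upper() != "AG":
            problems.append((number, "splice acceptor"))
        if exon.donor and exon.donor.upper() != "GT":
            problems.append((number, "splice donor"))

    # Look for an in-frame stop codon before the final complete codon.
    cds = "".join(exon.seq for exon in exons).upper()
    n_codons = len(cds) // 3
    for k in range(n_codons - 1):
        if cds[3 * k:3 * k + 3] in STOP_CODONS:
            problems.append((0, "premature stop codon"))
            break
    return problems


# Toy two-exon genes (made-up sequences, for illustration only).
intact = [Exon("ATGGCC", 6, "", "GT"), Exon("AAATGA", 6, "AG", "")]
disrupted = [Exon("ATGGC", 6, "", "GT"), Exon("AAATGA", 6, "AC", "")]
```

In this sketch a gene with two or more reported problems would be treated as a likely pseudogene, whereas a single problem could still reflect a sequencing error.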

Phylogenetic analysis
The origins of the mutations that yielded deletion of GCKR genes, or inactivation due to frame shift and/or splice site mutations, were inferred by parsimony on the consensus vertebrate phylogenetic tree from Ensembl (www.ensembl.org). Phylogenetic trees were generated from the aligned GCKR protein sequences (see Figure S1 for alignment) using the neighbor joining method with Jones-Taylor-Thornton (JTT) or Dayhoff distances estimated using MEGA5 [41]. Phylogenetic trees of mammalian GCKR coding sequences were also generated from the DNA sequence data using either maximum likelihood composite distances or divergence at synonymous and nonsynonymous sites corrected by the Kimura 2-parameter method using MEGA5 [41]. The reliability of the trees was assessed by the bootstrap method, with the tree of vertebrate sequences rooted with the lamprey sequence and those of placental mammals with the sequence from a marsupial, the Tasmanian devil. Comparisons of the rates of sequence evolution on different species lineages were conducted using relative rate tests [42] as implemented in MEGA5 [41]. For relative rate tests, the sequence from the Tasmanian devil was used as the outgroup, and the numbers of unique amino acid substitutions were counted for each lineage. The probability that the numbers of unique amino acid substitutions were equal was tested by a chi square test [42].
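The relative rate test described above can be sketched in a few lines of Python. The sketch assumes the test takes the common one-degree-of-freedom form, chi square = (m1 - m2)^2 / (m1 + m2), where m1 and m2 are the numbers of substitutions unique to each ingroup lineage; the exact statistic computed by MEGA5 is not stated here, and the function names and toy sequences are hypothetical.

```python
import math


def unique_substitutions(seq_a, seq_b, outgroup):
    """Count sites where exactly one ingroup sequence differs from the
    other ingroup sequence and the outgroup (gapped sites are skipped)."""
    m_a = m_b = 0
    for a, b, o in zip(seq_a, seq_b, outgroup):
        if "-" in (a, b, o):
            continue
        if a != b and b == o:
            m_a += 1  # change unique to lineage A
        elif a != b and a == o:
            m_b += 1  # change unique to lineage B
    return m_a, m_b


def relative_rate_test(m_a, m_b):
    """Return (chi_square, p_value) for the hypothesis m_a == m_b,
    using a chi square distribution with 1 degree of freedom."""
    if m_a + m_b == 0:
        return 0.0, 1.0
    chi2 = (m_a - m_b) ** 2 / (m_a + m_b)
    # Survival function of chi square with 1 d.f.: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy aligned protein fragments (made up, not GCKR sequences).
counts = unique_substitutions("MKTAYL", "MRTAYI", "MRTAYL")
```

With equal counts on both lineages the statistic is 0 and the test gives no evidence of a rate difference; a large excess on one lineage, as reported for the mouse in the Results, yields a small p-value.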

Identification of GCKR-like genomic sequences
The human GCKR gene is composed of 19 exons distributed over roughly 27 kb of genomic DNA [43]. Previously, cDNAs and predicted genes for GCKR have been identified or characterized in a few other species, such as the rat, dog, zebrafish, tetraodon, and Xenopus laevis [14,34,36], and evidence for GCKR protein activity was identified in a few additional species, such as chicken, trout, carp, and goldfish [44]. To better characterize GCKR genes in vertebrates we searched the available genomes in the Ensembl (or PreEnsembl) and NCBI databases for sequences that potentially encode GCKR. Our searches resulted in the identification of GCKR-like sequences in most species (Table 1), including all species for which a GCKR cDNA/gene had previously been identified. In the Xenopus tropicalis genome, a genome that is still not completely assembled, we identified four short genomic scaffolds that contain sequences highly similar to the previously characterized Xenopus laevis GCKR cDNA and that could be combined to predict a near complete GCKR gene sequence with only a single exon (exon 15) missing (Table 1). In a few other species (i.e., mouse lemur, sloth, and Chinese softshell turtle, Table 1) multiple genomic sequences were also identified that were consistent with a single gene because no part of the GCKR gene was found twice. An exception was the marmoset, where as many as four copies of the GCKR gene may exist (Table 1).

Missing GCKR-like genes
While genomic sequences similar to GCKR were found in most vertebrates, sequences similar to GCKR were not found in the genome sequences of two mammals (shrew and platypus), a reptile (anole lizard), all 5 available birds (chicken, turkey, duck, zebra finch, and budgerigar), and a fish (cod) (Table 1). Since genome assemblies for some species are still at a draft stage it might be expected that the failure to identify a GCKR-like sequence may simply be due to gaps in the available genomes. In contrast, our searches of these same genomes for GCK gene-like sequences resulted in the identification of complete or partial genes in all of these mammalian [32] and non-mammalian vertebrate species (Table S1), suggesting that the absence of sequences similar to GCKR in at least some (but possibly not all) of the genomes is due to the loss of this gene. Gene loss can be due either to deletion of a sequence from the genome, or to failure of the BLAST similarity search algorithm to find a sequence that became non-functional and degenerated into a pseudogene. Missing genes can sometimes be found if the gene order surrounding a gene is conserved, as the adjacent genes can then be used to search for orthologous genomic regions. This approach has successfully been used to identify genes from divergent genomes showing minimal or no similarity in BLAST searches, such as a homolog of the mammalian leptin gene in fish [45]. To determine whether this approach might be useful, we examined the gene neighborhoods flanking the GCKR genes to determine whether gene order is conserved in vertebrate genomes. The genes FNDC4 (fibronectin type III domain containing 4) and IFT172 (intraflagellar transport 172 homolog) are located 5′ to the human GCKR gene, while the ZNF512 (zinc finger gene 512) gene and the predicted gene C2orf16 (chromosome 2 open reading frame 16) are located 3′ (Figure 1).
Genomes of other mammals possessing the GCKR gene have the same gene order; for some species, the genomic fragment containing GCKR is shorter and does not predict all four genes. While a GCKR gene could not be found in any of the bird genomes, it was found in two of the three reptile genomes, albeit on short genomic fragments (Table 1). Searches using the Painted turtle genomic sequence failed to identify any additional gene; however, a sequence downstream of the 3′ end of the Chinese softshell turtle GCKR gene is similar to the ZNF512 gene (Figure 1). The ZNF512 gene is located 3′ to, and in the same transcriptional orientation as, GCKR in both mammals and a turtle (Figure 1), suggesting that this order existed in their common ancestor.
The Xenopus tropicalis GCKR gene is distributed over four small genomic fragments (Table 1) that do not show similarity to any other gene. The coelacanth is a lobe-finned fish that is more closely related to tetrapods (mammals, birds, reptiles and amphibians) than to other fish [46]. Genes similar to FNDC4 and LDLRAP1 (low density lipoprotein receptor adaptor protein 1) are found 5′ to the coelacanth GCKR gene, while genes similar to PTK6/SRMS (protein tyrosine kinase 6/src-related kinase lacking C-terminal regulatory tyrosine and N-terminal myristylation sites) and BLK (B lymphoid tyrosine kinase) were found 3′ (Figure 1). The gene neighborhood near the coelacanth GCKR gene differs considerably from that of the human GCKR gene, which suggests that recombination has occurred; however, the FNDC4 gene is located in a similar position, with the same transcriptional orientation, in both species, suggesting that the genomic linkage of these two genes was present in their common ancestor. The gene neighborhood of the zebrafish GCKR gene strengthens the conclusion of an ancestral GCKR-FNDC4 genomic linkage, as the zebrafish also has an FNDC4-like gene 5′ to its GCKR gene (Figure 1). The zebrafish genome also predicts an IFT172-like gene in a position orthologous to the human IFT172 gene, and a PTK6/SRMS-like gene in a position orthologous to that found in the coelacanth genome (Figure 1). These observations suggest that the FNDC4 and IFT172 genes were 5′ to the GCKR gene (as seen in the human and zebrafish genomes) and the PTK6/SRMS gene was located 3′ to GCKR (as seen in the zebrafish and coelacanth) in the common ancestor of fish and tetrapods. The gene order near the GCKR gene in the other fish genomes examined differed from that of zebrafish, and they shared an order that was consistent with that shown for the takifugu (Figure 1).
All of the fish genomes have an XKR6 (XK, Kell blood group complex subunit-related family, member 6) gene near the GCKR gene, albeit in different relative positions and in a different transcriptional orientation, suggesting that this genomic region has been reorganized within fish.
Based on these gene maps (Figure 1), we used the conserved flanking genes to identify orthologous genomic regions in species where BLAST failed to identify a GCKR-like sequence. Since the ancestral tetrapod is predicted to have an FNDC4-like and a ZNF512-like gene flanking the GCKR gene, we used these sequences to search the mammalian (shrew and platypus), reptilian (anole lizard) and avian (chicken, turkey, duck, zebra finch and budgerigar) genomes where we failed to identify a GCKR gene with our similarity searches (see Table 1). Searches of the mammalian (shrew and platypus) and the reptilian (anole lizard) genomes resulted in identifying only short genomic fragments encoding sequences similar to FNDC4 and ZNF512, with none of these genomic fragments encoding any additional genes (results not shown). This result is consistent with the low coverage and fragmented nature of these genome sequences, and thus we cannot determine whether the gene neighborhood is conserved, or if the GCKR gene was lost or is present in an unsequenced gap of these genome assemblies. In contrast, large genomic fragments were typically identified from the avian genomes, where the FNDC4 and ZNF512 genes were linked on the same genomic fragment, as shown for the chicken in Figure 1. Searches of the genomic sequences between the FNDC4 and ZNF512-like genes from the bird genomes failed to identify any sequence with similarity to GCKR. These results strongly suggest that the GCKR gene was deleted from the avian genomes, and that this deletion event occurred before the divergence of the avian lineages represented by the genome sequences. The cod genome was also searched with genes that flank the GCKR gene in different fish genomes, and a genomic fragment with some similarity to that seen in coelacanth was found (containing FNDC4-, LDLRAP1-, and BLK-like genes), but it differed from those of other ray-finned fish. This suggests that further reorganization has occurred (results not shown).
Due to recombination, it is difficult to determine whether the cod GCKR gene was lost or is at a different location, such as an unsequenced gap, in this genome.
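The logic of the flanking-gene search can be summarized with a small Python sketch. It operates on ordered lists of annotated gene names per scaffold, which is a simplification: the searches described above examined the genomic sequence between the flanking genes for similarity to GCKR. The function names and the example gene orders are hypothetical and are not taken from Figure 1.

```python
def genes_between(scaffold, left, right):
    """Return the genes lying between two flanking genes on a scaffold,
    or None if the two flanking genes are not both on this scaffold."""
    if left not in scaffold or right not in scaffold:
        return None
    i, j = sorted((scaffold.index(left), scaffold.index(right)))
    return scaffold[i + 1:j]


def classify_locus(scaffold, target="GCKR", left="FNDC4", right="ZNF512"):
    """Classify the target gene as present, absent, or unresolved."""
    between = genes_between(scaffold, left, right)
    if between is None:
        # Flanks are not linked on one fragment, so the target could
        # simply sit in an unsequenced gap of the assembly.
        return "unresolved"
    return "present" if target in between else "absent between flanks"


# Illustrative gene orders only (hypothetical scaffolds).
linked_with_target = ["IFT172", "FNDC4", "GCKR", "ZNF512"]
linked_without_target = ["FNDC4", "ZNF512"]
short_fragment = ["FNDC4"]
```

Only the second case, where both flanking genes are linked on one fragment and nothing resembling the target lies between them, supports a deletion; the third case corresponds to the fragmented assemblies for which loss and assembly gaps cannot be distinguished.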

GCKR pseudogenes
Surprisingly, GCKR genes from many species, especially mammals, are poorly annotated (i.e., truncated or missing exons) or not annotated at all in the Ensembl database. To determine why some GCKR genes are poorly annotated we used MultiPipMaker [38,39] to align the genomic sequences of mammalian GCKR genes to the well-characterized human GCKR gene [43]. With the human exon sequences as guides, coding exons in the other mammalian GCKR genes could be identified. Similar results were obtained when the mouse GCKR gene was used as the master sequence for the MultiPipMaker analysis (results not shown). Alignments of the GCKR gene exons allowed us to identify changes that have occurred in the GCKR gene in each species. Many of the exons that had not been annotated as exons of the GCKR genes in species with poorly annotated GCKR genes in the Ensembl database were found to have mutations that either introduced frame shifts (due to insertion or deletion of sequences) into the coding sequence, or would be predicted to prevent mRNA splicing (Table 2). The effect of these mutations likely explains why gene prediction programs that annotated these genome assemblies failed to identify these sequences as coding exons (since splicing and coding potential are properties that should be retained by coding exons). The GCKR genes of many species were found to contain multiple disrupting mutations. In some cases, more than half of the 19 coding exons (e.g., cow with 11 exons) harbored mutations predicted to disrupt splicing or translation (Table 2). Only a few species (squirrel, tree shrew, and opossum) have a single disrupting mutation that potentially could be explained as a sequencing error or the sequence of a rare mutated allele present in the population of alleles for those species.
A total of 14 mammalian species were identified to have GCKR-like genes in their genomes that likely do not encode a functional GCKR protein as they have two or more coding sequence disrupting mutations ( Table 3).
The spotted gar was the only non-mammalian genome that contained a genomic sequence where all 19 GCKR coding exons could be identified and could be used to predict an intact GCKR protein (Table 1 and Figure S1). For all other non-mammalian species, the GCKR-like genomic sequence was missing at least one exon, most often exon 1. Unfortunately the missing exons cannot be identified because the non-mammalian (turtle, frog, and fish) genomic sequences do not align with the human, or any other mammalian, GCKR gene sequence used as a master sequence in MultiPipMaker (results not shown), due to the large amount of sequence divergence that has occurred since their common ancestor, which existed at least 300 million years ago [47]. Similar results were obtained when the spotted gar sequence was used as the master sequence, as it is also distantly related to all of the other species (including ray-finned fish, from which it diverged more than 300 million years ago [48]). However, the coding exons that were identified for the non-mammalian GCKR genes (see Table 1) maintained open reading frames and possessed intact splice consensus sequences that should allow the generation of intact GCKR proteins (see Figure S1). While it remains possible that inactivating mutations exist in some of the uncharacterized exons, our current genomic evidence suggests that functional GCKR proteins exist for all of the non-mammalian species where a GCKR gene was identified.

Table 2. Exons bearing inactivating mutations in mammalian GCKR genes. (Columns: Species; Frame shift mutations; Splice junction mutations.)

Parallel inactivation of mammalian GCKR genes
Our characterization of vertebrate GCKR genes indicates that functional versions of this gene have been lost by several mechanisms, including point mutations that disrupt splicing or cause translational frame shifts, and deletion mutations that have removed the gene from the genome (Tables 2 and 3). Evidence for deletion of the gene is strongest in birds, while the failure to find a GCKR gene in some of the other species could be due to gaps in the current genome assemblies. All birds that have an assembled genome lack a GCKR-like gene sequence, suggesting that the gene was deleted before the common ancestor of birds (see Figure 2). The anole lizard is closely related to birds and lacks a GCKR-like gene (Table 1), suggesting that the deletion of the GCKR gene may have occurred in a reptilian ancestor of the anole lizard and birds (see Figure 2). The GCKR gene was not deleted in the common ancestor of all reptiles, as the gene was found in turtle genomes (Table 1). Within mammals, inactivating mutations appear to be the dominant mechanism for the loss of GCKR gene function: 14 species showed inactivated genes due to mutations, while only 2 species may have deleted the gene (Table 3). A single mutational event cannot explain the origin of the inactivated mammalian GCKR genes. First, no single mutation is shared among the inactivated GCKR genes, even among the mutationally inactivated genes (see Table 2). Secondly, species with inactivated GCKR genes are not monophyletic (Figure 2). Parsimony was used to infer the minimum number of gene inactivation events on the phylogeny of vertebrates. At least six gene inactivation events are required to explain the diversity of species that have multiple mutations (i.e., at least one frame shift and at least one splice site mutation) (Figure 2).
Inactivating mutations must have occurred in the GCKR gene on the lineages leading to: (1) tarsier, (2) ruminant artiodactyls, (3) cat, (4) bats, (5) hedgehog, and (6) common ancestor of Afrotheria and Xenarthra (see Figure 2).
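The parsimony argument can be made concrete with a short Python sketch that counts the minimum number of independent inactivation events on a rooted tree. It assumes that the gene was intact at the root and that inactivation is irreversible, so each maximal clade in which every species carries an inactivated gene counts as one event. The function name and the toy tree are hypothetical; the topology is a simplified illustration and is not Figure 2.

```python
def count_loss_events(tree, lost):
    """Count the minimum number of irreversible loss events.

    tree: nested tuples whose leaves are species names (strings).
    lost: set of species whose gene is inactivated or deleted.
    """
    def walk(node):
        # Returns (all_descendants_lost, events_in_subtree).
        if isinstance(node, str):
            is_lost = node in lost
            return is_lost, 1 if is_lost else 0
        results = [walk(child) for child in node]
        if all(flag for flag, _ in results):
            # The whole clade is lost: one event on its stem lineage
            # explains every descendant.
            return True, 1
        return False, sum(events for _, events in results)

    return walk(tree)[1]


# Toy example: two ruminants share one event, the cat needs another.
toy_tree = ((("cow", "sheep"), "pig"), ("cat", "dog"))
toy_lost = {"cow", "sheep", "cat"}
events = count_loss_events(toy_tree, toy_lost)
```

Because the pig and dog retain intact genes in this toy tree, the losses in the ruminants and in the cat cannot be merged into a single ancestral event, which is the same reasoning that yields multiple independent inactivations on the full phylogeny.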

Are some intact GCKR genes pseudogenes?
Many of the inactivated GCKR genes are found in Laurasiatheria (mammalian orders: Artiodactyla, Perissodactyla, Carnivora, Chiroptera, Eulipotyphla, and Pholidota), where 4 parallel gene loss events are inferred to have occurred (Figure 2). An alternative explanation for this distribution of inactivated genes is that all of the Laurasiatherian GCKR genes are pseudogenes, which were generated by a single mutational event that did not disrupt the open reading frame but did disrupt GCKR protein function. Subsequent to the ancestral inactivating mutational event, additional mutations would have introduced frame shifts and splice site mutations into the GCKR genes of most of the species representing Laurasiatheria (e.g., cow, cat, and bats). However, the gene sequences from dog, panda, horse, and pig did not acquire these types of mutations, and thus retained intact open reading frames. Pseudogenes should acquire greater numbers of mutations compared to functional genes. Thus if all of the GCKR genes in Laurasiatheria are pseudogenes, then the intact GCKR sequences from dog, panda, horse and pig should have accumulated more mutations than those of other mammals that have functional GCKR genes, such as rodents and humans [11,12,19-23]. To test this possibility we first constructed a phylogenetic tree (Figure 3) of the mammalian GCKR protein sequences from species that have full-length sequences, rooted with the sequence from the Tasmanian devil, a marsupial outgroup for placental mammals [47]. As shown in Figure 3, the dog, panda, horse and pig GCKR sequences may actually show less protein sequence evolution than other mammals, especially rodents, which is not a pattern consistent with the accelerated evolution that would be expected of pseudogenes.
Similar conclusions are drawn from phylogenies generated using more distantly related outgroups, including fish, amphibians and reptiles, where the dog and elephant GCKR protein sequences do not accumulate greater numbers of amino acid substitutions compared to human, mouse or Tasmanian devil (Figure 4). To confirm that there was no acceleration in the rate of amino acid substitution in the GCKR sequences of species within Laurasiatheria compared to other placental mammals we conducted relative rate tests [42]. As shown in Table 4, when the dog, panda, horse, and pig sequences were tested against the human sequence, no significant difference in the number of amino acid substitutions was seen on either lineage. When the mouse sequence was used, a significant difference in the number of substitutions was seen, but it was the mouse sequence, not the potential pseudogene sequences, which had accumulated the greater number of amino acid substitutions (Table 4). If the rat sequence was used instead of the mouse, similar results were observed (results not shown). These results indicate that the intact GCKR coding sequences from species within Laurasiatheria are evolving as slowly as, or more slowly than, those of other mammals; thus it is extremely unlikely that they are non-functional.

Figure 3. Phylogeny of mammalian GCKR protein sequences. Phylogeny of mammalian GCKR protein sequences generated from full length GCKR coding sequences (see Figure S1). The bootstrapped (1000 replications) neighbor-joining distance tree was generated using JTT protein distances. The sequence from the Tasmanian devil was used as the outgroup. Similar trees were generated when different protein distance measures, or distance measures based on nonsynonymous distances calculated from aligned DNA sequences, were used or if trees were built by other methods, such as parsimony or maximum likelihood. doi:10.1371/journal.pone.0060896.g003

Discussion
In humans, GCK has essential functions in both the liver and pancreatic beta cells, and defects at either site contribute to diabetes [3,49]. Knockout mice have been used to define the tissue-specific roles of GCK [50]. Homozygous pancreatic beta cell-specific deletion of GCK leads to death due to severe diabetes, while heterozygous mice survive but are moderately hyperglycemic [50]. Liver-specific deletion of GCK leads to hyperglycemia due to defects in glucose metabolism and impacts on insulin secretion in response to glucose [50]. In the liver, GCK acts as a gatekeeper for glucose utilization, as phosphorylation of glucose by GCK drives the storage of glucose as glycogen [49]. Despite the importance of GCK function in the liver of humans, a number of vertebrate species have been reported to be deficient in GCK activity, such as cat, bat, ruminants, and birds [27,28]. Deficiency of hepatic GCK activity is unlikely to be caused by mutations that inactivate the entire GCK gene, as insulin secretion is not lost in these species and they do not exhibit the symptoms of diabetes that are seen when GCK is depleted from pancreatic islets of mice [50]. Liver and beta-cell expression of GCK is driven by two different promoters [3,6]; thus, mutations potentially could specifically inactivate the liver-specific isoform of GCK if they occur in the liver-specific first exon or prevent function of the liver-specific promoter. Indeed, this mechanism may explain GCK deficiency in bats, as the liver-specific 1st exon of the GCK gene in bats appears to be deleted [32]. All other mammals examined, though, possess a liver-specific 1st exon [32], and the sequences of this exon and the liver-specific promoter do not display characteristics of being non-functional.
Figure 4. Phylogeny of vertebrate GCKR protein sequences. Phylogeny of vertebrate GCKR protein sequences generated from intact and near-full length GCKR coding sequences (see Figure S1). Only select mammals were included in the analysis. The bootstrapped (1000 replications) neighbor-joining distance tree was generated using JTT protein distances. The Lamprey (jawless fish) sequence was used as the outgroup. Similar trees were generated when different protein distance measures were used or if other tree building methods, such as parsimony or maximum likelihood, were used. doi:10.1371/journal.pone.0060896.g004
GCKR regulates the function of GCK in the liver [15][16][17], and knockout of the mouse GCKR gene results in deficiency of liver GCK protein and activity [24,25], while overexpression of GCKR in hepatocytes leads to increases in both GCK protein and enzymatic activity levels [26]. These results indicate that, in liver, GCK requires GCKR for stability, and the absence of GCKR leads to GCK degradation. GCKR does not appear to be essential for GCK function in pancreatic beta-cells, as GCKR knockout mice did not show impaired insulin secretion [24,25] and the mRNA expression ratio of GCK:GCKR is much higher in human pancreatic islets than in human liver [17]. As GCKR is the most important post-transcriptional regulator of GCK levels in the liver [11,12], and vertebrate species that are deficient in hepatic GCK appear to possess a functional GCK gene, we hypothesized that a lack of GCKR reduces liver-specific GCK levels and activity. To test this hypothesis, we identified and characterized GCKR genes from the genomes of diverse vertebrate species. We found that species that lack hepatic GCK activity had deleted or mutated GCKR genes, while species with reported liver GCK activity have intact GCKR genes.
Our results are in accord with results seen in GCKR knockout mice [24,25] and with over-expression of GCKR [26], but further study is still needed to understand the mechanisms by which GCKR specifically regulates GCK protein levels in liver cells.
Loss of GCK activity from the liver of humans contributes to mild diabetes [4]; however, many species with limited hepatic GCK activity do not have diabetes [27][28][29][30][31]. Changes in the requirement for hepatic GCK activity may be associated with changes in diet. Ruminant animals, such as cow and sheep, acquire most of their energy from volatile fatty acids generated by fermentation in their foregut; thus, limited amounts of glucose are acquired from their diets [51]. Glucose that is needed by other tissues is produced in the liver, and its production is regulated by insulin and glucagon [51]. Therefore, ruminants do not need GCK to remove excess glucose from the circulation. Cats, like other carnivores, consume a diet containing large amounts of protein and relatively little carbohydrate [52]. As with the ruminants, glucose in carnivores is generated in the liver by a regulated process from other food sources; thus, GCK may not be needed to remove excess glucose. Birds, which have low hepatic GCK activity, have the highest blood glucose levels among vertebrates, averaging 15.6 mM, or two-fold higher than in mammals [53]. These relatively high glucose levels suggest that the loss of GCKR has prevented hepatic GCK from efficiently removing excess blood glucose and may have driven additional unknown changes in glucose metabolism.
Here we have shown that the loss of hepatic GCK activity in many species is likely due to mutations in the GCKR gene, rather than mutations in GCK. Mutation of the liver-specific GCKR gene may have allowed these species to specifically lose GCK activity from the liver without affecting GCK activity in other glucose-sensing tissues such as pancreatic beta-cells [2,3]. How can hepatic tissue function with the loss of GCK activity? Glucose is essential as an energy source for many cells in the body, such as neurons [1]; however, import of glucose is not essential for liver cells. GCK phosphorylates glucose as it enters hepatic cells, but this function is only essential if blood glucose levels vary due to diet. If glucose is not directly obtained from the diet, but rather synthesized in the liver, then blood glucose levels should be regulated by the production, rather than uptake, of glucose. As diets change, the requirements for hepatic GCK activity change, and this appears to have been achieved by the loss of GCKR gene function by multiple mechanisms on multiple lineages within vertebrates.

Supporting Information
Figure S1 Alignment of GCKR protein sequences. Full-length and near full-length GCKR protein sequences predicted from the GCKR genes listed in Table 1 were aligned with ClustalW [40]. (DOCX)