Characterization of the Functional Domain of β2-Microglobulin from the Asian Seabass, Lates calcarifer

  • Hirzahida Mohd-Padil,

    Affiliation School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia

  • Khairina Tajul-Arifin ,

    Affiliation School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia

  • Adura Mohd-Adnan

    Affiliation School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia



Background

β2-Microglobulin (β2M) is the light chain of the major histocompatibility complex class I (MHC I) molecule and binds non-covalently to the α heavy chain. Together, the two chains bind an antigenic peptide and present the resulting complex to T cells so that the target can be destroyed by the immune response.

Methodology/Principal Findings

In this study, a cDNA sequence encoding β2M in the Asian seabass (Lates calcarifer) was identified and analyzed using in silico approaches to predict and characterize its functional domain. The β2M cDNA contains an open reading frame (ORF) of 351 bases with a coding capacity of 116 amino acids. A large portion of the protein consists of the IG constant domain (IGc1), similar to β2M sequences from other species studied thus far. Alignment of the IGc1 domains of β2M from L. calcarifer and other species shows a high degree of overall conservation. Seven amino acids were found to be conserved across taxa whereas conservation between L. calcarifer and other fish species was restricted to 14 amino acids at identical conserved positions.


Conclusions/Significance

As the L. calcarifer β2M protein analyzed in this study contains a functional domain similar to that of β2M proteins in other species, it can be postulated that the β2M proteins from L. calcarifer and other organisms are derived from a common ancestor and thus have a similar immune function. Interestingly, fish β2M genes could also be classified according to the ecological habitat of the species, i.e. whether it is from a freshwater, marine or euryhaline environment.


Introduction

β2M is the light chain component of the class I major histocompatibility complex (MHC I) molecule. It consists of about 99 residues with a seven-stranded β-sandwich fold and a central disulfide bond [1] and belongs to the antibody constant domain-like family of proteins (immunoglobulin superfamily). At the cell surface, the MHC I complex comprises three extracellular domains from the α chain (α1, α2 and α3) plus the β2M protein domain. β2M ensures the proper folding and cell surface display of the MHC I molecule [2]. The classical MHC I molecule mainly functions as a component that binds antigenic peptides, presenting them to the T-cell receptor to trigger the cellular immune response [3].

It has been reported that β2M sequences are highly conserved among mammalian species as well as between mammalian and avian species [4], [5]. The teleost β2M sequences also exhibit high overall sequence similarity and share conserved regions with those of warm-blooded vertebrates [6]. Previous phylogenetic analysis revealed the evolutionary divergence of the β2M protein between warm-blooded vertebrates and fish [7], [8]. Meanwhile, another earlier phylogenetic analysis indicated that the freshwater fish β2M gene diverged from the common ancestor gene earlier than the seawater fish β2M gene [6].

In an effort to improve our understanding of the molecular biology of L. calcarifer, several thousand expressed sequence tags (ESTs) have been derived from various cDNA libraries from several tissues [9]. Analyses of the EST data enabled us to identify novel gene sequences, including those with significant similarity to β2M. Using the numerous β2M sequences that are available in the public databases, we analyzed the protein sequence in an effort to better understand the immune system of L. calcarifer.

Materials and Methods

The cDNA sequence of L. calcarifer β2M obtained from the spleen EST library was translated into its potential open reading frame (ORF) using the ORF Finder algorithm. Domain analyses were carried out using several resources, including the Simple Modular Architecture Research Tool (SMART) [10], Pfam 20.0 [11] and Prosite 19.36 [12]. The profile of the IGc1 domain obtained from the Pfam domain database was used to search for other homologous proteins using the hmmsearch program in HMMER version 2.3.2 [13], [14] in both the Swiss-Prot database Release 54 and the fish genomes in the Ensembl database Release 49. The IGc1 domain sequences of the homologous proteins thus identified were extracted for subsequent analyses.
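The ORF translation step can be illustrated with a minimal Python sketch. It is not the ORF Finder algorithm itself: it assumes the standard genetic code, scans only the three forward reading frames, and requires an ATG start and an in-frame stop codon. The function name `longest_orf` and the toy input are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def longest_orf(cdna):
    """Return (start, end, protein) for the longest ATG-to-stop ORF.

    Only the forward strand is scanned (an assumption of this sketch).
    start is 0-based; end is exclusive and includes the stop codon.
    """
    cdna = cdna.upper()
    best = (0, 0, "")
    for frame in range(3):
        i = frame
        while i + 3 <= len(cdna):
            if cdna[i:i + 3] != "ATG":
                i += 3
                continue
            protein = []
            j = i
            while j + 3 <= len(cdna):
                aa = CODON_TABLE.get(cdna[j:j + 3], "X")  # X for ambiguous codons
                if aa == "*":
                    break
                protein.append(aa)
                j += 3
            else:
                # Reached the end without a stop codon: ignore this open frame.
                i += 3
                continue
            if len(protein) > len(best[2]):
                best = (i, j + 3, "".join(protein))
            i = j + 3  # resume scanning after the stop codon
    return best


# Toy input (not the L. calcarifer sequence): ATG GCT GCT GCT TAA with flanks.
start, end, protein = longest_orf("CCATGGCTGCTGCTTAAGG")
print(start, end, protein)
```

By the same codon arithmetic, an ORF of 351 bases spans 117 codons, which is consistent with 116 amino acids followed by one stop codon.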

The sequence alignment for the IGc1 domains was built using the hmmalign program in the HMMER package against the profile of the IGc1 domain obtained from Pfam, so that the pattern of β2M protein change across taxa could be examined. PHYLIP [15] was then used to perform phylogenetic analyses. A neighbor-joining tree was built using the protdist and neighbor programs with the Jones-Taylor-Thornton substitution model. The robustness of the trees was evaluated by bootstrap analysis with 1000 replicates generated using seqboot, while consense was used to generate the consensus tree. All programs used to construct the phylogenetic trees are contained in the PHYLIP package [15]. Subsequently, MEGA4 [16] was used to view the resultant phylogenetic trees. The L. calcarifer β2M sequence analyzed in this study has been deposited in the GenBank database under the accession number FJ200516.
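The idea behind the bootstrap step can be sketched in Python as follows. This is an illustration of column resampling with replacement, not the seqboot implementation; the function names, the fixed seed and the toy alignment are assumptions made for the example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by resampling alignment columns with replacement.

    alignment maps sequence names to aligned strings of equal length.
    The same sampled columns are used for every sequence, so columns stay intact.
    """
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Return a list of pseudo-replicate alignments (1000 by default)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment with hypothetical names; real input would be the IGc1 alignment.
toy = {"seqA": "SNCYV", "seqB": "TNCFV", "seqC": "SNCYI"}
replicates = bootstrap_replicates(toy, n_replicates=1000, seed=42)
print(len(replicates), replicates[0])
```

Each replicate would then be passed through the distance and tree-building steps, and the fraction of replicate trees containing a given grouping gives its bootstrap support.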


Results

Analysis of the L. calcarifer β2M Sequence

Analyses of the cDNA sequence of L. calcarifer β2M (clone LSE48F06) from the spleen EST library indicated that the most probable ORF encodes a polypeptide of 116 amino acids. A domain search revealed that a large portion of the protein sequence matched the immunoglobulin C-type (IGc1) domain in the SMART, Pfam and Prosite databases. Almost half of the amino acid residues of β2M form two large β structures, which are linked by a central disulfide bond. Its conformation thus strongly resembles the overall tertiary structure of the IGc1 domain [17]. In addition, analysis against the Prosite database showed the presence of an immunoglobulin and major histocompatibility complex protein signature, YSCRVTH, located at residues 97–103 in the L. calcarifer β2M sequence.
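A signature scan of this kind can be sketched in Python by converting a Prosite-style pattern into a regular expression. The pattern used below, `[FY]-x-C-x-[VA]-x-H`, is an assumed simplified form of the immunoglobulin/MHC signature and should be checked against the current Prosite entry; the helper names are hypothetical, and only basic pattern elements (single residues, `x`, `[...]` and `{...}`) are handled.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple PROSITE-style pattern to a compiled regex.

    Supports single residues, 'x' (any residue), [ABC] (allowed set)
    and {ABC} (forbidden set). Repeats and anchors are not supported.
    """
    parts = []
    for element in pattern.split("-"):
        if element == "x":
            parts.append(".")
        elif element.startswith("{") and element.endswith("}"):
            parts.append("[^" + element[1:-1] + "]")
        else:
            parts.append(element)  # a literal residue or a [ABC] set
    return re.compile("".join(parts))


def find_signature(sequence, pattern):
    """Return 1-based inclusive (start, end) positions of all pattern matches."""
    regex = prosite_to_regex(pattern)
    return [(m.start() + 1, m.end()) for m in regex.finditer(sequence)]


# Assumed simplified form of the IG/MHC signature (verify against Prosite).
IG_MHC_PATTERN = "[FY]-x-C-x-[VA]-x-H"

# The motif reported in the text, scanned on its own.
print(find_signature("YSCRVTH", IG_MHC_PATTERN))
```

Run against the full 116-residue protein, a scan like this would be expected to report the match at residues 97–103 given in the text.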

A total of 81 IGc1 domains contained in β2M sequences were obtained by protein search against the Swiss-Prot database (version 14 updated 23 October 2007) and the known proteins of five fish species (medaka, stickleback, zebrafish, pufferfish and spotted green pufferfish) in the Ensembl database (version 49 updated March 2008). Of these 81 domain sequences, only 56 were used to build the multiple sequence alignment (MSA) (see Table 1); the remaining sequences were excluded as they were considered identical, truncated or replicated between the two databases used. As the IGc1 domain is the functional domain in all β2M sequences analyzed in this study, subsequent analyses focused on this domain as a representation of the β2M protein.
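The filtering of redundant domain sequences can be sketched in Python as below. The exact criteria used to reduce the 81 sequences to 56 are not given in the text, so the minimum-length threshold and the exact-match test for duplicates are assumptions of this sketch, as are the function name and the toy records.

```python
def filter_domains(records, min_length):
    """Drop truncated and duplicate domain sequences, keeping first occurrences.

    records is a list of (name, sequence) pairs.
    min_length is a hypothetical cut-off below which a domain counts as truncated.
    """
    seen = set()
    kept = []
    for name, seq in records:
        if len(seq) < min_length:
            continue  # treated as truncated
        if seq in seen:
            continue  # identical to a sequence already kept
        seen.add(seq)
        kept.append((name, seq))
    return kept


# Toy records with hypothetical identifiers and short placeholder sequences.
toy_records = [
    ("swissprot_1", "SNCYV"),
    ("ensembl_1", "SNCYV"),   # replicated between the two databases
    ("ensembl_2", "SNC"),     # truncated
    ("swissprot_2", "TNCYV"),
]
print(filter_domains(toy_records, min_length=5))
```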

Sequence Alignment

Alignment of the IGc1 domains of β2M from L. calcarifer and other species showed a high degree of overall conservation across taxa (see Figure 1). The IGc1 domain of L. calcarifer starts with a Ser residue, which is conserved among the taxa, with the exception of domains from the Japanese flounder (Q8AYH8), Chinese hamster (Q9WV24), hispid pocket mouse (Q8CIQ3) and Australian echidna (Q864T6), in which the starting residue is a Thr. The fish sequences, with the exception of Siberian sturgeon (Q9PRF8), are two residues shorter (lacking residue-75 and residue-76) than the human (P61769) sequence. The deletions in the teleost sequences are located in the loop between the anti-parallel beta-strands S6 and S7 [18]. Seven positions or residues, Asn11, Cys15, Pro22, Asp44, Phe47, Cys70 and Val72 (numbering refers to the position in the MSA) are found to be completely conserved. These residues or regions are believed to be in the active or binding site of the β2M protein. A previous study reported that the two cysteine residues (Cys15 and Cys70), which covalently link to form a disulfide bridge (thus connecting the two β sheets), are important elements in protein folding that contribute to stabilization of the MHC class I molecule [1].
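Completely conserved positions such as the seven listed above can be identified from a multiple sequence alignment with a short Python sketch. The function name, the gap characters and the toy alignment are assumptions; positions are reported with 1-based alignment numbering, as in the text.

```python
def conserved_columns(alignment, gap_chars="-."):
    """Return (position, residue) for columns where every sequence has the same residue.

    alignment maps sequence names to aligned strings of equal length.
    Columns containing any gap character are never reported as conserved.
    """
    seqs = [seq.upper() for seq in alignment.values()]
    gaps = set(gap_chars)
    conserved = []
    for col in range(len(seqs[0])):
        residues = {seq[col] for seq in seqs}
        if len(residues) == 1 and not (residues & gaps):
            conserved.append((col + 1, next(iter(residues))))
    return conserved


# Toy alignment with hypothetical names, not the alignment in Figure 1.
toy = {"seqA": "SNC-", "seqB": "TNC-", "seqC": "SNCV"}
print(conserved_columns(toy))
```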

Figure 1. Multiple sequence alignment of IGc1 domains.

The alignment consists of IGc1 domain sequences from organisms of various taxa such as Eutheria, Marsupials, Monotremes, Avians, Chondrichthyes fish and Actinopterygii fishes. Lca is the L. calcarifer protein sequence and the conserved residues are marked by (*). S1, S2, S3, S4, S5, S6 and S7 indicate the regions of seven β strands in the IGc1 domain. Numbers at the top indicate amino acid positions. Information on the sequences is given in Table 1.

Phylogenetic Analysis

The phylogenetic tree built is an unrooted tree (see Figure 2) with two distinct clades, the mammalian β2M and fish β2M. The avian, marsupial and monotreme sequences form an intermediate group between the fish and the mammalian clades. All fish sequences are from the Actinopterygii class with the exception of the clearnose skate (Q8AXA0) [19], which is a cartilaginous fish (class Chondrichthyes). In order to clarify the relationships among the β2M molecules of the Actinopterygii fish (Siberian sturgeon, channel catfish (O42197), Japanese medaka (Q90ZJ6), Japanese flounder, zebrafish (Q04475, NP_998291), common carp (Q03422) and Lake Tana barbel (P55076)), an NJ tree consisting of fish sequences only was constructed using the chicken (P21611) β2M sequence as an outgroup (see Figure 3). The β2M from L. calcarifer, which is a euryhaline and catadromous species, clustered together with Japanese medaka β2M, forming a separate clade from the freshwater fishes (common carp, Lake Tana barbel, zebrafish and channel catfish). The phylogenetic trees also reveal a molecular signal of the ecological distinction between marine and freshwater fish. The Siberian sturgeon β2M sequence is the most basal lineage, which is placed outside the main cluster of teleosts in the tree (see Figure 3).

Figure 2. NJ phylogenetic tree of β2M protein sequences representing whole organisms.

The phylogenetic tree shown is the collapsed tree of 55 sets of sequence data. This tree shows that β2M sequences are clustered together according to their taxa. β2M sequences from Eutheria are clustered together and consist of sequences from Primates, Equine, Rodents, Ruminants, SscQ07717, OcuP01885 and FcaQ5MGS7. Marsupials, Monotremes and Avians are the intermediate taxa between Eutheria and Fish. Amphibian Xla protein Q9IA97 is clustered together with Actinopterygii fishes while the outgroup in this tree is a cartilaginous fish Reg Q8AXA0. The divergence of fish and mammalian β2M is supported by a high bootstrap value (89).

Figure 3. NJ phylogenetic tree of IGc1 domains present in β2M protein sequences from fish.

The phylogenetic tree shows that fish β2M proteins are clustered according to the fish's ecological habitat, which may be fresh water, euryhaline or marine. The chicken sequence was used as the outgroup. Information on the sequences is given in Table 1.


Discussion

Our analysis of the novel L. calcarifer β2M gene recovered in this study indicates that, overall, fish β2M sequences have high sequence similarity and share many conserved features with published sequences from mammals and birds. Using β2M as both a phylogenetic marker and a source of information, we confirmed previous studies indicating that the β2M proteins of mammals and fish represent clearly distinct evolutionary paths, with fish β2M genes more closely related to avian sequences than to those of mammals [7], [8]. The divergence between fish and mammals is partially a consequence of several unreversed changes in the ORF. For example, at site-14 (S2) in all mammalian β2M sequences, the residue is Arg, whereas in Actinopterygii fish β2M sequences it is Ile. The Ser residue in mammalian sequences is substituted by Ala at site-45 (S4) in all Actinopterygii fish, whereas Lys at position 55 (S5) in mammalian β2M is consistently replaced by Thr in all Actinopterygii fish. Although these amino acid changes are located within the mature protein region, the residues at those sites are not involved in any important stabilizing interactions of the protein [20].
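Group-specific substitutions of the kind described for sites 14, 45 and 55 can be located with a short Python sketch that looks for alignment columns fixed for one residue in one group and for a different residue in another. The function name, group labels and toy alignment are hypothetical and are not taken from the alignment in Figure 1.

```python
def fixed_differences(alignment, group_a, group_b):
    """Return (position, residue_a, residue_b) for columns fixed differently in two groups.

    alignment maps sequence names to aligned strings of equal length.
    group_a and group_b are lists of names present in the alignment.
    Positions use 1-based alignment numbering.
    """
    n_cols = len(alignment[group_a[0]])
    differences = []
    for col in range(n_cols):
        residues_a = {alignment[name][col] for name in group_a}
        residues_b = {alignment[name][col] for name in group_b}
        if len(residues_a) == 1 and len(residues_b) == 1 and residues_a != residues_b:
            differences.append((col + 1, next(iter(residues_a)), next(iter(residues_b))))
    return differences


# Toy alignment: column 2 is R in both "mammal" rows and I in both "fish" rows.
toy = {
    "mammal_1": "SRK",
    "mammal_2": "SRK",
    "fish_1": "AIT",
    "fish_2": "SIT",
}
print(fixed_differences(toy, ["mammal_1", "mammal_2"], ["fish_1", "fish_2"]))
```

Column 1 is not reported in the toy case because the fish group is not fixed for a single residue there.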

Our analyses of the Actinopterygii fish β2M sequences employed sequences representing two subclasses: Neopterygii (teleosts) and Chondrostei. Within the Neopterygii, two superorders are evident: Ostariophysi and Acanthopterygii. The molecular results showed phylogenetic relationships that support those established based on fish morphology [21]. Ostariophysi includes two different orders (Cypriniformes and Siluriformes), whereas Acanthopterygii includes three different fish orders (Beloniformes, Perciformes and Pleuronectiformes). With the exception of Siberian sturgeon, our results also showed that fishes included in Ostariophysi are mainly freshwater fishes, whereas those of Acanthopterygii are marine or euryhaline fishes (see Figure 3). The results corroborate a previous report that an evolutionary divergence occurred between freshwater and marine or euryhaline fish β2M sequences [8]. At the molecular level among the Actinopterygii fishes studied, Asian seabass, Japanese medaka and Japanese flounder sequences share synapomorphic amino acids at three sites: Thr73 (S6), Gly78 (loop between S6 and S7) and Asp82 (S7). Siberian sturgeon (Chondrostei) resolves as the most basal lineage in agreement with other studies [18], [21], confirming this fish is the most primitive member of the subclass Actinopterygii. Indeed, a two-codon deletion (residues 75 and 76 in the alignment) is synapomorphic in all teleost β2M sequences in contrast to Siberian sturgeon β2M.

Given the preponderance of analytical results and qualitative comparative results, we suggest that the L. calcarifer β2M gene recovered here is likely to function similarly to previously characterized β2M genes, and that the protein it encodes acts as a light chain that binds non-covalently to the heavy chain of the MHC class I molecule. The two proteins would then create a complex with the antigen peptide and present the antigen to T cells to be destroyed by the immune mechanism [22]. β2M is also involved in stabilizing the MHC class I molecules [23], [24]. Since β2M is clearly most closely related to the IGc1-type domains of MHC class I and II, its gene must have been linked to that of the MHC at some point of evolution [7]. Furthermore, the similarity between the structures of β2M and the IGc1-type domains of MHC I and II suggests that they share a common ancestor encoded in MHC genes [25]. In this study, we have identified the β2M gene in L. calcarifer and confirm its phylogenetic placement within a group of related fish species. We believe that, as more fish β2M sequences become available, reanalysis of the data may be able to better resolve the evolutionary history of the seeming ecological divergence detected among fish sequences and that of the L. calcarifer β2M gene from the rest of the β2M gene tree. Again, the overall utility of our approach in the detection, recovery and delineation of genes within L. calcarifer is emphasized by its success in our study of β2M.


Acknowledgments

The authors thank Azrol Ridzuan Abd Aziz for providing the β2M nucleotide sequence, Dr. Michael R. J. Forstner for helpful discussion and revision of the manuscript and Mohd. Zulqurnain Abd. Razak for help with the figures.

Author Contributions

Conceived and designed the experiments: KTA AMA. Performed the experiments: HMP. Analyzed the data: HMP. Wrote the paper: HMP KTA AMA. Interpreted data: KTA AMA.


References

  1. Bjorkman PJ, Saper MA, Samraoui B, Bennett WS, Strominger JL, et al. (1987) Structure of the human class I histocompatibility antigen, HLA-A2. Nature 329: 506–512.
  2. Coico R, Sunshine G, Benjamini E (2003) Immunology: A Short Course. New Jersey: John Wiley & Sons, Inc. 361 p.
  3. Collins RW (2004) Human MHC class I chain related (MIC) genes: their biological function and relevance to disease and transplantation. Eur J Immunogenet 31: 105–114.
  4. Ono H, Figueroa F, O'hUigin C, Klein J (1993) Cloning of the β2-microglobulin gene in the zebrafish. Immunogenetics 38: 1–10.
  5. Hui FH, Tian YY, Ruo QY, Feng SG, Chun X (2006) cDNA cloning and genomic structure of grass carp (Ctenopharyngodon idellus) β2-microglobulin gene. Fish Shellfish Immunol 20: 118–123.
  6. Criscitiello MF, Benedetto R, Antao A, Wilson MR, Chinchar VG, et al. (1998) β2-microglobulin of Ictalurid catfishes. Immunogenetics 48: 339–343.
  7. Stewart R, Ohta Y, Minter RR, Gibbons T, Horton TL, et al. (2005) Cloning and characterization of Xenopus β2-microglobulin. Dev Comp Immunol 29: 723–732.
  8. Choi W, Lee EY, Choi TJ (2006) Cloning and sequence analysis of the β2-microglobulin transcript from flounder, Paralichthys olivaceous. Mol Immunol 43: 1565–1572.
  9. Tan SL, Mohd-Adnan A, Mohd-Yusof NY, Forstner MRJ, Wan K-L (2008) Identification and analysis of a prepro-chicken gonadotropin releasing hormone II (preprocGnRH-II) precursor in the Asian seabass, Lates calcarifer, based on an EST-based assessment of its brain transcriptome. Gene 411: 77–86.
  10. Schultz J, Milpetz F, Bork P, Ponting CP (1998) SMART, a simple modular architecture research tool: identification of signaling domain. Proc Natl Acad Sci U S A 95: 5857–5864.
  11. Robert DF, Jaina M, Benjamin SB, Sam GJ, Volker H, et al. (2006) Pfam: clans, web tools and services. Nucleic Acids Res 34: 247–251.
  12. Hulo N, Bairoch A, Bulliard V, Cerutti L, De Castro E, et al. (2006) The Prosite database. Nucleic Acids Res 34: 227–230.
  13. Eddy SR (1998) Profile hidden Markov models. Bioinformatics 14: 755–763.
  14. Durbin R, Eddy S, Krogh A, Mitchison G (1998) Biological sequence analysis: probabilistic models of proteins and nucleic acids. United Kingdom: Cambridge University Press. 95 p.
  15. Felsenstein J (1996) Inferring phylogenies from protein sequences by parsimony, distance, and likelihood methods. Methods Enzymol 266: 418–427.
  16. Tamura K, Dudley J, Nei M, Kumar S (2007) MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol 24(8): 1596–1599.
  17. Becker JW, Reeke GN (1985) Three-dimensional structure of β2-microglobulin. Proc Natl Acad Sci U S A 82: 4225–4229.
  18. Lundqvist ML, Appelkvist P, Hermsen T, Pilstrom L, Stet RJM (1999) Characterization of beta-2-microglobulin in a primitive fish, the Siberian sturgeon (Acipenser baeri). Immunogenetics 50(1-2): 79–83.
  19. Cannon JP, Haire RN, Litman GW (2002) Identification of diversified genes that contain immunoglobulin-like variable regions in a protochordate. Nat Immunol 3(12): 1200–1207.
  20. Benyamini H, Gunasekaran K, Wolfson H, Nussinov R (2003) β2-Microglobulin amyloidosis: insights from conservation analysis and fibril modeling by protein docking techniques. J Mol Biol 330: 159–174.
  21. Hurley IA, Mueller RL, Dunn KA, Schmidt EJ, Friedman M, et al. (2007) A new time-scale for ray-finned fish evolution. Proc Biol Sci 274: 489–498.
  22. Hansen TH, Lee DR (1997) Mechanism of class I assembly with β2-microglobulin and loading with peptide. Adv Immunol 64: 105–137.
  23. Van KL, Ashton RPG, Ploegh HL, Tonegawa S (1992) TAP1 mutant mice are deficient in antigen presentation, surface class molecules, and CD4-8+ cells. Cell 71: 1205–1214.
  24. Zijlstra M, Bix M, Simister N, Loring J, Raulet D, et al. (1990) β2-Microglobulin deficient mice lack CD4-8+ cytolytic T cells. Nature 344: 742–746.
  25. Shum BP, Azumi K, Zhang S, Kehrer SR, Raison RL, et al. (1996) Unexpected β2-microglobulin sequence diversity in individual rainbow trout. Proc Natl Acad Sci U S A 93: 2779–2784.
  26. Dixon B, Nagelkerke LA, Sibbing FA, Egberts E, Stet RJ (1996) Evolution of MHC class II beta chain-encoding genes in the Lake Tana barbel species flock (Barbus intermedius complex). Immunogenetics 44(6): 419–431.
  27. Dixon B, Van ESHM, Rodrigues PNS, Egberts E, Stet RJM (1993) Fish major histocompatibility complex genes: an expansion. Dev Comp Immunol 19: 109–133.
  28. Naruse K, Fukamachi S, Mitani H, Kondo M, Matsuoka T, et al. (2000) A detailed linkage map of medaka, Oryzias latipes: comparative genomics and genome evolution. Genetics 154(4): 1773–1784.
  29. Vihtelic TS, Fadool JM, Gao J, Thornton KA, Hyde DR, et al. (2005) Expressed sequence tag analysis of zebrafish eye tissues for NEIBank. Mol Vis 11: 1083–1100.
  30. Miska KB, Hellman L, Miller RD (2003) Characterization of β2-microglobulin coding sequence from three non-placental mammals: the duckbill platypus, the short-beaked echidna, and the grey short-tailed opossum. Dev Comp Immunol 27(3): 247–256.
  31. Riegert P, Andersen R, Bumstead N, Doehring C, Dominguez-Steglich M, et al. (1996) The chicken beta 2-microglobulin gene is located on a non-major histocompatibility complex microchromosome: a small, G+C-rich gene with X and Y boxes in the promoter. Proc Natl Acad Sci U S A 93(3): 1243–1248.
  32. Welinder KG, Jespersen HM, Walther-Rasmussen J, Skjodt K (1991) Amino acid sequences and structures of chicken and turkey β2-microglobulin. Mol Immunol 28(1-2): 177–182.
  33. Western AH, Eckery DC, Demmer J, Juengel JL, McNatty KP, et al. (2003) Expression of the FcRn receptor (and) gene homologues in the intestine of suckling brushtail possum (Trichosurus vulpecula) pouch young. Mol Immunol 39(20): 707–717.
  34. Groves ML, Greenberg R (1982) Complete amino acid sequence of bovine β2-microglobulin. J Biol Chem 257(5): 2619–2626.
  35. Ellis SA, Martin AJ (1993) Nucleotide sequence of horse β2-microglobulin cDNA. Immunogenetics 38(5): 383.
  36. Tallmadge RL, Lear TL, Johnson AK, Guerin G, Millon LV, et al. (2003) Characterization of the β2-microglobulin gene of the horse. Immunogenetics 54(10): 725–733.
  37. Daniel F, Morello D, Le Bail O, Chambon P, Cayre Y, et al. (1983) Structure and expression of the mouse beta 2-microglobulin gene isolated from somatic and non-expressing teratocarcinoma cells. EMBO J 2(7): 1061–1065.
  38. Cole T, Dickson PW, Esnard F, Averill S, Risbridger GP, et al. (1989) The cDNA structure and expression analysis of the genes for the cysteine proteinase inhibitor cystatin C and for β2-microglobulin in rat brain. Eur J Biochem 186(1-2): 35–42.
  39. Gastinel LN, Valente G, Bjorkman PJ (1992) cDNA sequence of CHO-KI hamster beta2-microglobulin. Proc Natl Acad Sci U S A 89(2): 638–642.
  40. Wolfe PB, Cebra JJ (1980) The primary structure of guinea pig β2-microglobulin. Mol Immunol 17(12): 1493–1505.
  41. Canavez FC, Ladasky JJ, Muniz JA, Seuanez HN, Parham P, et al. (1998) β2-Microglobulin in neotropical primates (Platyrrhini). Immunogenetics 48(2): 133–140.
  42. Canavez FC, Moreira MA, Simon F, Parham P, Seuanez HN (1999) Phylogenetic relationships of the Callitrichinae (Platyrrhini, primates) based on β2-microglobulin DNA sequence. Am J Primatol 48(3): 225–236.
  43. York IA, Grant EP, Dahl AM, Rock KL (2005) A mutant cell with a novel defect in MHC class I quality control. J Immunol 174(11): 6839–6846.
  44. Gussow D, Rein R, Ginjaar I, Hochstenbach F, Seeman G, et al. (1987) The human β2-microglobulin gene, primary structure and definition of the transcriptional unit. J Immunol 139: 3132–3138.
  45. Milland J, Loveland BE, McKenzie IF (1993) Isolation of a clone for pig β2-microglobulin cDNA. Immunogenetics 38(6): 464.
  46. Gates FT, Coligan JE, Kindt TJ (1979) Complete amino acid sequence of rabbit β2-microglobulin. Biochemistry 18(11): 2267–2272.