Integrating Bioinformatics Tools to Handle Glycosylation

Yuliet Mazola; Glay Chinea; Alexis Musacchio

doi:10.1371/journal.pcbi.1002285

Citation: Mazola Y, Chinea G, Musacchio A (2011) Integrating Bioinformatics Tools to Handle Glycosylation. PLoS Comput Biol 7(12): e1002285. https://doi.org/10.1371/journal.pcbi.1002285

Editor: Fran Lewitter, Whitehead Institute, United States of America

Published: December 29, 2011

Copyright: © 2011 Mazola et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The authors received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

This is an original PLoS Computational Biology tutorial.

Introduction

This tutorial is intended for biologists and computational biologists interested in bioinformatics applications for studying protein glycosylation. Glycosylation is a co- and post-translational modification that involves the selective attachment of carbohydrates to proteins. Enhancing glycosylation through glycoengineering strategies has become a widely used way to improve the properties of protein therapeutics. This tutorial describes the use of bioinformatics to assist the rational design and insertion of N-glycosylation sites in proteins.

Background

Glycosylation is a co- and post-translational modification involving the covalent addition of carbohydrates to proteins. Carbohydrates (also referred to as glycans, sugars, or saccharides) adopt linear or branched structures and are composed of monosaccharides covalently linked by glycosidic bonds. There are four enzymatic glycosylation processes: N-glycosylation, O-glycosylation, C-glycosylation (or C-mannosylation), and glycosylphosphatidylinositol (GPI) anchoring (Figure 1). Glycan acceptor sites for each glycosylation type are described in Table 1. Experimental detection of occupied glycosylation sites in proteins is an expensive and laborious process [1]. As an alternative, a number of glycosylation prediction methods, as well as glycan and glycoprotein analysis tools, have been developed (Table 2 and Table 3). For a detailed description of glycobiology-related databases and software, including glycosylation predictors, the reader is referred to published reviews on the subject [2]–[5].

Figure 1. Schematic representation of glycosylation forms.

For each glycosylation type, the amino acid acceptor site is illustrated in balls and sticks: N-glycosylation (asparagine residue), O-glycosylation (serine residue), C-mannosylation (tryptophan residue), and glycosylphosphatidylinositol (GPI) anchor (C-terminal protein residue). Small balls colored in grey, red, blue, and orange represent carbon, oxygen, nitrogen, and phosphorus atoms, respectively. Hydrogen atoms are not shown. The atoms involved in glycan linkage are indicated with arrows. Glycan molecules are shown as sticks and highlighted with a yellow background. The GPI molecule is divided into three parts: phosphoethanolamine, glycan core, and phosphatidylinositol. The glycan core is composed of one non-acetylated glucosamine (GlcN) and three mannose moieties. The long fatty acid chains of the phosphatidylinositol portion are indicated with wavy lines.

https://doi.org/10.1371/journal.pcbi.1002285.g001

Table 1. General features of different glycosylation types.

https://doi.org/10.1371/journal.pcbi.1002285.t001

Table 2. Glycosylation prediction servers.

https://doi.org/10.1371/journal.pcbi.1002285.t002

Table 3. Tools for glycan and glycoprotein analysis.

https://doi.org/10.1371/journal.pcbi.1002285.t003

The Attractiveness of Modifying Protein Glycosylation

Of particular interest is the role of carbohydrates in modulating the physico-chemical and biological properties of proteins. Several glycosylation parameters are involved, including the number of glycans attached, the position of the glycosylation sites, and the glycan features (such as molecular size, sequence, and charge). Glycans can influence protein function [6]: a glycosyl chain pointing toward a binding pocket might block the cavity and hence influence the ligand binding mode and affect the biological activity of the protein (Figure 2). Carbohydrates can also increase protein stability and solubility, as well as reduce immunogenicity and susceptibility to proteolysis [7]. This explains why the rational manipulation of glycosylation parameters (glycoengineering) is widely applied to obtain proteins suited for therapeutic applications [8]. Glycoengineering can enhance in vivo activity even in proteins that do not normally contain N-glycosylation sites [9]. Protein instabilities that can be prevented by glycosylation engineering include proteolytic degradation, formation of crosslinked species, unfolding, oxidation, low solubility, aggregation, and kinetic inactivation [10].

Figure 2. Three-dimensional structures of two glycosyl hydrolase 32 (GH32) family enzymes.

Surface representation of the overall 3D structure of (A) Cichorium intybus fructan 1-exohydrolase IIa (PDB database accession code: 1ST8) and (B) Arabidopsis thaliana cell-wall invertase (PDB database accession code: 2AC1). The N- and C-terminal domains are colored in yellow and blue, respectively. The attached N-glycan molecules are represented as red sticks. The active site is shown in green. Another binding pocket, which extends between the N- and C-terminal domains, is highlighted in orange in (A). This cleft is reserved for higher DP-inulin type fructans. An open conformation of this cavity is observed in GH32 enzymes capable of degrading inulin substrates, such as C. intybus fructan 1-exohydrolase IIa (A). However, in some GH32 enzymes, such as A. thaliana invertase (B), a glycosyl chain blocks the cleft and prevents inulin binding and degradation.

https://doi.org/10.1371/journal.pcbi.1002285.g002

Rational Design and Insertion of N-glycan Sites in Proteins

One of the strategies used in glycoengineering involves the introduction of N-glycosylation sequons to increase the carbohydrate content of protein pharmaceuticals [7]. This tutorial provides a bioinformatics workflow for the rational design and insertion of N-glycan sites into a protein of interest (also referred to as the target protein) (Figure 3). A detailed description of the workflow is given below. General features and availability of non-glycobiology-related bioinformatics resources can be found in Table 4.

Figure 3. Workflow for rational design and insertion of N-glycan sites in proteins.

https://doi.org/10.1371/journal.pcbi.1002285.g003

Table 4. Software for protein sequence and tertiary structure analysis.

https://doi.org/10.1371/journal.pcbi.1002285.t004

The amino acid sequence of the target protein is the starting point of this analysis. Additional information, such as post-translational modifications, site-directed mutagenesis studies, and three-dimensional (3D) structure, is also helpful. These data can be found in the protein annotation database UniProtKB [11] and the literature database PubMed [12], respectively.

Before making any modification to the target protein sequence, one should identify the residues involved in protein function and tertiary structure. These residues should not be modified. In general, functionally and structurally relevant residues tend to be more conserved within a protein family [13]. Conserved residues are identified by multiple sequence alignment of the target protein and its homologues using, for example, the CLUSTALW server [14]. In particular, a multiple sequence alignment with diverse and divergent homologue sequences is suggested, since residues that remain conserved over a longer period of time are under stronger evolutionary constraints. The homologue proteins are identified by pairwise alignment searches using, for instance, the BLASTp server [15]. The degree of conservation at each aligned position in the multiple sequence alignment is then quantified. At this step, available tools for sequence conservation analysis, such as the AL2CO server [16], can be applied. The amino acid frequencies at each aligned position are estimated, and the conservation index is calculated from those frequencies. The AL2CO server requires the multiple sequence alignment file as input. Optionally, if a Protein Data Bank (PDB) file (atomic coordinates) of the target or any related homologue protein is also uploaded, the AL2CO server writes the calculated conservation indices into the output PDB file. Conserved motifs can then be mapped onto the 3D structure and visualized with the Visual Molecular Dynamics (VMD) software [17].
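The idea of deriving a conservation index from per-column amino acid frequencies can be sketched in a few lines of Python. This is a minimal illustration that assumes a Shannon-entropy score normalized to the 20-letter amino acid alphabet; it is not the AL2CO implementation, which offers its own weighting schemes and conservation measures. The function name `column_conservation` and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_conservation(alignment):
    """Return one conservation score per alignment column.

    Score = 1 - H / log2(20), where H is the Shannon entropy of the
    residue frequencies in the column (gap characters '-' are ignored).
    A score of 1.0 means the column is invariant.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for col in range(length):
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        if not residues:
            scores.append(0.0)  # column made only of gaps
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / math.log2(20))
    return scores


# Toy aligned sequences (hypothetical, for illustration only).
toy_alignment = ["MNGTA", "MNGSA", "MNATL", "MN-TV"]
scores = column_conservation(toy_alignment)
```

Columns with the highest scores would be the ones to leave untouched when choosing where to introduce a sequon.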

We recommend inserting N-glycan sites (Asn-x-Ser/Thr sequons) preferentially at positions where potential N-glycosylation sequons predominate in the homologue proteins. N-glycosylation sites therefore have to be predicted for both the target and the homologue proteins, using any of the available prediction servers, such as NetNGlyc, EnsembleGly, or GPP (Table 2). The GPP server takes the protein amino acid sequence as input and sends the output by email. The NetNGlyc and EnsembleGly servers accept either the protein UniProtKB/Swiss-Prot accession number or the primary amino acid sequence as input; results are shown online and are easy to interpret. Predicted N-glycan sites are mapped onto the protein sequence, each with a score representing the probability that N-glycosylation occurs. In the case of NetNGlyc, the predicted Asn-x-Ser/Thr motifs are highlighted in red, and a graph of N-glycosylation potential versus amino acid position is also given.
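As a quick local check before submitting sequences to a prediction server, the Asn-x-Ser/Thr pattern itself can be located with a regular expression. The sketch below only finds sequence motifs (with x being any residue except proline); it does not estimate occupancy and is not a substitute for predictors such as NetNGlyc. The function name `find_sequons` and the example sequence are hypothetical.

```python
import re

# N-x-S/T where x is any residue except proline. The lookahead makes
# the match zero-width, so overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return the 1-based positions of the Asn in each N-x-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# Hypothetical ungapped sequence: "NGT" and the overlapping "NNT"/"NTS"
# are reported, while "NPS" is skipped because of the proline at +1.
positions = find_sequons("MANGTPNPSANNTSK")
```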

Following the glycosylation prediction, three potential cases may emerge: (a) predicted N-glycan sites are found in both the target and the homologue proteins; (b) predicted N-glycan sites are found only in homologue proteins; and (c) no N-glycan sites are predicted either in the target protein or in homologue proteins. How to proceed?
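The three cases can be expressed as a small decision function. This is a sketch under stated assumptions: predicted sites are given as plain lists of positions, the function name `classify_case` is hypothetical, and the extra `"target_only"` outcome is added here only because the three cases in the text do not cover a target with predicted sites whose homologues have none.

```python
def classify_case(target_sites, homologue_sites):
    """Map prediction results onto cases (a), (b), or (c).

    target_sites    -- list of predicted N-glycan positions in the target
    homologue_sites -- dict mapping homologue name to its list of positions
    """
    in_target = bool(target_sites)
    in_homologues = any(homologue_sites.values())

    if in_target and in_homologues:
        return "a"  # sites in both target and homologues
    if in_homologues:
        return "b"  # sites only in homologues
    if not in_target:
        return "c"  # no sites predicted anywhere
    return "target_only"  # not one of the three cases described in the text


# Hypothetical prediction results.
case = classify_case([], {"homologue_1": [45, 102], "homologue_2": []})
```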

In case (a), the existing Asn-x-Ser/Thr sequons are optimized by replacing the residue at position +1 (Asn occupies position 0) or the residues surrounding the sequon. Statistical analyses of occupied and non-occupied N-glycosylation sites revealed that the amino acids at position +1 and near the N-glycan sequon modulate the occurrence of N-glycosylation (Table 5). Suggested amino acid substitutions are: (i) aromatic amino acids (phenylalanine, tyrosine, and tryptophan) at positions −2 and −1, (ii) small nonpolar amino acids (glycine, alanine, and valine) at position +1, and (iii) bulky hydrophobic amino acids (leucine, isoleucine, and methionine) at positions +3 to +5 (Figure 4). The statistical analysis of amino acids neighboring N-glycosylation sites in the protein primary sequence and tertiary structure can be conducted using the GlySeq and GlyVicinity software, respectively [18].
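The substitution suggestions for case (a) can be turned into a simple checklist for an existing sequon. The sketch below only counts how many neighboring positions already carry a residue from the suggested groups; it is a hypothetical helper (`context_matches`), not a validated occupancy score, and the example sequence is made up.

```python
AROMATIC = set("FYW")            # suggested at positions -2 and -1
SMALL_NONPOLAR = set("GAV")      # suggested at position +1
BULKY_HYDROPHOBIC = set("LIM")   # suggested at positions +3 to +5


def context_matches(sequence, asn_index):
    """Count suggested residues around a sequon.

    asn_index is the 0-based index of the Asn (position 0 of the sequon).
    Offsets that fall outside the sequence are simply not counted.
    """
    def residue(offset):
        i = asn_index + offset
        return sequence[i] if 0 <= i < len(sequence) else None

    return {
        "aromatic_upstream": sum(residue(o) in AROMATIC for o in (-2, -1)),
        "small_nonpolar_plus1": int(residue(1) in SMALL_NONPOLAR),
        "bulky_downstream": sum(residue(o) in BULKY_HYDROPHOBIC for o in (3, 4, 5)),
    }


# Hypothetical sequence with Asn at 0-based index 3.
report = context_matches("AFWNGTLIMK", 3)
```

Positions with a count of zero are the natural candidates for the substitutions listed above, provided they are not conserved.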

Figure 4. Amino acid preferences in occupied N-glycan sites.

The sequence logo displays residues preferentially placed at occupied N-glycan sequons. Neighboring residues located downstream (positions +3 to +5) and upstream (positions −1 and −2) from the asparagine residue (position 0) are also shown. The size of each letter represents the residue prevalence at the putative position. For example, threonine residue is preferred over serine, at position +2. The WebLogo server [29] was used to generate the sequence logo.

https://doi.org/10.1371/journal.pcbi.1002285.g004

Table 5. Comparative studies for occupied and non-occupied N-glycan sites.

https://doi.org/10.1371/journal.pcbi.1002285.t005

In case (b), a sequence pattern such as Asn-x-Ser or Asn-x-Thr is inserted in the target protein. Threonine is strongly preferred over serine at position +2. This agrees with the observation that replacing serine with threonine in the sequon results in an overall increase in occupancy [19]. Suggestions for position +1 are: (i) an amino acid that is highly conserved at position +1 within the homologue proteins may be kept, unless it is proline, and (ii) small nonpolar amino acids (glycine, alanine, and valine) at position +1 increase the probability of sequon occupancy [20].
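Introducing a sequon into the target sequence can be sketched as two point substitutions. The example assumes that the sequon is created by substitution (Asn at the chosen position, Thr or Ser at +2) rather than by adding residues, keeps the existing +1 residue, and refuses to proceed when that residue is proline. The function name `insert_sequon` and the sequences are hypothetical.

```python
def insert_sequon(sequence, asn_index, plus2="T"):
    """Return a copy of sequence carrying an N-x-S/T sequon.

    asn_index -- 0-based index that becomes Asn (position 0)
    plus2     -- residue placed at +2; "T" by default, since threonine
                 is preferred over serine at that position
    """
    if plus2 not in ("S", "T"):
        raise ValueError("position +2 must be S or T")
    if asn_index < 0 or asn_index + 2 >= len(sequence):
        raise ValueError("sequon does not fit inside the sequence")
    if sequence[asn_index + 1] == "P":
        raise ValueError("proline at position +1; choose another site or replace it")

    residues = list(sequence)
    residues[asn_index] = "N"
    residues[asn_index + 2] = plus2
    return "".join(residues)


# Hypothetical target fragment: A -> N at index 2 and Q -> T at index 4.
mutant = insert_sequon("MKAGQLV", 2)
```

The resulting sequence is what would later be submitted to a homology modeling tool.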

In case (c), the secondary structure has to be analyzed so that N-glycan sites can be inserted at or just after changes in protein secondary structure. Glycosylation sites are frequently found at points where the secondary structure changes, with a bias toward turns and bends [19]. Secondary structure features are described in the PDB file. If no 3D structure is available, the secondary structure can be predicted using, for example, the PSIPRED server [21]. Only the primary amino acid sequence is required as input.
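Candidate positions for case (c) can be listed from a per-residue secondary structure string. The sketch assumes a three-state string (H for helix, E for strand, C for coil), one character per residue, as produced by secondary structure predictors; the function name `transition_positions` and the example string are hypothetical.

```python
def transition_positions(secondary_structure):
    """Return 1-based positions where the state differs from the previous residue.

    Each reported position is the first residue of a new secondary
    structure segment, i.e. a point "at or just after" a change.
    """
    ss = secondary_structure
    return [i + 1 for i in range(1, len(ss)) if ss[i] != ss[i - 1]]


# Hypothetical three-state assignment for a 12-residue fragment.
candidates = transition_positions("CCHHHHCCEEEC")
```

These positions still have to be filtered against the conserved residues identified earlier before a sequon is introduced.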

Inserting N-glycosylation sites in the primary structure of the target protein favors the attachment of N-glycan molecules. Analysis and visualization of the resulting glycoprotein is then also helpful. The tertiary structure of a glycoprotein with attached N-glycans can be modeled using the GlyProt server [22]. This facilitates the identification of spatially unfavorable N-glycosylation sites [6].

The 3D glycan structures are provided in the GlyProt server database; they can also be built using the SWEET-II [23], Glydict [24], and Shape [25] software. As input, the GlyProt server requires a 3D protein structure, that is, the atomic coordinate file of the modified target protein. A 3D structure model therefore has to be built, using the structure of the native target protein or a related homologue as a template. The sequence used to build the 3D model has to contain the inserted N-glycan sequons; homology modeling software such as MODELLER [26] or the online SWISS-MODEL server [27] can be used for this purpose.

Finally, molecular dynamics simulations could be applied to explore conformational changes of the protein backbone using, for example, the GROMACS software [28]. This strategy allows for the refinement of the initial glycoprotein structure. All of the bioinformatics software mentioned above is freely available. An example of the application of the workflow presented in this manuscript is available in Supporting Information (Text S1 and Figures S1, S2, S3, S4).

Concluding Remarks

This brief survey presented a workflow integrating available bioinformatics resources to assist the engineering of protein glycosylation. In particular, it described the rational manipulation of the native N-glycosylation pattern with in silico tools. Applied at the early stages of glycoengineering, the bioinformatics strategy described in this tutorial can help the design and insertion of N-glycan sites in proteins, reducing time, effort, and cost.

Supporting Information

Figure S1.

Protein tertiary structure.

https://doi.org/10.1371/journal.pcbi.1002285.s001

(TIF)

Figure S2.

Multiple sequence alignment.

https://doi.org/10.1371/journal.pcbi.1002285.s002

(PDF)

Figure S3.

Pairwise sequence alignment.

https://doi.org/10.1371/journal.pcbi.1002285.s003

(PDF)

Figure S4.

Protein tertiary structure with modeled N-glycans.

https://doi.org/10.1371/journal.pcbi.1002285.s004

(TIF)

Text S1.

Supporting information text.

https://doi.org/10.1371/journal.pcbi.1002285.s005

(DOC)

References

1. Zaia J (2008) Mass spectrometry and the emerging field of glycomics. Chem Biol 15: 881–892.
2. von der Lieth CW, Bohne-Lang A, Lohmann KK, Frank M (2004) Bioinformatics for glycomics: status, methods, requirements and perspectives. Brief Bioinform 5: 164–178.
3. Mahal LK (2008) Glycomics: towards bioinformatic approaches to understanding glycosylation. Anticancer Agents Med Chem 8: 37–51.
4. Aoki-Kinoshita KF (2008) An introduction to bioinformatics for glycomics research. PLoS Comput Biol 4: e1000075.
5. Frank M, Schloissnig S (2010) Bioinformatics and molecular modeling in glycobiology. Cell Mol Life Sci 67: 2749–2772.
6. Le Roy K, Verhaest M, Rabijns A, Clerens S, Van Laere A, et al. (2007) N-glycosylation affects substrate specificity of chicory fructan 1-exohydrolase: evidence for the presence of an inulin binding cleft. New Phytol 176: 317–324.
7. Sinclair AM, Elliott S (2005) Glycoengineering: the effect of glycosylation on the properties of therapeutic proteins. J Pharm Sci 94: 1626–1635.
8. Sola RJ, Griebenow K (2010) Glycosylation of therapeutic proteins: an effective strategy to optimize efficacy. BioDrugs 24: 9–21.
9. Elliott S, Lorenzini T, Asher S, Aoki K, Brankow D, et al. (2003) Enhancement of therapeutic protein in vivo activities through glycoengineering. Nat Biotechnol 21: 414–421.
10. Sola RJ, Griebenow K (2009) Effects of glycosylation on the stability of protein pharmaceuticals. J Pharm Sci 98: 1223–1245.
11. The UniProt Consortium (2011) Ongoing and future developments at the Universal Protein Resource. Nucleic Acids Res 39: D214–D219.
12. National Center for Biotechnology Information (2011) PubMed database. Available: http://www.ncbi.nlm.nih.gov/pubmed. Accessed 15 April 2011.
13. Mirny LA, Shakhnovich EI (1999) Universally conserved positions in protein folds: reading evolutionary signals about stability, folding kinetics and function. J Mol Biol 291: 177–196.
14. Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673–4680.
15. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403–410.
16. Pei J, Grishin NV (2001) AL2CO: calculation of positional conservation in a protein sequence alignment. Bioinformatics 17: 700–712.
17. Humphrey W, Dalke A, Schulten K (1996) VMD: visual molecular dynamics. J Mol Graph 14: 33–38.
18. Lutteke T, Frank M, von der Lieth CW (2005) Carbohydrate Structure Suite (CSS): analysis of carbohydrate 3D structures derived from the PDB. Nucleic Acids Res 33: D242–D246.
19. Petrescu AJ, Milac AL, Petrescu SM, Dwek RA, Wormald MR (2004) Statistical analysis of the protein environment of N-glycosylation sites: implications for occupancy, structure, and folding. Glycobiology 14: 103–114.
20. Yurist-Doutsch S, Chaban B, VanDyke DJ, Jarrell KF, Eichler J (2008) Sweet to the extreme: protein glycosylation in Archaea. Mol Microbiol 68: 1079–1084.
21. McGuffin LJ, Bryson K, Jones DT (2000) The PSIPRED protein structure prediction server. Bioinformatics 16: 404–405.
22. Bohne-Lang A, von der Lieth CW (2005) GlyProt: in silico glycosylation of proteins. Nucleic Acids Res 33: W214–W219.
23. Bohne A, Lang E, von der Lieth C-W (1998) W3-SWEET: Carbohydrate Modeling By Internet. J Mol Model 4: 33–43.
24. Frank M, Bohne-Lang A, Wetter T, von der Lieth CW (2002) Rapid generation of a representative ensemble of N-glycan conformations. In Silico Biol 2: 427–439.
25. Rosen J, Miguet L, Pérez S (2009) Shape: automatic conformation prediction of carbohydrates using a genetic algorithm. J Cheminf 1: 1–7.
26. Fiser A, Sali A (2003) Modeller: generation and refinement of homology-based protein structure models. Methods Enzymol 374: 461–491.
27. Schwede T, Kopp J, Guex N, Peitsch MC (2003) SWISS-MODEL: An automated protein homology-modeling server. Nucleic Acids Res 31: 3381–3385.
28. Van Der Spoel D, Lindahl E, Hess B, Groenhof G, Mark AE, et al. (2005) GROMACS: fast, flexible, and free. J Comput Chem 26: 1701–1718.
29. Crooks GE, Hon G, Chandonia JM, Brenner SE (2004) WebLogo: a sequence logo generator. Genome Res 14: 1188–1190.
30. Kowarik M, Young NM, Numao S, Schulz BL, Hug I, et al. (2006) Definition of the bacterial N-glycosylation site consensus sequence. EMBO J 25: 1957–1966.
31. Schaffer C, Graninger M, Messner P (2001) Prokaryotic glycosylation. Proteomics 1: 248–261.
32. Gupta R, Brunak S (2002) Prediction of glycosylation across the human proteome and the correlation to protein function. Pac Symp Biocomput 310–322.
33. Nothaft H, Szymanski CM (2010) Protein glycosylation in bacteria: sweeter than ever. Nat Rev Microbiol 8: 765–778.
34. Gentzsch M, Tanner W (1997) Protein-O-glycosylation in yeast: protein-specific mannosyltransferases. Glycobiology 7: 481–486.
35. Julenius K (2007) NetCGlyc 1.0: prediction of mammalian C-mannosylation sites. Glycobiology 17: 868–876.
36. Krieg J, Hartmann S, Vicentini A, Glasner W, Hess D, et al. (1998) Recognition signal for C-mannosylation of Trp-7 in RNase 2 consists of sequence Trp-x-x-Trp. Mol Biol Cell 9: 301–309.
37. Hofsteenge J, Blommers M, Hess D, Furmanek A, Miroshnichenko O (1999) The four terminal components of the complement system are C-mannosylated on multiple tryptophan residues. J Biol Chem 274: 32786–32794.
38. Zanetta JP, Pons A, Richet C, Huet G, Timmerman P, et al. (2004) Quantitative gas chromatography/mass spectrometry determination of C-mannosylation of tryptophan residues in glycoproteins. Anal Biochem 329: 199–206.
39. Brazier-Hicks M, Evans KM, Gershater MC, Puschmann H, Steel PG, et al. (2009) The C-glycosylation of flavonoids in cereals. J Biol Chem 284: 17926–17934.
40. Kobayashi T, Nishizaki R, Ikezawa H (1997) The presence of GPI-linked protein(s) in an archaeobacterium, Sulfolobus acidocaldarius, closely related to eukaryotes. Biochim Biophys Acta 1334: 1–4.
41. Ikezawa H (2002) Glycosylphosphatidylinositol (GPI)-anchored proteins. Biol Pharm Bull 25: 409–417.
42. Orlean P, Menon AK (2007) Thematic review series: lipid posttranslational modifications. GPI anchoring of protein in yeast and mammalian cells, or: how we learned to stop worrying and love glycophospholipids. J Lipid Res 48: 993–1011.
43. Roitsch T, Lehle L (1989) Structural requirements for protein N-glycosylation. Influence of acceptor peptides on cotranslational glycosylation of yeast invertase and site-directed mutagenesis around a sequon sequence. Eur J Biochem 181: 525–529.
44. Shakin-Eshleman SH, Spitalnik SL, Kasturi L (1996) The amino acid at the X position of an Asn-X-Ser sequon is an important determinant of N-linked core-glycosylation efficiency. J Biol Chem 271: 6363–6366.
45. Kasturi L, Chen H, Shakin-Eshleman SH (1997) Regulation of N-linked core glycosylation: use of a site-directed mutagenesis approach to identify Asn-Xaa-Ser/Thr sequons that are poor oligosaccharide acceptors. Biochem J 323(Pt 2): 415–419.
46. Mellquist JL, Kasturi L, Spitalnik SL, Shakin-Eshleman SH (1998) The amino acid following an asn-X-Ser/Thr sequon is an important determinant of N-linked core glycosylation efficiency. Biochemistry 37: 6833–6837.
47. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, et al. (2000) The Protein Data Bank. Nucleic Acids Res 28: 235–242.
48. Christlet TH, Biswas M, Veluraja K (1999) A database analysis of potential glycosylating Asn-X-Ser/Thr consensus sequences. Acta Crystallogr D Biol Crystallogr 55: 1414–1420.
49. Ben Dor S, Esterman N, Rubin E, Sharon N (2004) Biases and complex patterns in the residues flanking protein N-glycosylation sites. Glycobiology 14: 95–101.

[ref1] 1. Zaia J (2008) Mass spectrometry and the emerging field of glycomics. Chem Biol 15: 881–892.
View Article
Google Scholar

[2] View Article

[3] Google Scholar

[ref2] 2. der Lieth CW, Bohne-Lang A, Lohmann KK, Frank M (2004) Bioinformatics for glycomics: status, methods, requirements and perspectives. Brief Bioinform 5: 164–178.
View Article
Google Scholar

[5] View Article

[6] Google Scholar

[ref3] 3. Mahal LK (2008) Glycomics: towards bioinformatic approaches to understanding glycosylation. Anticancer Agents Med Chem 8: 37–51.
View Article
Google Scholar

[8] View Article

[9] Google Scholar

[ref4] 4. Aoki-Kinoshita KF (2008) An introduction to bioinformatics for glycomics research. PLoS Comput Biol 4: e1000075.
View Article
Google Scholar

[11] View Article

[12] Google Scholar

[ref5] 5. Frank M, Schloissnig S (2010) Bioinformatics and molecular modeling in glycobiology. Cell Mol Life Sci 67: 2749–2772.
View Article
Google Scholar

[14] View Article

[15] Google Scholar

[ref6] 6. Le Roy K, Verhaest M, Rabijns A, Clerens S, Van Laere A, et al. (2007) N-glycosylation affects substrate specificity of chicory fructan 1-exohydrolase: evidence for the presence of an inulin binding cleft. New Phytol 176: 317–324.
View Article
Google Scholar

[17] View Article

[18] Google Scholar

[ref7] 7. Sinclair AM, Elliott S (2005) Glycoengineering: the effect of glycosylation on the properties of therapeutic proteins. J Pharm Sci 94: 1626–1635.
View Article
Google Scholar

[20] View Article

[21] Google Scholar

[ref8] 8. Sola RJ, Griebenow K (2010) Glycosylation of therapeutic proteins: an effective strategy to optimize efficacy. BioDrugs 24: 9–21.
View Article
Google Scholar

[23] View Article

[24] Google Scholar

[ref9] 9. Elliott S, Lorenzini T, Asher S, Aoki K, Brankow D, et al. (2003) Enhancement of therapeutic protein in vivo activities through glycoengineering. Nat Biotechnol 21: 414–421.
View Article
Google Scholar

[26] View Article

[27] Google Scholar

[ref10] 10. Sola RJ, Griebenow K (2009) Effects of glycosylation on the stability of protein pharmaceuticals. J Pharm Sci 98: 1223–1245.
View Article
Google Scholar

[29] View Article

[30] Google Scholar

[ref11] 11. The UniProt Consortium (2011) Ongoing and future developments at the Universal Protein Resource. Nucleic Acids Res 39: D214–D219.
View Article
Google Scholar

[32] View Article

[33] Google Scholar

[ref12] 12. National Center for Biotechnology Information (2011) PubMed database. Available: http://www.ncbi.nlm.nih.gov/pubmed. Accessed 15 April 2011.

[ref13] 13. Mirny LA, Shakhnovich EI (1999) Universally conserved positions in protein folds: reading evolutionary signals about stability, folding kinetics and function. J Mol Biol 291: 177–196.
View Article
Google Scholar

[36] View Article

[37] Google Scholar

[ref14] 14. Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673–4680.
View Article
Google Scholar

[39] View Article

[40] Google Scholar

[ref15] 15. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403–410.
View Article
Google Scholar

[42] View Article

[43] Google Scholar

[ref16] 16. Pei J, Grishin NV (2001) AL2CO: calculation of positional conservation in a protein sequence alignment. Bioinformatics 17: 700–712.
View Article
Google Scholar

[45] View Article

[46] Google Scholar

[ref17] 17. Humphrey W, Dalke A, Schulten K (1996) VMD: visual molecular dynamics. J Mol Graph 14: 33–38.
View Article
Google Scholar

[48] View Article

[49] Google Scholar

[ref18] 18. Lutteke T, Frank M, der Lieth CW (2005) Carbohydrate Structure Suite (CSS): analysis of carbohydrate 3D structures derived from the PDB. Nucleic Acids Res 33: D242–D246.
View Article
Google Scholar

[51] View Article

[52] Google Scholar

[ref19] 19. Petrescu AJ, Milac AL, Petrescu SM, Dwek RA, Wormald MR (2004) Statistical analysis of the protein environment of N-glycosylation sites: implications for occupancy, structure, and folding. Glycobiology 14: 103–114.
View Article
Google Scholar

[54] View Article

[55] Google Scholar

[ref20] 20. Yurist-Doutsch S, Chaban B, VanDyke DJ, Jarrell KF, Eichler J (2008) Sweet to the extreme: protein glycosylation in Archaea. Mol Microbiol 68: 1079–1084.
View Article
Google Scholar

[57] View Article

[58] Google Scholar

[ref21] 21. McGuffin LJ, Bryson K, Jones DT (2000) The PSIPRED protein structure prediction server. Bioinformatics 16: 404–405.
View Article
Google Scholar

[60] View Article

[61] Google Scholar

[ref22] 22. Bohne-Lang A, der Lieth CW (2005) GlyProt: in silico glycosylation of proteins. Nucleic Acids Res 33: W214–W219.
View Article
Google Scholar

[63] View Article

[64] Google Scholar

[ref23] 23. Bohne A, Lang E, von der Lieth C-W (1998) W3-SWEET: Carbohydrate Modeling By Internet. J Mol Model 4: 33–43.
View Article
Google Scholar

[66] View Article

[67] Google Scholar

[ref24] 24. Frank M, Bohne-Lang A, Wetter T, Lieth CW (2002) Rapid generation of a representative ensemble of N-glycan conformations. In Silico Biol 2: 427–439.
View Article
Google Scholar

[69] View Article

[70] Google Scholar

[ref25] 25. Rosen J, Miguet L, Pérez S (2009) Shape: automatic conformation prediction of carbohydrates using a genetic algorithm. J Cheminf 1: 1–7.
View Article
Google Scholar

[72] View Article

[73] Google Scholar

[ref26] 26. Fiser A, Sali A (2003) Modeller: generation and refinement of homology-based protein structure models. Methods Enzymol 374: 461–491.
View Article
Google Scholar

[75] View Article

[76] Google Scholar

[ref27] 27. Schwede T, Kopp J, Guex N, Peitsch MC (2003) SWISS-MODEL: An automated protein homology-modeling server. Nucleic Acids Res 31: 3381–3385.
View Article
Google Scholar

[78] View Article

[79] Google Scholar

[ref28] 28. Van Der SD, Lindahl E, Hess B, Groenhof G, Mark AE, et al. (2005) GROMACS: fast, flexible, and free. J Comput Chem 26: 1701–1718.
View Article
Google Scholar

[81] View Article

[82] Google Scholar

[ref29] 29. Crooks GE, Hon G, Chandonia JM, Brenner SE (2004) WebLogo: a sequence logo generator. Genome Res 14: 1188–1190.
View Article
Google Scholar

[84] View Article

[85] Google Scholar

[ref30] 30. Kowarik M, Young NM, Numao S, Schulz BL, Hug I, et al. (2006) Definition of the bacterial N-glycosylation site consensus sequence. EMBO J 25: 1957–1966.
View Article
Google Scholar

[87] View Article

[88] Google Scholar

[ref31] 31. Schaffer C, Graninger M, Messner P (2001) Prokaryotic glycosylation. Proteomics 1: 248–261.
View Article
Google Scholar

[90] View Article

[91] Google Scholar

[ref32] 32. Gupta R, Brunak S (2002) Prediction of glycosylation across the human proteome and the correlation to protein function. Pac Symp Biocomput 310–322.
View Article
Google Scholar

[93] View Article

[94] Google Scholar

[ref33] 33. Nothaft H, Szymanski CM (2010) Protein glycosylation in bacteria: sweeter than ever. Nat Rev Microbiol 8: 765–778.

[ref34] 34. Gentzsch M, Tanner W (1997) Protein-O-glycosylation in yeast: protein-specific mannosyltransferases. Glycobiology 7: 481–486.

[ref35] 35. Julenius K (2007) NetCGlyc 1.0: prediction of mammalian C-mannosylation sites. Glycobiology 17: 868–876.

[ref36] 36. Krieg J, Hartmann S, Vicentini A, Glasner W, Hess D, et al. (1998) Recognition signal for C-mannosylation of Trp-7 in RNase 2 consists of sequence Trp-x-x-Trp. Mol Biol Cell 9: 301–309.

[ref37] 37. Hofsteenge J, Blommers M, Hess D, Furmanek A, Miroshnichenko O (1999) The four terminal components of the complement system are C-mannosylated on multiple tryptophan residues. J Biol Chem 274: 32786–32794.

[ref38] 38. Zanetta JP, Pons A, Richet C, Huet G, Timmerman P, et al. (2004) Quantitative gas chromatography/mass spectrometry determination of C-mannosylation of tryptophan residues in glycoproteins. Anal Biochem 329: 199–206.

[ref39] 39. Brazier-Hicks M, Evans KM, Gershater MC, Puschmann H, Steel PG, et al. (2009) The C-glycosylation of flavonoids in cereals. J Biol Chem 284: 17926–17934.

[ref40] 40. Kobayashi T, Nishizaki R, Ikezawa H (1997) The presence of GPI-linked protein(s) in an archaeobacterium, Sulfolobus acidocaldarius, closely related to eukaryotes. Biochim Biophys Acta 1334: 1–4.

[ref41] 41. Ikezawa H (2002) Glycosylphosphatidylinositol (GPI)-anchored proteins. Biol Pharm Bull 25: 409–417.

[ref42] 42. Orlean P, Menon AK (2007) Thematic review series: lipid posttranslational modifications. GPI anchoring of protein in yeast and mammalian cells, or: how we learned to stop worrying and love glycophospholipids. J Lipid Res 48: 993–1011.

[ref43] 43. Roitsch T, Lehle L (1989) Structural requirements for protein N-glycosylation. Influence of acceptor peptides on cotranslational glycosylation of yeast invertase and site-directed mutagenesis around a sequon sequence. Eur J Biochem 181: 525–529.

[ref44] 44. Shakin-Eshleman SH, Spitalnik SL, Kasturi L (1996) The amino acid at the X position of an Asn-X-Ser sequon is an important determinant of N-linked core-glycosylation efficiency. J Biol Chem 271: 6363–6366.

[ref45] 45. Kasturi L, Chen H, Shakin-Eshleman SH (1997) Regulation of N-linked core glycosylation: use of a site-directed mutagenesis approach to identify Asn-Xaa-Ser/Thr sequons that are poor oligosaccharide acceptors. Biochem J 323(Pt 2): 415–419.

[ref46] 46. Mellquist JL, Kasturi L, Spitalnik SL, Shakin-Eshleman SH (1998) The amino acid following an Asn-X-Ser/Thr sequon is an important determinant of N-linked core glycosylation efficiency. Biochemistry 37: 6833–6837.

[ref47] 47. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, et al. (2000) The Protein Data Bank. Nucleic Acids Res 28: 235–242.

[ref48] 48. Christlet TH, Biswas M, Veluraja K (1999) A database analysis of potential glycosylating Asn-X-Ser/Thr consensus sequences. Acta Crystallogr D Biol Crystallogr 55: 1414–1420.

[ref49] 49. Ben Dor S, Esterman N, Rubin E, Sharon N (2004) Biases and complex patterns in the residues flanking protein N-glycosylation sites. Glycobiology 14: 95–101.