Table S2. List of initial features and the element number of each feature.

Type of properties | Features (dimension) | Sources [reference]
Basic sequence attributes | Sequence length (1); Amino acid composition (20); Di-peptide composition (400) | Locally calculated
Physicochemical and biochemical properties | Amino acid propensities (544) | Locally calculated based on the amino acid indices obtained from AAindex [1]
Physicochemical and biochemical properties | Extinction coefficient (4); Instability index (1); Aliphatic index (1); Grand average of hydropathicity (1); Isoelectric point (1); Molecular weight (1) | ProtParam [2]
Structural properties | Solvent accessibility (4); Secondary structural content (3) | NetSurfP 1.1 [3]
Structural properties | Unfoldability (1); Disordered regions (3); Global charge (1); Hydrophobicity (1) | FoldIndex [4]
Signal peptide and transmembrane topology | Signal peptide (2) | SignalP 4.0 [5]
Signal peptide and transmembrane topology | Transmembrane domains (alpha-helix and beta-barrel) (3) | TMHMM 2.0 [6]; TMB-Hunt [7]
Post-translational modifications | Phosphorylation (3) | NetPhos 2.0 [8]
Post-translational modifications | Acetylation (1) | NetAcet 1.0 [9]
Post-translational modifications | Palmitoylation (4) | CSS-Palm 3.0 [10]
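Table S2 lists the basic sequence attributes as locally calculated but gives no implementation. The following is a minimal sketch of how these 421 values (1 + 20 + 400) might be computed, assuming that compositions are expressed as fractions, that dipeptides are counted as overlapping adjacent pairs, and that non-standard residue letters are ignored in the counts. The function name `basic_sequence_features` and the residue ordering are hypothetical and are not taken from the original work.

```python
from itertools import product

# Assumption: the 20 standard residues in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs (AA, AC, ..., YY).
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def basic_sequence_features(seq):
    """Return [length] + 20 composition fractions + 400 dipeptide fractions."""
    seq = seq.upper()
    length = len(seq)

    # Fraction of each standard residue (0.0 for an empty sequence).
    aac = [seq.count(a) / length if length else 0.0 for a in AMINO_ACIDS]

    # Count overlapping adjacent pairs; pairs with non-standard letters are skipped.
    pair_counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(length - 1):
        pair = seq[i:i + 2]
        if pair in pair_counts:
            pair_counts[pair] += 1

    # Normalise by the number of adjacent pairs (at least 1 to avoid dividing by zero).
    n_pairs = max(length - 1, 1)
    dpc = [pair_counts[d] / n_pairs for d in DIPEPTIDES]

    return [float(length)] + aac + dpc


features = basic_sequence_features("ACAC")
print(len(features))
```

The returned vector concatenates the three feature groups in the order given in the table, so its length matches the summed dimensions of the first table row.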
1. Kawashima S, Pokarowski P, Pokarowska M, Kolinski A, Katayama T, et al. (2008) AAindex: amino acid index database, progress report 2008. Nucleic Acids Res 36: D202-205.
2. Gasteiger E, Hoogland C, Gattiker A, Duvaud S, Wilkins MR, et al. (2005) Protein identification and analysis tools on the ExPASy server. In: Walker JM, editor. The Proteomics Protocols Handbook. Humana Press. pp. 571-607.
3. Petersen B, Petersen TN, Andersen P, Nielsen M, Lundegaard C (2009) A generic method for assignment of reliability scores applied to solvent accessibility predictions. BMC Struct Biol 9: 51.
4. Prilusky J, Felder CE, Zeev-Ben-Mordehai T, Rydberg EH, Man O, et al. (2005) FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded. Bioinformatics 21: 3435-3438.
5. Petersen TN, Brunak S, von Heijne G, Nielsen H (2011) SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods 8: 785-786.
6. Krogh A, Larsson B, von Heijne G, Sonnhammer EL (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305: 567-580.
7. Garrow AG, Agnew A, Westhead DR (2005) TMB-Hunt: a web server to screen sequence sets for transmembrane beta-barrel proteins. Nucleic Acids Res 33: W188-192.
8. Blom N, Gammeltoft S, Brunak S (1999) Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol 294: 1351-1362.
9. Kiemer L, Bendtsen JD, Blom N (2005) NetAcet: prediction of N-terminal acetylation sites. Bioinformatics 21: 1269-1270.
10. Ren J, Wen L, Gao X, Jin C, Xue Y, et al. (2008) CSS-Palm 2.0: an updated software for palmitoylation sites prediction. Protein Eng Des Sel 21: 639-644.