Proteomic Analysis of Tardigrades: Towards a Better Understanding of Molecular Mechanisms by Anhydrobiotic Organisms

Background Tardigrades are small, multicellular invertebrates which are able to survive times of unfavourable environmental conditions using their well-known capability to undergo cryptobiosis at any stage of their life cycle. Milnesium tardigradum has become a powerful model system for the analysis of cryptobiosis. While some genetic information is already available for Milnesium tardigradum, the proteome is still to be discovered.
Principal Findings Here we present, to the best of our knowledge, the first comprehensive study of Milnesium tardigradum on the protein level. To establish a proteome reference map we developed optimized protocols for protein extraction from tardigrades in the active state and for separation of proteins by high resolution two-dimensional gel electrophoresis. Since only limited sequence information of M. tardigradum on the genome and gene expression level is available to date in public databases, we initiated in parallel a tardigrade EST sequencing project to allow for protein identification by electrospray ionization tandem mass spectrometry. 271 out of 606 analyzed protein spots could be identified by searching against the publicly available NCBInr database as well as our newly established tardigrade protein database, corresponding to 144 unique proteins. Another 150 spots could be identified in the tardigrade clustered EST database, corresponding to 36 unique contigs and ESTs. Proteins with annotated function were further categorized in more detail by their molecular function, biological process and cellular component. For the proteins of unknown function more information could be obtained by performing a protein domain annotation analysis. Our results include proteins such as members of different heat shock protein families and LEA group 3, which might play important roles in surviving extreme conditions.
Conclusions The proteome reference map of Milnesium tardigradum provides the basis for further studies in order to identify and characterize the biochemical mechanisms of tolerance to extreme desiccation. The optimized proteomics workflow will enable application of sensitive quantification techniques to detect differences in protein expression, which are characteristic of the active and anhydrobiotic states of tardigrades.


Introduction
Many organisms are exposed to unfavourable, stressful environmental conditions, either permanently or for just certain periods of their lives. To survive these extreme conditions, they possess different mechanisms. One amazing adaptation is anhydrobiosis (from the Greek for ''life without water''), which has puzzled scientists for more than 300 years. The Dutch microscopist Anton van Leeuwenhoek (1702) gave the first formal description of this phenomenon. He reported the revival of ''animalcules'' from rehydrated moss samples. In extreme states of dehydration, anhydrobiotic invertebrates undergo a metabolic dormancy, in which metabolism decreases to a non-measurable level and life comes to a reversible standstill until activity is resumed under more favourable conditions [1]. Tardigrades are among the best known anhydrobiotic organisms. They remain in their active form when they are surrounded by at least a film of water. Anhydrobiosis occurs when they lose most of their free and bound water (>95%) [2]. Tardigrades begin to contract their bodies and change their body structure into a so-called tun state (Figure 1). In the dry state these organisms are highly resistant to environmental challenge and they may remain dormant for a long period of time. Based on their amazing capability to undergo anhydrobiosis, tardigrades colonise a diversity of extreme habitats [3], and they are able to tolerate harsh environmental conditions in any developmental state [4]. Possessing the ability to enter anhydrobiosis at any stage of the life cycle, tardigrades can extend their lifespan significantly [4,5]. Additionally, in the anhydrobiotic state, tardigrades are extraordinarily tolerant to physical extremes including high and subzero temperatures [6,7,8], high pressure [6,9], and extreme levels of ionizing radiation [10,11].
Interestingly, tardigrades are even able to survive space vacuum (imposing extreme desiccation) and some specimens have even recovered after combined exposure to space vacuum and solar radiation [12].
Anhydrobiosis seems to be the result of dynamic processes and appears to be mediated by protective systems that prevent lethal damage and repair systems. However, the molecular mechanisms of these processes are still poorly understood. Up to now investigations of mechanisms of desiccation tolerance have focused mainly on sugar metabolisms, stress proteins and a family of hydrophilic proteins called LEA (late embryogenesis abundant). The presence of non-reducing trehalose and its expression during anhydrobiosis has been reported for different anhydrobiotic species [13,14], which indicates the important role of trehalose in anhydrobiosis. However, the existence of anhydrobiotic animals that exhibit excellent desiccation tolerance without having disaccharides in their system [15,16] shows that sugars alone do not sufficiently explain these phenomena.
Milnesium tardigradum Doyère (1840) is a very well known species of carnivorous tardigrade. Different aspects of the life history of this species have been already described [17]. While some genetic studies of M. tardigradum exist [18] almost nothing is known about the proteome. Partial sequences of three heat shock protein (hsp70 family) genes and the housekeeping gene beta-actin have been described [18] and the relation of hsp70 expression to desiccation tolerance could be shown by real time PCR [18] and by de novo protein synthesis [6]. Since no trehalose could be detected in M. tardigradum [19], investigating proteins and posttranslational modifications is of particular importance to clarify surviving mechanisms during desiccation.
To gain insight into the unique adaptation capabilities of tardigrades on the protein level we aimed to establish a comprehensive proteome reference map of active M. tardigradum employing optimized protocols for protein extraction, generation of high-resolution 2D gels and high-throughput protein identification by electrospray ionization tandem mass spectrometry (ESI-MS/MS). The proteome reference map of M. tardigradum provides the basis for further studies in order to understand important physiological processes such as anhydrobiosis and stress resistance. The optimized proteomics workflow will enable application of sensitive quantification techniques to detect differences in protein expression, which are characteristic of active and anhydrobiotic states. Thus, our proteomic approach together with in-depth bioinformatic analysis will provide valuable information towards solving the more than 300-year-old puzzle of anhydrobiosis.

Preparation of Protein Extracts from Active Tardigrades
To establish and optimize a reliable and robust protocol for the extraction of proteins from tardigrades in the active state we applied different workup protocols and evaluated them by one-dimensional (1D) gel electrophoresis. Figure 2 shows the separation of protein extracts from whole tardigrades without any precipitation step (lane 2), after trichloroacetic acid/acetone precipitation (lane 3), after chloroform/methanol precipitation (lane 4) and after using a commercially available clean-up kit (lane 5). When using trichloroacetic acid/acetone precipitation we lost many proteins, especially in the low molecular weight range. Chloroform/methanol precipitation and application of the clean-up kit delivered satisfactory results, but using the whole protein lysate directly without any further purification also resulted in high yields across the entire molecular weight range. This workup protocol was therefore used throughout our proteome study. To evaluate the quality of our protocol, especially with respect to proteolysis, we performed Western blot analysis to detect any protein degradation. Since no tardigrade proteins had been identified so far, we chose two polyclonal antibodies directed against the highly conserved proteins actin and alpha-tubulin. As shown in Figure 3A and 3B, both proteins could be detected at their expected molecular weights of approx. 40 and 50 kDa, respectively, which is in agreement with the protein bands of the control lysate of HeLa cells. Importantly, no protein degradation could be observed during our sample preparation.

Two Dimensional Gel Electrophoresis (2-DE)
The establishment of an optimized workup protocol was a prerequisite for high quality 2D gels from tardigrades in the active state. The proteomics workflow is depicted in Figure 4. One important step in the workflow is the collection and preparation of the samples. To avoid contamination with food organisms, tardigrades were washed several times and starved for 3 days. Direct homogenization and sonication of deep-frozen tardigrades in ice cold lysis buffer without any previous precipitation step yielded protein extracts which were separated by high resolution 2D gel electrophoresis. For maximal resolution of protein spots and high loading capacity (330 µg proteins) we used pI 3-11 NL strips (24 cm) for the first dimension. Thus, high resolution separation could be achieved in the acidic as well as in the basic pH range, as shown in the image of the silver stained preparative gel of whole protein extract (Figure 5).
Approximately 1000 protein spots were automatically detected on the 2D gel image using the Proteomweaver image software. A total of 606 protein spots were picked from the silver stained gel. These spots were digested with trypsin and after extraction of the tryptic peptides from the gel plugs peptide mixtures were analyzed by nanoLC-ESI-MS/MS.

Protein Identification
Identification of proteins depends on the representation of the sequence or a close homologue in the database. Since almost no genome or EST sequences of M. tardigradum are available to date in public databases we initiated the tardigrade EST sequencing project as outlined in Figure 4 (Mali et al., submitted data). A cDNA library was prepared from tardigrades in different states (active, inactive, transition states). The cDNAs were sequenced as ESTs and clustered. Thereby, we obtained a nucleotide database containing 818 contigs and 2500 singlets. cDNA sequencing and generation of ESTs are still ongoing; thus the sequence coverage of M. tardigradum in the database is incomplete.
For protein identification we used the following databases: the database of M. tardigradum containing the clustered ESTs as outlined above; the tardigrade protein database, which was translated from the clustered EST database and thus represents a subdatabase containing only annotated proteins with known function; and the publicly available NCBInr database. The selected 606 spots from the 2D gel correspond to some highly expressed proteins, but mostly to spots in the medium and low expression range. A total of 271 spots could be identified from the tardigrade protein and the NCBInr databases. Figure 6 demonstrates how identified proteins are distributed among these two databases. 56 unique proteins were successfully identified by searching the NCBInr database. These are proteins which are either highly conserved among different species, e.g. actin, or protein entries from M. tardigradum which are already available in the NCBInr database, e.g. elongation factor 1-alpha. A further 73 unique proteins could be identified by searching the tardigrade protein database and another 15 unique proteins were present in both databases. Identical proteins that were identified from several spots were included only once in the statistics to avoid bias. Thus, the combination of the two databases was sufficient for the identification of 144 unique proteins. The corresponding protein spots are indicated by green circles in the 2D reference map shown in Figure 5. Table 1 shows an overview of identified proteins with annotation in different functional groups. In addition, detailed information about each of the identified 144 proteins including spot number, protein annotation, accession number (NCBInr and tardigrade-specific accession number), total protein score, number of matched peptides, peptide sequence and sequence coverage is listed in Table 2. The individual ion score is included in brackets at the end of every peptide sequence. The following ion scores indicate a significant hit (p<0.05): >53 for NCBInr searches, >14 for searches in the tardigrade protein database and >27 for searches in the clustered EST database. Identical proteins identified in different spots are listed only once in Table 2. In these cases the spot with the highest protein score (in bold) is ranked at the top whereas the other spots are listed below. All further information such as accession numbers, peptide sequences and sequence coverage refers to the top-ranked spot.
Figure 4. Total protein extracts were separated by two-dimensional gel electrophoresis. After silver staining protein spots were picked and in-gel digested with trypsin. MS/MS data obtained by LC-ESI-MS/MS analysis were searched against the NCBInr database, the clustered tardigrade EST database and the tardigrade protein database. Identified proteins with annotation were classified in different functional groups using the Blast2GO program. Identified proteins without annotation were analysed with the DomainSweep program to annotate protein domains. doi:10.1371/journal.pone.0009502.g004
The 15 proteins which were identified in both databases are indicated with an asterisk (e.g. spot A30*) and both accession numbers are listed. In these cases the listed peptide sequences belong to the hit with the highest score. Protein spots below the bold one are marked with u when only found in the NCBInr database, or with ^ when only found in the tardigrade protein database.
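The database-specific significance thresholds quoted for Table 2 can be expressed as a simple lookup. The following is a minimal sketch only: the dictionary keys, the function name and the example scores are hypothetical and are not taken from the search software used in the study.

```python
# Ion score thresholds (p < 0.05) as quoted in the text; a hit is significant
# when its ion score exceeds the value for the database that was searched.
# The key names are illustrative labels, not identifiers from the study.
ION_SCORE_THRESHOLDS = {
    "NCBInr": 53,
    "tardigrade_protein": 14,
    "tardigrade_EST": 27,
}


def is_significant(ion_score: float, database: str) -> bool:
    """Return True if the ion score exceeds the threshold of the given database."""
    return ion_score > ION_SCORE_THRESHOLDS[database]


# Made-up scores for illustration: the same score can be significant in the
# small tardigrade protein database but not in the much larger NCBInr database.
for database, score in [("NCBInr", 40.0), ("tardigrade_protein", 40.0), ("tardigrade_EST", 27.0)]:
    print(database, score, is_significant(score, database))
```

The differing thresholds reflect the size of the search space: the larger the database, the higher the score a random match can reach by chance.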
Furthermore we were able to identify an additional 150 protein spots by searching MS/MS data in the clustered EST database of M. tardigradum. These 150 proteins correspond to 36 unique contigs and ESTs. The protein information is listed in Table 3 and the protein spots are indicated by blue circles in the 2D reference map (Figure 5). Unfortunately, it was not possible to annotate them when performing a BLAST search. For these proteins of unknown function more information could be obtained by applying protein domain annotation methods. We ran all proteins through the DomainSweep pipeline, which identifies the domain architecture within a protein sequence and therefore aids in finding correct functional assignments for uncharacterized protein sequences. It employs different database search methods to scan a number of protein/domain family databases. 2 out of the 36 unique proteins gave a significant hit, whereas 28 proteins were listed as putative and 6 proteins gave no hit at all.
Figure 5. Image of a preparative 2D-gel with selected analysed protein spots. Total protein extract of 400 tardigrades in the active state corresponding to 330 µg was separated by high resolution two-dimensional gel electrophoresis. Proteins were visualised by silver staining. Three different categories are shown: Identified proteins with functional annotation are indicated in green, identified proteins without annotation are indicated in blue and not yet identified proteins are indicated in red. doi:10.1371/journal.pone.0009502.g005
In addition, we analyzed a further 185 protein spots, which are indicated in red in Figure 5. Despite high quality MS/MS spectra, it was not possible to identify these protein spots in any of the databases used in our study.
In summary, we identified 421 (69.5%) out of 606 protein spots which were picked from the preparative 2D gel. 271 spots yielded 144 unique proteins with distinct functions whereas 150 spots were identified as proteins with yet unknown functions.
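The counts reported in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Spot-level counts from the text.
spots_picked = 606
spots_annotated = 271      # identified via NCBInr and/or tardigrade protein database
spots_unannotated = 150    # identified only in the clustered EST database
spots_unidentified = 185   # no significant hit in any database

# Unique annotated proteins, split by the database in which they were found.
unique_ncbinr_only = 56
unique_tardigrade_only = 73
unique_in_both = 15

spots_identified = spots_annotated + spots_unannotated
unique_annotated = unique_ncbinr_only + unique_tardigrade_only + unique_in_both
percent_identified = round(100 * spots_identified / spots_picked, 1)

print(spots_identified)      # 421
print(unique_annotated)      # 144
print(percent_identified)    # 69.5
print(spots_identified + spots_unidentified == spots_picked)  # True
```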

Functional Assignment of Proteins
The 144 unique proteins with annotation were further analysed using the Blast2GO program, which provides analysis of sequences and annotation of each protein with a GO number to categorize the proteins by molecular function, biological process and cellular component. By analysing the proteins on GO level 2 in the category molecular function we obtained a total of 9 subgroups as shown in Figure 7, upper middle chart. The majority of the identified proteins exhibit either binding (45%) or catalytic activity (33%). A more detailed analysis (GO level 3) revealed that 39% of the proteins with catalytic activity are involved in hydrolase activity (Figure 7, upper right chart) and 38% of binding proteins bind to other proteins (Figure 7, upper left chart).
Identified proteins are involved in diverse biological processes. A total of 16 subgroups of biological processes are represented (Figure 7, lower middle chart). 23% are involved in cellular processes and 18% in metabolic processes. Within the cellular processes the largest share (20%) of tardigrade proteins is involved in cellular component organization and biogenesis. Within the metabolic processes 28% of proteins are involved in cellular metabolic processes, 26% in primary metabolic processes and 21% in macromolecule metabolic processes (Figure 7, lower right chart). A detailed GO description of all identified and annotated tardigrade proteins is included in Table S1.
Several protein spots have been identified as cytoskeletal proteins, including actin, the most abundant protein spot (E48) on the 2D gel, and tubulin. Actin and tubulin are highly conserved proteins and were used to control for proteolytic degradation during our workup procedure by Western blotting. Four different actin proteins were found by MS/MS analysis, which play important roles in muscle contraction, cell motility, cytoskeletal structure and cell division. Tubulin is a key component of the cytoskeletal microtubules. Both alpha- and beta-tubulin could be identified on the 2D gel in spots D107, D110 and F6. Further proteins involved in motor activity and muscle contraction were found, namely tropomyosin (e.g. spot F35), myosin (e.g. spot F81), annexin A6 (e.g. spot D90) and myophilin (e.g. spot A128), which is a smooth-muscle protein and was described in the tapeworm Echinococcus granulosus [20].
In addition, several proteins have been identified which are known to have important roles in embryonic or larval development. Mitochondrial malate dehydrogenase precursor (e.g. spot B109), vitellogenin 1 and 2 (e.g. spot D62 and B88), GDP-mannose dehydratase (spot C87), protein disulfide isomerase 2 (e.g. spot F3), hsp-3 (spot F21), hsp-1 (spot F27), tropomyosin (spot F35) and troponin C (spot F87) belong to this group of proteins. Vitellogenin, a major lipoprotein in many oviparous animals, is known as the precursor of the major yolk protein vitellin [21]. Vitellogenin is a phospholipo-glycoprotein which functions as a nutritional source for the development of embryos [22]. During oocyte development vitellogenin and vitellin are modified through cleavage and by different posttranslational modifications (PTMs) like glycosylation, lipidation and phosphorylation. Interestingly, we could identify vitellogenin in several spots on the 2D gel showing vertical (pI) shifts most probably caused by PTMs.
Peroxiredoxins identified first in yeast [23] are conserved, abundant, thioredoxin peroxidase enzymes containing one or two conserved cysteine residues that protect lipids, enzymes, and DNA against reactive oxygen species. Different isoforms of peroxiredoxins could be identified on the 2D gel: peroxiredoxin-4 (spot C132), peroxiredoxin-5 (spot B183) and peroxiredoxin-6 (spot D159). An important aspect of desiccation tolerance is protection against free radicals [24,25]. Notably, the expression of 1-cysteine (1-Cys) peroxiredoxin family of antioxidants is reported in Arabidopsis thaliana and is shown to be related to dormancy [26]. Our results show the presence of important antioxidant systems, including superoxide dismutase (SOD) and peroxidases. Additionally different forms of glutathione S-transferases (spot A122, B153, B166, B169, D166, and D159) could be identified. Glutathione transferases (GSTs) constitute a superfamily of detoxifying enzymes involved in phase II metabolism. Detoxification occurs by either glutathione conjugation, peroxidase activity or passive binding [27]. Furthermore GSTs have cellular physiology roles such as regulators of cellular pathways of stress response and housekeeping roles in the binding and transport of specific ligands [28]. The consequence of this diversity in role is the expression of multiple forms of GST in an organism. It has been shown that the expression of the different isoenzymes is highly tissue-specific [29], and this heterogeneity of GSTs may be further complicated by posttranslational modifications such as glycosylation [30].
Some protein spots were identified as calreticulin (e.g. spot F14), which is a Ca2+-binding protein and molecular chaperone. Calreticulin is also involved in the folding of synthesized proteins and glycoproteins [31]. Three different cathepsin proteins could be identified: cathepsin K (spot A84), cathepsin Z (spot E80) and cathepsin L1 (spot F81). Cathepsin L is a ubiquitous cysteine protease in eukaryotes and has been reported as an essential protein for development in Xenopus laevis [32], Caenorhabditis elegans [33] and Artemia franciscana [34].
Several protein spots are associated with ATP generation and consumption and may have important roles in the early development as described for Artemia, because many important metabolic processes require ATP [35,36]. ATP synthase (spot B152) regenerates ATP from ADP and Pi [37]. It consists of two parts: a hydrophobic membrane-bound part (CF0) and a soluble part (CF1) which consists of five different subunits, alpha, beta (spot E89), gamma, delta (spot C139) and epsilon. Arginine kinase (spot B167) is an ATP/guanidine phosphotransferase that provides ATP by catalyzing the conversion of ADP and phosphorylarginine to ATP and arginine [38]. The presence of arginine kinase has been shown in tissues with high energy demand [39].
Interestingly, we could identify the translationally controlled tumor protein (TCTP) (spot F75) on the 2D gel. TCTP is an important component of TOR (target of rapamycin) signalling pathway, which is the major regulator of cell growth in animals and fungi [40].

Evaluation of Heat Shock Proteins by Western Blot Analysis
To evaluate the highly conserved heat shock proteins 60 and 70, we performed Western blot analyses with antisera directed against these proteins. Hsp70 was found in several spots on the reference 2D proteome map, e.g. in spots B172, C31, C133 and F27. None of these spots fits the calculated molecular weight of approx. 70 kDa well; most of them were considerably smaller. In contrast, the immunoblot shows the strongest band at the expected position, which is in agreement with the position of hsp70 in the control lysate of HeLa cells (Figure 8B). However, several additional bands can be observed at higher as well as at lower molecular weights. The lower bands might account for the identified spots on the 2D gel with lower molecular weight. The full-length protein might have escaped the spot picking procedure since only a limited number of detected spots were further processed.
Hsp60 was identified in spot F57 of the 2D map as described above. Since hsp60 was identified by only one peptide hit we confirmed this result by immunostaining using an antibody directed against a peptide in the C-terminal region of the entire protein.
Only one band is visible on the Western blot at approx. 24 kDa whereas the protein band in the HeLa control lysate is located at its expected position ( Figure 8A). The lower molecular weight is in accordance with the location of hsp60 (spot F57) on the 2D gel. Thus, in M. tardigradum hsp60 exists in a significantly shorter form. Whether the observed difference in the molecular weight indicates a different function and role of this protein in M. tardigradum needs to be investigated in future experiments. To test whether other tardigrade species show similar results we performed an immunoblot with protein lysates from 5 other species namely Paramacrobiotus richtersi, Paramacrobiotus ''richtersi group'' 3, Macrobiotus tonollii, Paramacrobiotus ''richtersi group'' 2 and Paramacrobiotus ''richtersi group'' 1. Total protein lysate from HeLa cells was loaded as control ( Figure 9A, lane 1). Actin served as loading control for all lysates ( Figure 9B). Interestingly, some species also exhibit truncated forms of hsp60 on the Western blot whereas others show higher forms. The molecular weights are ranging from approx. 75 kDa for P.  (Figure 9, lane 3 and 7).

Establishing a Comprehensive Proteome Map of Milnesium tardigradum
The analysis of the proteome of M. tardigradum represents to our knowledge the first detailed study of tardigrades on the protein level. Our experimental strategy aimed to identify as many as possible proteins from tardigrades. Thus, we have not employed any subcellular fractionation steps to obtain specific subproteomes. We have tested various protocols for protein extraction from whole tardigrades. We could show that direct homogenisation of tardigrades in lysis buffer without any previous precipitation steps is most efficient and enables the generation of high quality 2D gels.
Since nothing was known about the proteolytic activity in M. tardigradum, special precautions were taken to avoid any protein degradation or proteolysis throughout the whole workup procedure. Integrity of proteins was carefully inspected by Western blot analysis of the two housekeeping proteins actin and tubulin, where the sequence homology was assumed to be high enough to detect the proteins with commercially available antibodies. The development of a robust workup protocol laid the basis for the generation of a protein map from whole tardigrades in the active state. 56 unique proteins could be identified by searching high quality MS/MS spectra against the publicly available NCBInr database. However, for many proteins we could not find any homologues in the NCBInr database, and only by using our own newly generated tardigrade protein database was it possible to identify another 73 unique proteins. 15 proteins were present in both databases. In addition, 36 unique proteins were found in the clustered tardigrade EST database which could not be annotated by BLAST search. These are new proteins specific to M. tardigradum.

Performance of Database Searches
When we started our study of the tardigrade proteome very little was known about tardigrades at the genome and gene expression level. To this day, only 12 proteins originating from M. tardigradum are recorded in the NCBInr database. For all of them only partial sequences are available, ranging from as few as 43 amino acids for beta actin up to 703 amino acids for elongation factor-2. Therefore, in parallel to our proteomic study an M. tardigradum EST sequencing project has been initiated. Subsequently, two tardigrade-specific databases have been established: a clustered tardigrade EST database and a tardigrade protein database, which was extracted from the clustered EST database and thus represents a subdatabase containing all tardigrade-specific proteins with annotated function. However, since cDNA sequencing is still ongoing, sequence information remains incomplete. We assume that the tardigrade database currently covers approximately one tenth of the tardigrade-specific genes, comparing the unique clusters found in tardigrades to all known proteins of Caenorhabditis elegans or Drosophila melanogaster in Ensembl. This fact greatly influences our database searches. For most of the protein spots that were analysed by ESI-MS/MS, high quality fragmentation spectra were obtained from MS/MS experiments. However, when we searched these MS/MS data against the tardigrade databases and the publicly available NCBInr database, only about 70% of the spots yielded a protein identification whereas the remaining spots gave no significant protein hit. In addition, it was impossible to manually extract peptide sequences that were sufficient in length to perform BLAST searches with satisfactory results.
When we examined the protein hits obtained from the three databases in more detail we found that in the NCBInr database approximately one half of the proteins were identified by only one significant peptide hit (Figure 10). For about 25% of the proteins more than one significant peptide hit was obtained. For the remaining 25% only the protein score, which is the sum of two or more individual peptide scores, was above the significance threshold while none of the peptide scores alone reached this value. In contrast, proteins found in the tardigrade protein database were predominantly identified by more than one significant peptide hit whereas a smaller number was represented by only one peptide. In no case was a protein identified by the sum of non-significant peptide matches. For proteins without annotation the number of proteins identified by only one peptide was only slightly higher than the number of proteins identified by two or more peptides.
These results are not surprising. Since the NCBInr database contains very few sequences originating from M. tardigradum, e.g. elongation factor 1-alpha, the identification relies predominantly on high homologies between tardigrade sequences and sequences from other more or less related species of other taxa. The chances of detecting more than one identical peptide are significantly higher when searching MS/MS data against the tardigrade EST and tardigrade protein databases since these databases contain only tardigrade-specific sequences.
Overall, one might evoke a potentially high false positive rate especially since proteins are included in the reference map which are either identified by only one significant peptide hit or where two or more non-significant peptide scores are summed up to a significant protein score. On the other hand, proteins like LEA and heat shock protein 60 are identified by only one peptide match. Nevertheless they could be confirmed by Western blot analysis to be present in the tardigrade protein extract. Given the incomplete sequence data available to date many proteins might escape confirmation by orthogonal methods e.g. due to the lack of specific antibodies.
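The three kinds of identification distinguished in this section (several significant peptides, a single significant peptide, or a significant protein score built only from non-significant peptides) can be illustrated with a short sketch. It assumes, following the description above, that the protein score is the plain sum of the individual peptide scores; this is a simplification, and the function name and example scores are hypothetical.

```python
from typing import List


def classify_identification(peptide_scores: List[float], threshold: float) -> str:
    """Classify a protein hit by how its significance is supported.

    A peptide is significant when its score exceeds `threshold`. The protein
    score is taken here as the simple sum of peptide scores (an assumption
    made for illustration).
    """
    significant = [score for score in peptide_scores if score > threshold]
    if len(significant) >= 2:
        return "multiple significant peptides"
    if len(significant) == 1:
        return "single significant peptide"
    if len(peptide_scores) >= 2 and sum(peptide_scores) > threshold:
        return "summed score only"
    return "not identified"


# Made-up peptide scores, evaluated against the NCBInr threshold of 53.
print(classify_identification([60.0, 58.0], 53))  # multiple significant peptides
print(classify_identification([60.0, 20.0], 53))  # single significant peptide
print(classify_identification([30.0, 30.0], 53))  # summed score only
print(classify_identification([20.0], 53))        # not identified
```

Hits in the last two supported categories are the ones for which confirmation by an orthogonal method such as Western blotting is most valuable.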

Proteins Associated with Anhydrobiosis
Among the numerous proteins which were identified in our proteomic study some proteins have already been reported to play an important role in anhydrobiotic organisms. Most importantly, spot F88 was identified as a protein belonging to the LEA (late embryogenesis abundant) family (group 3). This result was already known from Western blot analyses (Schill et al., 2005, poster presentation, ISEPEP, Denmark). At least six different groups of LEA proteins have been described so far. Group 1, 2 and 3 are the three major groups. Whereas group 1 is only found in plants and group 2 predominantly in plants, group 3 is reported in organisms other than plants. Although the precise role of LEA proteins has not yet been fully elucidated, different research groups have reported on their association with tolerance to water stress by desiccation [41,42]. LEA protein of group 3 could be already identified in nematodes C. elegans, Steinernema feltiae and Aphelenchus avenae, and the prokaryotes Deinococcus radiodurans, Bacillus subtilis and Haemophilus influenzae [43,44,45].

Proteins Exhibiting an Unusual Location on the 2D Map
In general we identified some proteins which show a lower molecular weight than expected. As described above hsp60 is detected as a protein band at 24 kDa by Western blotting and the location of the corresponding spot on the 2D gel shows the same molecular weight. Comparison of different tardigrade species indicates the existence of short as well as long forms of hsp60. Unique proteins, when analyzed on the 2D gel, often show multiple spots due to posttranslational modifications. Proteins of the vitellogenin family are widely distributed on the 2D gel and show pI as well as molecular weight shifts, which are due to modification through cleavage and to different PTMs like glycosylation and phosphorylation during development of oocytes. Ongoing experiments to detect PTMs using different fluorescence staining methods like ProQ-Emerald for the detection of glycoproteins and ProQ-Diamond for the detection of phosphoproteins indicate that these modifications indeed occur in tardigrades (data not shown).
Figure 10. Statistical analysis of significant peptides found in the three different databases which were used to search the MS/MS data. The number of significant peptide hits is compared between the different databases. When searching against the NCBInr database most proteins were identified with only one significant peptide hit. In contrast when using the tardigrade protein database most proteins were represented by two or more significant peptides. doi:10.1371/journal.pone.0009502.g010

Prediction of Functional Domains in Proteins with Yet Unknown Functions
36 proteins which could not be identified by BLAST searches were further examined for matching functional protein domains using DomainSweep. The function of the following two spots could be revealed with high confidence (Table 3). Spot F63 seems to belong to the "tumor protein D52" InterPro family (IPR007327). The hD52 gene was originally identified through its elevated expression level in human breast carcinoma, but cloning of D52 homologues from other species has indicated that D52 may play roles in calcium-mediated signal transduction and cell proliferation. Regarding the taxonomic neighbours of the tardigrades, one member in C. elegans and 10 members in Drosophila melanogaster are reported by InterPro for this family. Spot C95 seems to belong to the family "glucose/ribitol dehydrogenase" (IPR002347), for which 80 members each in C. elegans and in Drosophila melanogaster are reported. A further 28 putative hits were found associated with other spots. These hits are only putative candidates and therefore less reliable. A comprehensive protein database of M. tardigradum, resulting from our ongoing cDNA sequencing, will help us to evaluate these candidates.

Conclusion
In this study we present for the first time a comprehensive proteome map of M. tardigradum. A full description of proteins present in the active state provides a valuable basis for future studies. Most importantly, the protein reference map allows us to undertake quantitative proteomics analysis to detect proteins with different expression levels in the active versus the anhydrobiotic state. In particular, our workflow is fully compatible with the application of 2D difference gel electrophoresis (2D DIGE), which is one technique allowing sensitive analysis of differences in protein expression levels. This differential analysis on the protein level will help us to understand survival mechanisms in anhydrobiotic organisms and eventually to develop new methods for the preservation of biological materials.

Tardigrade Culture and Sampling
Tardigrades of the species M. tardigradum Doyère 1840 were maintained in a laboratory culture. The culture was grown on agarose plates (3%) (peqGOLD Universal Agarose, peqLAB, Erlangen, Germany) covered with Volvic™ water (Danone Waters, Wiesbaden, Germany) at 20 °C. The juveniles were fed on the green alga Chlorogonium elongatum, the adults on the bdelloid rotifer Philodina citrina. All specimens used for the experiments were middle-aged, so that effects of age can be excluded. Tardigrades were starved for 3 days and washed several times with Volvic™ water to avoid contamination with food organisms. Subsequently the animals were transferred to microliter tubes (200 individuals per tube) and the surrounding water was reduced to approx. 1-2 µl.

Sample Preparation for Gel Electrophoresis
To optimize the sample preparation, different precipitation methods were tested. Chloroform/methanol and TCA/acetone precipitations were performed as described by Wessel and Fluegge [46] and Görg [47], respectively. We also used a commercially available precipitation kit (clean-up kit from GE Healthcare). After comparing the results of the different precipitation protocols on a 1D gel, we decided to homogenise the tardigrades directly in ice-cold lysis buffer and to avoid any precipitation steps. The animals (200 individuals) were homogenised directly in 60 µl lysis buffer (containing 8 M urea, 4% CHAPS, 30 mM Tris, pH 8.5) by ultrasonication (SONOPULS HD3100, Bandelin Electronic) with 45% amplitude intensity and 1-0.5 sec intervals. The lysis buffer contained a Protease Inhibitor Mix (GE Healthcare) to inhibit serine, cysteine and calpain proteases. After homogenisation the samples were stored at -80 °C. For gel electrophoresis, insoluble particles were removed by centrifugation for 2 min at 14,000 g and the supernatant was quantified using a BCA mini-assay.

One Dimensional Gel Electrophoresis and Western Blotting
To compare the efficiency of the different sample preparation methods we separated approx. 10 µg total protein extract on a 1D gel. The gel was stained with protein staining solution (PageBlue from Fermentas). For Western blotting a total protein extract of tardigrades (15-20 µg) was separated on a NuPAGE™ 4-12% Bis-Tris mini gel (Invitrogen) using MES running buffer. 200 V were applied until the bromophenol blue front had reached the bottom of the gel (approx. 40 min). Separated proteins were electrotransferred onto a PVDF membrane for 1.5 h at maximum 50 mA (0.8/cm²) in a semi-dry transfer unit (Hoefer™ TE 77) using the following transfer solution: 24 mM Tris, 192 mM glycine and 10% methanol. The PVDF membrane was incubated in a blocking buffer containing 5% non-fat milk and 0.1% Tween20 in PBS. As primary antibodies we used anti-actin pan Ab-5 (dianova), anti-hsp60 Ab (D307) (Cell Signaling), anti-hsp70 Ab (BD Biosciences Pharmingen) and anti-α-tubulin Ab (Sigma).
For molecular weight determination of the target proteins on film we used ECL DualVue marker (GE-Healthcare). Immunoreaction was detected using the ECL Western Blotting Detection kit from GE Healthcare. Images were acquired using an Image Scanner Model UTA-1100 (Amersham Biosciences).

Two Dimensional Gel Electrophoresis
For 2D gel preparation we added 60 µl 2× sample buffer (7 M urea, 2 M thiourea, 2% CHAPS, 2% DTT, 2% IPG-buffer 3-11 NL) to each aliquot and incubated by shaking for 30 min at 25 °C. To avoid streaking on the gels we used 330 µl destreaking buffer (GE Healthcare) instead of rehydration buffer, to which we added 2% IPG-buffer (pI 3-11). Samples were incubated by shaking for 30 min at 25 °C. We loaded 100 µg protein on analytical gels and 330 µg on preparative gels.
Strip loading. Loading of proteins was performed overnight during strip rehydration with the recommended volume (450 µl for 24 cm strips).
IEF conditions. First-dimension isoelectric focusing (IEF) was performed on an Ettan IPGphor instrument using 24 cm IPG strips with non-linear gradients from pH 3-11. Focusing proceeded for 46.4 kVh with the following running protocol: 3 h at 300 V, 6 h at 500 V, 8-h gradient up to 1000 V, 3-h gradient up to 8000 V and 3 h at 8000 V. Strips were either used immediately for the second dimension or stored at -80 °C.
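As a rough cross-check of the running protocol, the accumulated volt-hours can be estimated by treating each hold as constant voltage and each gradient as a linear ramp starting from the end voltage of the preceding step. This is only a sketch: the linear ramp shape and the starting voltages of the gradients are assumptions, and the instrument integrates the voltage actually applied, which may stay below the programmed value when the current limit is reached.

```python
# Each step: (duration in hours, end voltage in V, kind).
# "hold" = constant voltage; "gradient" = linear ramp from the previous
# step's end voltage (an assumption; the ramp shape is not stated in the text).
PROTOCOL = [
    (3, 300, "hold"),
    (6, 500, "hold"),
    (8, 1000, "gradient"),
    (3, 8000, "gradient"),
    (3, 8000, "hold"),
]


def volt_hours(protocol):
    """Sum the nominal volt-hours of a stepwise IEF programme."""
    total = 0.0
    previous_v = 0.0
    for hours, end_v, kind in protocol:
        if kind == "gradient":
            # Area under a linear ramp = duration * mean voltage.
            total += hours * (previous_v + end_v) / 2.0
        else:
            total += hours * end_v
        previous_v = end_v
    return total


nominal_kvh = volt_hours(PROTOCOL) / 1000.0
print(f"nominal programme: {nominal_kvh:.1f} kVh")
```

Under these assumptions the nominal programme sums to about 47.4 kVh, close to the 46.4 kVh reported; a small shortfall of this kind could arise if the programmed voltage is not reached immediately in every step.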
Second dimension. Strips were equilibrated in 6 M urea, 2% SDS, 30% glycerol, 0.375 M Tris-HCl pH 8.8, 0.002% bromophenol blue and 10 mg/ml DTT for 15 min, followed by a second equilibration step with the same buffer containing 25 mg/ml iodoacetamide instead of DTT, also for 15 min.
Strips were loaded on 12% SDS-gels with an overlay of agarose solution (0.5 mg/100 ml electrophoresis buffer). The second dimension was performed using an Ettan DALTtwelve electrophoresis system (GE Healthcare). Separation was carried out at 1.5 W per 1.5 mm thick gel until the bromophenol blue reached the bottom of the gel (approx. 18 h).
Silver staining of proteins and image analysis. Proteins on analytical gels were visualized by destructive silver staining according to Blum [48]. Additionally, we performed a silver stain compatible with mass spectrometric analysis described by Sinha [49] for preparative gels. Images were acquired using an Image Scanner Model UTA-1100 (Amersham Biosciences).

Protein Identification
In-gel digestion. Protein spots were excised semi-manually with a spot picker (GelPal, Genetix) following non-destructive silver staining and stored at -80 °C after removing water. Gel pieces were reduced, alkylated and in-gel digested with trypsin. Briefly, after incubation with 150 µl water at 42 °C for 8 min, water was removed (washing step) and gel pieces were shrunk by dehydration with 150 µl 40 mM NH4HCO3/ethanol 50:50 (v/v) at 42 °C for 5 min in a thermomixer (600 rpm). The solution was removed and the proteins were reduced with 50 µl 10 mM dithiothreitol in 40 mM NH4HCO3 for 1 h at 56 °C. The solution was removed and gel pieces were incubated with 150 µl 40 mM NH4HCO3 for 5 min at 42 °C. After removing the solution gel pieces were alkylated with 100 µl 55 mM iodoacetamide in 40 mM NH4HCO3 for 30 min at 25 °C in the dark, followed by three alternating washing steps each with 150 µl of 40 mM NH4HCO3 and ethanol for 5 min at 37 °C. Gel pieces were then dehydrated with 100 µl neat acetonitrile for 1 min at room temperature, dried for 15 min and subsequently rehydrated with porcine trypsin (sequencing grade, Promega, Mannheim, Germany) with the minimal volume sufficient to cover the gel pieces after rehydration (100 ng trypsin in 40 mM NH4HCO3). Samples were incubated overnight at 37 °C.
Extraction. After overnight digestion the supernatant was collected in PCR tubes while gel pieces were subjected to four further extraction steps. Gel pieces were sonicated for 5 min in acetonitrile/0.1% TFA 50:50 (v/v). After centrifugation the supernatant was collected and gel pieces were sonicated for 5 min in acetonitrile. After collecting the supernatant gel pieces were sonicated for 5 min in 0.1% TFA followed by an extraction step again with acetonitrile. The combined solutions were dried in a speed-vac at 37 °C for 2 h. Peptides were redissolved in 6 µl 0.1% TFA by sonication for 5 min and applied for ESI-MS/MS analysis.
ESI-MS/MS analysis and database search. NanoLC-ESI-MS/MS was performed on a Qtof Ultima mass spectrometer (Waters) coupled on-line to a nanoLC system (CapLC, Waters). For each measurement 5 µl of the digested sample was injected. Peptides were trapped on a Trapping guard C18-AQ, 10 mm × 0.3 mm, particle size 5 µm (Dr. Maisch). The liquid chromatography separation was performed at a flow rate of 200 nl/min on a Reprosil C18-AQ column, 150 mm × 75 µm, particle size 3 µm (Dr. Maisch GmbH). The following linear gradient was applied: 5% B for 5 min, from 5 to 15% B in 5 min, from 15 to 40% B in 25 min, from 40 to 60% B in 15 min and finally 60 to 95% B in 5 min. Solvent A contains 94.9% water, 5% acetonitrile and 0.1% formic acid; solvent B contains 95% acetonitrile, 4.9% water and 0.1% formic acid. The LC-ESI-MS/MS device was adjusted with a PicoTip Emitter (New Objective, Woburn, MA) fitted on a Z-spray nanoESI interface (Waters). Spectra were collected in the positive ion mode. The capillary voltage was set to 2400 V and the cone voltage was set to 80 V. Data acquisition was controlled by MassLynx™ 4.0 software (Waters). Low-energy collision-induced dissociation (CID) was performed using argon as a collision gas (pressure in the collision cell was set to 5 × 10⁻⁵ mbar), and the collision energy was in the range of 25-40 eV and optimized for all precursor ions dependent on their charge state and molecular weight. MassLynx raw data files were processed with ProteinLynx Global Server 2.2 software (Waters). Deisotoping was performed using the MaxEnt3 algorithm.
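The gradient described above is piecewise linear, so the solvent composition at any time point can be obtained by interpolating between its breakpoints. The following minimal sketch illustrates this; the function names are hypothetical, and the acetonitrile calculation simply mixes the stated compositions of solvents A and B by volume, ignoring any system delay volume.

```python
# Gradient breakpoints as (time in min, % solvent B), read from the text:
# 5% B for 5 min, 5-15% in 5 min, 15-40% in 25 min, 40-60% in 15 min,
# 60-95% in 5 min.
GRADIENT = [(0, 5), (5, 5), (10, 15), (35, 40), (50, 60), (55, 95)]


def percent_b(t_min):
    """Programmed % B at time t_min, by linear interpolation."""
    if t_min <= GRADIENT[0][0]:
        return float(GRADIENT[0][1])
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    # After the last breakpoint the final composition is held.
    return float(GRADIENT[-1][1])


def percent_acetonitrile(t_min):
    """Nominal % acetonitrile: solvent A holds 5%, solvent B holds 95%."""
    fraction_b = percent_b(t_min) / 100.0
    return 5.0 * (1.0 - fraction_b) + 95.0 * fraction_b


for t in (0, 10, 35, 55):
    print(t, percent_b(t), round(percent_acetonitrile(t), 2))
```

The breakpoints add up to a programmed run of 55 min, starting at a nominal 9.5% acetonitrile.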
The obtained MS/MS spectra were searched against the publicly available NCBInr database using the MASCOT algorithm version 2.0 (Matrix Science, London, UK). The mass tolerance was set to 0.1 Da for fragment ions and 200 ppm for precursor ions. No fragment ions score cutoff was applied. The following search parameters were selected: variable modification due to methionine oxidation, fixed cysteine modification with the carbamidomethyl side chain, one missed cleavage site in the case of incomplete trypsin hydrolysis. The following settings were applied: minimum protein score >53, minimum number of peptides ≥1. Furthermore, protein hits were taken as identified if a minimum of one peptide had an individual ion score exceeding the MASCOT identity threshold. Under the applied search parameters a sum MASCOT score of >53 refers to a match probability of p<0.05, where p is the probability that the observed match is a random event. Redundancy of proteins that appeared in the database under different names and accession numbers was eliminated. Additionally we searched against the M. tardigradum EST and protein database (see below) to identify sequences not present in the NCBInr databases. The following settings were applied: minimum protein score >14 for the EST and >27 for the clustered EST database (p<0.05). Other parameters were as described for the NCBInr searches.
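The acceptance criteria above can be illustrated with a short sketch. It converts the relative precursor tolerance (ppm) into an absolute mass window and applies the two-part rule for accepting a protein hit. All function and parameter names are hypothetical; in particular, the identity threshold is a value reported by MASCOT for each search and is passed in here as a plain number.

```python
def ppm_to_da(mz, ppm):
    """Absolute tolerance (in m/z units) for a relative tolerance in ppm."""
    return mz * ppm / 1e6


def within_precursor_tolerance(observed_mz, theoretical_mz, ppm=200.0):
    """True if the observed precursor lies within the ppm window."""
    return abs(observed_mz - theoretical_mz) <= ppm_to_da(theoretical_mz, ppm)


def accept_protein_hit(protein_score, peptide_ion_scores,
                       identity_threshold, min_protein_score=53):
    """Apply the two criteria described in the text.

    1. The protein score must exceed the minimum protein score
       (53 for NCBInr; 14 and 27 are given for the tardigrade databases).
    2. At least one peptide must have an ion score above the identity
       threshold (supplied by the caller; not computed here).
    """
    has_significant_peptide = any(
        score > identity_threshold for score in peptide_ion_scores
    )
    return protein_score > min_protein_score and has_significant_peptide


# A 200 ppm window at m/z 1000 corresponds to +/- 0.2.
print(ppm_to_da(1000.0, 200.0))
print(accept_protein_hit(60, [20, 45], identity_threshold=40))
```

Because the tolerance is relative, the window widens with increasing m/z, whereas the 0.1 Da fragment tolerance stays constant.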

Generation of the Tardigrade EST Database
cDNA libraries from mRNA from tardigrades in different states (active, inactive, transition states) were prepared and sequenced (Mali et al., submitted data). The obtained EST sequences were cleaned from vector sequences using Seqclean against the UniVec database from NCBI (version 12 September 2008, Kitts et al., unpublished). Repeats within the cleaned ESTs were masked using the online service RepeatMasker (version 3.2.6, RM-20080801, Smit et al., unpublished data), followed by a second Seqclean run to eliminate low-quality and short sequences. The assembly was performed using cap3 [50] with clipping enabled and resulted in 3318 Unigenes (2500 singlets, 818 contigs). Ribosomal sequences were identified using a BlastN search [51] against the Silva-DB (only eukaryotic sequences, Silva95, [52]) with an E-value cutoff of 1e-3, which resulted in 46 sequences showing high similarity to ribosomal sequences. Unigenes coding for known proteins were identified using a BlastX search against Uniprot/Swissprot (version 14.1, September 2008), Uniprot/TrEMBL (version 56.1, September 2008, The UniProt Consortium, 2008) and NRDB (version 12 September 2008) with an E-value cutoff of 1e-3, and a hmmer search against the PFAM database (release 22, [53]) with an E-value cutoff of 0.1. Translation of Unigene sequences which gave a BlastX or PFAM hit (1539/1889 sequences) into the corresponding frame, as well as a six-frame translation, was performed using Virtual Ribosome (version 1.1 Feb-Mar, 2006, [54]). For the six-frame translation the read-through mode of Virtual Ribosome was used. Afterwards stop codons were substituted by an undefined amino acid (X). All new sequences have been deposited in GenBank. The accession numbers are indicated in Tables 2, 3 and S1 in the column "Tardigrade specific Accession no.".
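The six-frame read-through translation with stop codons replaced by X can be sketched as follows. This is a minimal illustration using the standard genetic code, not the Virtual Ribosome implementation; the function names and the frame numbering (+1 to +3 forward, -1 to -3 on the reverse complement) are assumptions made for this example.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate_read_through(seq):
    """Translate all complete codons without stopping at stop codons.

    Stop codons and codons with ambiguous bases are written as X.
    """
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[i:i + 3], "X")
        protein.append("X" if amino_acid == "*" else amino_acid)
    return "".join(protein)


def six_frame_translation(seq):
    """Return {frame: protein} for frames +1, +2, +3, -1, -2, -3."""
    seq = seq.upper()
    reverse = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate_read_through(seq[offset:])
        frames[-(offset + 1)] = translate_read_through(reverse[offset:])
    return frames


for frame, protein in sorted(six_frame_translation("ATGGCCTAAGGT").items()):
    print(frame, protein)
```

Replacing stops by X keeps each frame as one continuous sequence, so a peptide can still be matched even if a sequencing error has introduced a spurious stop codon nearby.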

Classification of Proteins
For functional analysis of identified proteins we used the Blast2GO software, whose workflow consists of three main steps: blast to find homologous sequences, mapping to collect GO terms associated with blast hits, and annotation to assign functional terms to query sequences from the pool of GO terms collected in the mapping step [55]. Function assignment is based on the GO database. Sequence data of identified proteins were uploaded as a multiple FASTA file to the Blast2GO software. We performed the blast step against the public NCBI database through blastp. Other parameters were kept at default values: e-value threshold of 1e-3 and a recovery of 20 hits per sequence. Furthermore, the minimal alignment length (hsp filter) was set to 33 to avoid hits with a matching region smaller than 100 nucleotides. QBlast-NCBI was set as Blast mode. For the annotation step we chose a configuration with an e-value hit filter of 1.0E-6, an annotation cutoff of 55 and a GO weight of 5. For visualizing the functional information (GO categories: Molecular Function and Biological Process) we used the analysis tool of the Blast2GO software.
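The effect of the blast-step settings can be illustrated with a small filter. This sketch is not Blast2GO code; the dictionary keys and the function name are hypothetical, and it covers only the e-value threshold, the hsp length filter and the limit of 20 hits, not the subsequent mapping and annotation scoring.

```python
def filter_blast_hits(hits, max_evalue=1e-3, min_hsp_length=33, max_hits=20):
    """Keep the best hits that pass the e-value and hsp length filters.

    hits: list of dicts with the (hypothetical) keys "id", "evalue" and
    "hsp_length" (alignment length in amino acids).
    """
    kept = [
        hit for hit in hits
        if hit["evalue"] <= max_evalue and hit["hsp_length"] >= min_hsp_length
    ]
    kept.sort(key=lambda hit: hit["evalue"])
    return kept[:max_hits]


# Invented example hits, for illustration only.
example_hits = [
    {"id": "hit_a", "evalue": 1e-30, "hsp_length": 210},
    {"id": "hit_b", "evalue": 1e-8, "hsp_length": 25},   # alignment too short
    {"id": "hit_c", "evalue": 0.5, "hsp_length": 120},   # e-value too high
    {"id": "hit_d", "evalue": 1e-5, "hsp_length": 33},
]
print([hit["id"] for hit in filter_blast_hits(example_hits)])
```

An hsp length of 33 amino acids corresponds to 99 nucleotides, which is why this setting excludes matching regions shorter than roughly 100 nucleotides.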

Protein Domain Analysis of Proteins without Annotation
Six-frame translations of the Unigenes were run through the DomainSweep pipeline [56] and the significant and putative hits were collected. For each of the protein/domain databases used, different thresholds and rules were established [56]. Domain hits are listed as 'significant'
i. if two or more hits belong to the same InterPro [57] family. The task compares all true positive hits of the different protein family databases, grouping together those hits which are members of the same InterPro family/domain.
ii. if the motif shows the same order as described in PRINTS [58] or BLOCKS [59]. Both databases characterize a protein family with a group of highly conserved motifs/segments in a well-defined order. The task compares the order of the identified true positive hits with the order described in the corresponding PRINTS or BLOCKS entry. Only hits in correct order are accepted.
All other hits above the trusted thresholds are listed as 'putative'. By comparing the peptides identified by mass spectrometry with the six-frame translations, the correct frame and the associated domain information were listed.
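Two of the steps described above lend themselves to a short sketch: rule i (grouping hits by InterPro family) and the selection of the reading frame supported by the identified peptides. This is an illustration under simplifying assumptions, not the DomainSweep implementation: rule ii (motif order in PRINTS or BLOCKS) is not covered, the tuple layout and function names are hypothetical, and peptides are matched as exact substrings.

```python
from collections import defaultdict


def classify_by_interpro(hits):
    """Rule i only: hits sharing an InterPro family with another hit are
    'significant'; all remaining hits are 'putative'.

    hits: list of (member_database, signature_id, interpro_id) tuples that
    have already passed the trusted thresholds.
    """
    by_family = defaultdict(list)
    for hit in hits:
        by_family[hit[2]].append(hit)
    result = {"significant": [], "putative": []}
    for family, members in by_family.items():
        supported = family is not None and len(members) >= 2
        result["significant" if supported else "putative"].extend(members)
    return result


def best_frame(frames, peptides):
    """Return the frame whose translation contains the most peptides,
    or None if no peptide matches any frame.

    frames: {frame_number: protein_sequence}
    """
    counts = {
        frame: sum(peptide in protein for peptide in peptides)
        for frame, protein in frames.items()
    }
    frame = max(counts, key=counts.get)
    return frame if counts[frame] > 0 else None


# Invented signature identifiers; the InterPro IDs are those named in the text.
example_hits = [
    ("Pfam", "sig_a", "IPR007327"),
    ("PRINTS", "sig_b", "IPR007327"),
    ("Pfam", "sig_c", "IPR002347"),
]
print(classify_by_interpro(example_hits))
print(best_frame({1: "MAXGKLTR", -2: "PPLLQ"}, ["GKLTR"]))
```

In this example the two hits mapping to IPR007327 support each other and are reported as significant, while the single hit for IPR002347 remains putative unless another rule confirms it.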