Molecular Characterization of Cold Adaptation of Membrane Proteins in the Vibrionaceae Core-Genome

Cold-adaptation strategies have been studied in multiple psychrophilic organisms, especially for psychrophilic enzymes. Decreased enzyme activity caused by low temperatures as well as a higher viscosity of the aqueous environment require certain adaptations to the metabolic machinery of the cell. In addition to this, low temperature has deleterious effects on the lipid bilayer of bacterial membranes and therefore might also affect the embedded membrane proteins. Little is known about the adaptation of membrane proteins to stresses of the cold. In this study we investigate a set of 66 membrane proteins from the core genome of the bacterial family Vibrionaceae to identify general characteristics that discern psychrophilic and mesophilic membrane proteins. Bioinformatical and statistical methods were used to analyze the alignments of the three temperature groups mesophilic, intermediate and psychrophilic. Surprisingly, our results show little or no adaptation to low temperature for those parts of the proteins that are predicted to be inside the membrane. However, changes in amino acid composition and hydrophobicity are found for complete sequences and sequence parts outside the lipid bilayer. Among others, the results presented here indicate a preference for helix-breaking and destabilizing amino acids Ile, Asp and Thr and an avoidance of the helix-forming amino acid Ala in the amino acid composition of psychrophilic membrane proteins. Furthermore, we identified a lower overall hydrophobicity of psychrophilic membrane proteins in comparison to their mesophilic homologs. These results support the stability-flexibility hypothesis and link the cold-adaptation strategies of membrane proteins to those of loop regions of psychrophilic enzymes.


Introduction
Cold-adapted bacteria colonize habitats that are hostile to most organisms, e.g. the Arctic Ocean or the deep sea, with temperature minima close to or even below the freezing point of water. To maintain growth and survival at these low temperatures, cold-adapted (psychrophilic) bacteria require an array of specific adaptations in their cellular components, protein synthesis machinery and enzymes [1,2]. The main challenges psychrophiles have to overcome are a decrease in enzyme activity and an increased viscosity of the aqueous environment due to the low temperature. In terms of enzyme activity, significant differences in amino acid composition and three-dimensional structure of cold-adapted enzymes have been reported in comparison to their mesophilic counterparts [3][4][5][6]. In general, the results show that psychrophilic enzymes tend to be more flexible and less stable in order to maintain their catalytic activity. However, studies on psychrophilic enzymes produced differing results, indicating that psychrophilic organisms developed more than one strategy to adapt to the cold [7,8].
Another crucial aspect in cold-adaptation is the deleterious effect of low temperature on the lipid membrane of bacteria. A lipid bilayer, with a strongly hydrophobic interior, is the structural base for all biological membranes and the surrounding environment for most membrane proteins. As temperature decreases, lipids lose their fluidity and ultimately pass through a transition to form a gel phase, in which the molecules are packed more tightly and motion is highly reduced. This eventually leads to a loss of function of the membrane itself as well as of the proteins that are embedded in or interact with the lipid bilayer. In order to maintain membrane fluidity, psychrophilic bacteria show an altered lipid composition with an increased ratio of unsaturated fatty acyl residues, cis double bonds, chain shortening, and methyl branching. These changes are mediated through modification of pre-existing lipids by cold-shock-activated enzymes and by de novo synthesis of specific enzymes [9,10]. However, much less is known about the corresponding changes in membrane proteins in response to low temperature. The present study was designed to address this question.
Representatives of the family Vibrionaceae of gram-negative γ-proteobacteria are among the bacteria most commonly reported in extreme environments [7]. Currently, Vibrionaceae includes 130 highly diverse species divided into seven genera, including Vibrio, Aliivibrio and Photobacterium [11]. Representatives of this family populate almost all aquatic habitats, fresh water as well as sea or brackish waters, and the family comprises psychrophilic, intermediate and mesophilic organisms. In the presented study we investigate membrane proteins present in the core-genome of 64 completely sequenced Vibrionaceae genomes, including sequences from three genera and 20 different species. The compilation of a dataset that comprises representatives of species from different genera is not free of criticism. Some studies limit the investigated sequences to proteins from closely related species to minimize phylogenetic effects in amino acid substitution [12]. However, recent studies indicate that slow neutral substitution processes that are based on phylogenetic distance have little influence on the results of adaptation studies [13]. Additionally, a dataset that includes only genomes from relatively closely related organisms may not be able to identify general features of protein adaptation to low temperature. Thus, a set of genomes that belong to a relatively wide range of phyla can be more suitable for the investigation of general adaptation strategies. In addition to this, other factors are also reported to influence the outcome of these studies, such as varying GC-content or the comparison of bacteria from environmental niches that differ in more than one characteristic, e.g., optimal growth temperature and hydrostatic pressure [14,15].
In the presented study, we tried to overcome these problems by (i) compiling a diverse dataset from the bacterial family Vibrionaceae that enables the identification of general adaptation strategies and (ii) exclusively comparing transmembrane proteins that are part of the core-genome of our dataset, i.e., that are conserved in all investigated genomes.

Classification of Psychrophilic, Intermediate and Mesophilic Organisms
The genomes in our dataset were divided into three distinct temperature groups: psychrophilic, intermediate and mesophilic. Table 1 summarizes the distribution of genomes from the different temperature groups in our dataset. In the presented study psychrophiles are defined as bacteria that are capable of growing at 4°C but not at 30°C. Furthermore, mesophilic organisms are defined as capable of growing at temperatures above 35°C. Additionally, an intermediate group was defined that includes bacteria that either (i) have a maximal growth temperature of 30°C but do not grow at 4°C or (ii) grow at 4°C but not above 35°C. Exact growth temperature ranges of 57 isolates were obtained from the literature [16][17][18] (see File S1 for details). Unfortunately, the exact growth temperature of the six remaining isolates could not be determined. However, five of these isolates were classified according to the general temperature groups, i.e., mesophilic or psychrophilic, as stated at the GOLD database [19]. Finally, based on the phylogenetic results from a recent study, the remaining Vibrio sp. MED222 was considered a representative of species Vibrio splendidus and was therefore classified accordingly [20]. In total the dataset used in the presented study includes four psychrophilic bacteria, six isolates assigned to the intermediate temperature group and 54 mesophilic bacteria.
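The grouping rules above can be written down as a small decision function. This is a minimal sketch and not the authors' procedure: the function name, the representation of a growth range as a (minimum, maximum) temperature pair, and the order in which the rules are checked are assumptions made for illustration. Growth ranges that the stated rules do not cover fall through to the intermediate group, which is also an assumption.

```python
def classify_temperature_group(t_min: float, t_max: float) -> str:
    """Assign a temperature group from a growth range in degrees Celsius.

    Assumption: an isolate "grows at T" if t_min <= T <= t_max.
    """
    if t_min > t_max:
        raise ValueError("t_min must not exceed t_max")

    grows_at_4 = t_min <= 4.0
    grows_at_30 = t_max >= 30.0
    grows_above_35 = t_max > 35.0

    # Mesophiles: capable of growing above 35 degrees C.
    if grows_above_35:
        return "mesophilic"
    # Psychrophiles: grow at 4 degrees C but not at 30 degrees C.
    if grows_at_4 and not grows_at_30:
        return "psychrophilic"
    # Everything else, including rules (i) and (ii), is treated as intermediate.
    return "intermediate"
```

Checking the mesophilic rule first means that an isolate growing both at 4°C and above 35°C would be called mesophilic; the text does not say how such a case was handled.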

Determination of Membrane Proteins
The set of core-genes of all 64 Vibrionaceae genomes was determined using the software orthoMCL [21] as described in detail in a previous study [20]. We applied the bioinformatic tools SignalP [22,23], HMMTOP [24,25] and TMBETA-NET [26] to all core proteins and identified a set of 66 clusters of homologous membrane proteins present in all 64 investigated genomes. Furthermore, we used the software PSIPRED [27] for a more general secondary structure prediction of all identified membrane proteins. The prediction of specific types of membrane proteins resulted in a dataset of 52 transmembrane protein (TMP) families, 11 membrane protein families with a leading signal sequence, and 3 outer membrane proteins containing β-barrels. The group of β-barrel membrane proteins was too small for a separate statistical analysis, but was included in the general analysis of all proteins. As shown in Table 2, the homologous sequences of each cluster show a high degree of sequence identity within as well as between different temperature groups. This is not surprising as core-genes tend to be highly conserved, most notably for core genes of closely related organisms.

Amino Acid and Length Variations
We calculated the length of transmembrane helices and, for signal peptides, of the signal sequence to investigate whether cold-adaptation affects the length of these sequence parts. Additionally, we investigated all helices as identified by PSIPRED to determine general changes in helix length of membrane proteins from the mesophilic to the psychrophilic temperature group. As shown in Table 3, the length of all investigated sequence parts appeared to be similar for sequences in our dataset. Additionally, we examined the alignment site variability of amino acids within sequence parts located inside the membrane, i.e. transmembrane helices or signal sequences, and outside of the membrane. Figure 1A shows the results for the set of all 52 aligned TMPs. The site variations for the transmembrane helices range from 1 to 9 amino acids, with 55.27% of all sites fully conserved, and 0.0071% of sites varying with 9 amino acids. For TMP sequence parts outside of the membrane the variability ranged from 1 to 13 amino acids, with 43.6% conserved sites, 1.04% with 9 amino acids and 0.0099% with 13 amino acids. Figure 1B shows the results of the same analysis for the 11 signal peptide alignments. For the sequence parts outside the membrane, results similar to those for the TMPs were obtained: variability ranged from 1 to 10, with 45.3% conserved sites, and 0.66% with 10 amino acids. However, amino acid variability for the signal sequences differed from the transmembrane proteins: variability ranged from 1 to 11 amino acids with 30.6% conserved among all alignment sites and 0.75% of the alignment sites varied in 10 and 11 amino acids, respectively. Furthermore, the proportion of alignment sites that varied in 5 up to 8 amino acids was found to be increased in comparison to the transmembrane proteins.
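Site variability as described above is the number of distinct amino acids observed in one alignment column. The following sketch illustrates that count on a list of equally long aligned sequences. The function names are hypothetical, and the choice to ignore the gap character "-" by default is an assumption, since the text does not state how gaps were treated.

```python
from collections import Counter


def site_variability(alignment, include_gaps=False):
    """Return the number of distinct residues in each alignment column."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    counts = []
    for column in zip(*alignment):
        residues = set(column)
        if not include_gaps:
            residues.discard("-")  # assumption: gaps are not counted as a residue
        counts.append(len(residues))
    return counts


def variability_distribution(alignment):
    """Percentage of columns for each observed number of distinct residues."""
    per_site = site_variability(alignment)
    if not per_site:
        return {}
    tally = Counter(per_site)
    return {k: 100.0 * n / len(per_site) for k, n in sorted(tally.items())}


# Toy alignment: columns 1 and 4 are fully conserved.
toy = ["MKLA", "MKIA", "MRVA"]
print(site_variability(toy))          # [1, 2, 3, 1]
print(variability_distribution(toy))  # {1: 50.0, 2: 25.0, 3: 25.0}
```

A value of 1 corresponds to a fully conserved site. A column consisting only of gaps yields 0 under the default setting.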

Changes in Amino Acid Composition
We analyzed the amino acid composition of each cluster of homologous membrane proteins to determine changes in the composition of psychrophilic sequences in comparison to their mesophilic counterparts. A paired Wilcoxon test for all 20 amino acids was performed separately on the group of TMPs and signal peptides as well as on the complete dataset. For each group compositional changes were calculated (i) for the entire sequence, (ii) sequence parts inside membranes and (iii) sequence parts outside membranes. Table 4 shows the results of the statistical analysis. Surprisingly, all changes in amino acid composition that were statistically significant (p-value < 0.01) were found either for the complete sequence or for sequence parts outside of the membrane. For the transmembrane helices and signal sequences no significant preferences in amino acid composition were found in our dataset. The most central residues in terms of compositional changes are the amino acids Ala (A) and Ile (I) as they show significant differences in frequency for the complete sequences as well as for sequence parts outside the membrane in all investigated proteins. In the psychrophilic group A is significantly suppressed, while I is significantly enhanced (Figure 2). Additionally, the amino acids Lys (K), Asp (D) and Thr (T) are statistically increased for all 66 psychrophilic membrane proteins when compared to the mesophilic group. Furthermore, our results reveal that the amino acids Arg (R) and Glu (E) are underrepresented in the 52 psychrophilic TMP sequences.
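The quantity that enters such a paired test is, for each cluster and each amino acid, the difference between the psychrophilic and the mesophilic group composition. The sketch below illustrates one way to compute these paired differences. It is not the DeltaProt implementation; the function names and the input layout (one pair of sequence lists per cluster) are assumptions.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Relative frequency of each of the 20 standard amino acids."""
    counts = Counter(res for res in sequence.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def group_mean_composition(sequences):
    """Mean composition over all sequences of one temperature group."""
    if not sequences:
        raise ValueError("a temperature group must contain at least one sequence")
    comps = [composition(seq) for seq in sequences]
    return {aa: sum(c[aa] for c in comps) / len(comps) for aa in AMINO_ACIDS}


def paired_differences(clusters, residue):
    """Psychrophilic minus mesophilic mean frequency of one residue per cluster.

    clusters: list of (psychrophilic_sequences, mesophilic_sequences) pairs,
    one pair per cluster of homologs (hypothetical input layout).
    """
    return [
        group_mean_composition(psy)[residue] - group_mean_composition(meso)[residue]
        for psy, meso in clusters
    ]
```

A negative difference for Ala in most clusters would correspond to the suppression of A reported here. Gaps and non-standard symbols are skipped before frequencies are computed.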
Because differing GC-content can have a significant effect on compositional changes, we additionally investigated the changes in amino acid composition of a sub-group of genomes with close GC-content. Unfortunately, the two genera of our dataset that include psychrophilic organisms, Aliivibrio and Photobacterium, are represented by only four and five genomes, respectively. Furthermore, only the genus Photobacterium is represented by mesophilic as well as psychrophilic organisms. As the barophilic character of representatives of Photobacterium profundum may also have an effect on the amino acid composition, we propose that it is unsuitable as a reference group to confirm our general findings. Therefore, we chose the group of 55 Vibrio genomes and determined whether the changes in amino acid composition between the groups of mesophilic and intermediate Vibrio genomes support our general findings from the comparison of mesophilic and psychrophilic Vibrionaceae. Figure 3 and Table 4 show that these results confirm some but not all findings from the complete dataset. Regarding the amino acid composition of Ala (A), Lys (K) and Asp (D), the results from the Vibrio sub-group conform to those of the complete dataset. However, the comparison of the intermediate versus the mesophilic group of Vibrio genomes shows no statistically significant changes in the composition of amino acids Ile (I) and Thr (T), which were found when comparing all Vibrionaceae. Furthermore, we determined significant compositional changes for the Vibrio sub-group that were not found for all genomes. The amount of amino acids Gln (Q), Leu (L), Pro (P) and Trp (W) is significantly decreased in the amino acid sequence of the intermediate membrane proteins whereas the amino acids Asp (D), Gly (G) and Met (M) are significantly increased.
Additionally, Arg (R) is also significantly decreased in the membrane proteins of the intermediate Vibrio group, a change that in the complete dataset is found only for the TMPs and not for all proteins. An additional analysis was performed to investigate preferences in amino acid substitutions between mesophilic and psychrophilic membrane proteins of the complete dataset. All possible 421 amino acid substitutions (gaps included) were analyzed for statistical bias by using the two-sided chi-square test. The results showed no statistically significant preferences (p-values < 0.01) in the substitution pattern of amino acids (data not shown). Thus, our results show that, in terms of cold adaptation of membrane proteins, no single amino acid is primarily substituted by one specific amino acid.
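A substitution analysis of this kind starts from counts of directed residue pairs between aligned sequences. The sketch below tallies such pairs for one mesophilic and one psychrophilic aligned sequence, with the gap character treated as an additional symbol. How sequences were paired across groups in the study is not described here, so the one-to-one pairing and the function names are assumptions, and the chi-square test itself is not implemented.

```python
from collections import Counter


def substitution_counts(meso_aligned, psychro_aligned):
    """Count directed substitutions (mesophilic residue, psychrophilic residue).

    Both inputs are rows of the same alignment; '-' is counted as a symbol.
    Identical positions are not counted.
    """
    if len(meso_aligned) != len(psychro_aligned):
        raise ValueError("aligned sequences must have the same length")
    return Counter(
        (m, p) for m, p in zip(meso_aligned, psychro_aligned) if m != p
    )


def directional_bias(counts, a, b):
    """Excess of a->b substitutions over the reverse direction b->a."""
    return counts[(a, b)] - counts[(b, a)]


counts = substitution_counts("AKAL-", "IKTLA")
print(counts[("A", "I")], counts[("A", "T")], counts[("-", "A")])  # 1 1 1
```

Summing such counters over all clusters would give the table of observed substitutions to which a test for directional bias could then be applied.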

Physicochemical Properties
In order to determine physicochemical characteristics of psychrophilic membrane proteins we analyzed the physical, chemical and geometric properties of all 20 amino acids in the sequences of the cold-adapted temperature group in comparison to those of mesophiles. We chose several characteristics that are known to be important for cold-adaptation, mainly for psychrophilic enzymes [28,29]. Table 5 shows the results including the statistical bias as well as specific trends for the transmembrane (TM) and the signal proteins (SP). In general, the homologous sequences of all clusters show very similar physicochemical properties. The only significant change for psychrophilic membrane proteins is the amount of hydrophobic amino acids. In comparison to the mesophilic group, cold-adapted sequences show a significant decrease of hydrophobic amino acids in their primary protein structure. The reduced hydrophobicity is also observed by the general hydrophobicity scale [30] for the signal proteins (p-value = 0.04), and by the special hydrophobicity scale [31] for all membrane proteins (p-value = 0.14), as shown in Table 6. Additional findings are that the isoelectric point of TMPs is decreased (p-value = 0.013), and that molecular weight and accessible surface area are increased in signal proteins.
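A scale-based hydrophobicity value of a sequence is typically the mean of per-residue scale values. The sketch below illustrates this with the Kyte-Doolittle hydropathy index, which is used here only as a well-known example and is not necessarily the scale of reference [30] or [31]. The function name is hypothetical.

```python
# Kyte-Doolittle hydropathy index (illustrative choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Mean hydropathy of a sequence; gaps and unknown symbols are skipped."""
    values = [KYTE_DOOLITTLE[res] for res in sequence.upper() if res in KYTE_DOOLITTLE]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    return sum(values) / len(values)


# Replacing a hydrophobic residue by a charged one lowers the mean value.
print(mean_hydropathy("AAL") > mean_hydropathy("AAD"))  # True
```

Computing this value for the psychrophilic and mesophilic group of each cluster gives paired observations of the same form as the compositional differences.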
As mentioned above, our results regarding compositional changes revealed a significant decrease in the frequency of the amino acid A in psychrophilic TMPs of our dataset. It has been shown that A is one of the best helix-forming residues in peptide sequences [32] and thus a decrease in A-content might increase protein flexibility due to a decreased ability of psychrophilic TMPs to form helices. To further investigate this hypothesis we calculated the secondary structure profiles of transmembrane proteins utilising the secondary structure propensity scale of all amino acids [33]. As shown in Table 7, these results confirmed a decrease of the helix-forming property (p-value = 0.08 in the membrane parts, and p-value < 10⁻⁴ for the whole protein), indicating a lower degree of helix stabilization for psychrophilic TMPs in comparison to mesophilic sequences.

Discussion
We performed a comparative bioinformatical analysis using protein alignments from the core-genome of 64 fully sequenced Vibrionaceae genomes to determine general characteristics of cold-adaptation of membrane proteins, transmembrane proteins as well as signal peptides. We identified 66 membrane proteins, 52 TMPs, 11 signal peptides and 3 β-barrel containing membrane proteins, present in all genomes in our dataset. Analyzing a set of diverse membrane proteins that are part of the Vibrionaceae core-genome, rather than focusing on one particular protein or a phylogenetically narrow group of bacteria, increases the possibility that the characteristics found in this study are common for thermal adaptation of membrane proteins. Additionally, limiting our dataset to sequences conserved in all representatives of one bacterial family reduces the effects of phylogenetic variations on the presented results.

Determined Variations of Signal Peptides
Signal peptides show common structural features, e.g., a positively charged N-terminal followed by a hydrophobic core and a more polar cleavage site [34,35]. Variations in length of, e.g., the hydrophobic core of a signal sequence, can affect the accurate translocation or function of the mature peptide [36]. Furthermore, it has been shown that the interaction of the signal sequence with the hydrophobic lipid bilayer of the membrane is affected by environmental stress, e.g. varying temperature or salinity [37]. Therefore, it is legitimate to assume that signal sequences undergo a change in amino acid composition or length to adapt to low temperature. On the other hand, it has been shown that signal sequences do not always function efficiently when expressed in a different organism [38]. This indicates the host-specificity of at least a fraction of all signal sequences and thus changes in composition or length of signal sequences might be based on adaptation to the host and not to environmental conditions. Our results indicate that the signal sequences of psychrophiles differ from those of mesophilic bacteria in the level of conservation. However, as we were not able to determine a pattern in properties of the substituted amino acids or the exact location of the length difference, our results remain ambiguous in the context of cold adaptation strategies. The determined increase in site variability of the membrane parts of signal peptides may be caused by phylogenetic or functional differences of the sequences in our dataset and not based on adaptation to the cold. Another possible explanation is the size of the dataset as we included only 11 signal peptides. Future studies may address this problem by investigating single signal peptide families or larger datasets.

General Features of Cold-adaptation in Membrane Proteins
The results from the analysis including the complete dataset, however, are clearer. The determined changes in amino acid composition are either found for the complete protein sequence or, more often, for sequence parts that are located outside of the membrane. As these sequence parts tend to be loop-like regions it is not surprising that our results resemble those reported for loop regions of cold-adapted enzymes. One key finding of the presented study is the increased amount of the amino acid I in the sequence parts of membrane proteins that are located outside of the lipid bilayer. Due to its hydrophobicity and helix-destabilizing property an increase in I decreases structural stability and thus increases the flexibility of loop regions in cold-adapted proteins [39]. Furthermore, a study recently published by Budde et al. on the gram-positive bacterium Bacillus subtilis showed the importance of membrane-associated I in the cold-shock response of B. subtilis [40], supporting the hypothesis that the increase in I is a common strategy in cold-adaptation of membrane-associated proteins. As reported, the increase in I was not confirmed by the comparison of the intermediate versus the mesophilic group of Vibrio genomes. However, as shown in Figure 2, the intermediate group of the complete dataset also shows an increase in I, although a smaller one than in the psychrophilic group. We therefore propose that the general result regarding amino acid I was not confirmed by the Vibrio sub-group because we did not compare psychrophiles but organisms of the intermediate group to mesophiles. Thus, we hypothesize that the increase of amino acid I would be significant for psychrophiles of the genus Vibrio as found for the complete dataset.
An additional finding is the decreased amount of the amino acid A in protein sequences of psychrophilic membrane proteins when compared to the mesophilic group. This result was confirmed by the comparison of the intermediate group versus the mesophiles of genus Vibrio. The lower amount of A in the peptide sequences decreases the stability of α-helices and thus the structural stability of the overall sequence, because A is the best helix-forming residue [32]. The results of the calculated helix profiles support this hypothesis as they reveal a decrease in helix-forming properties for the investigated membrane sequences.
The remaining changes in amino acid composition show a clear preference for hydrophobic and helix-breaking or destabilizing amino acids. For example, the amino acid D has been reported to be helix-breaking [41] whereas T is helix indifferent. Also, the avoidance of amino acid E in psychrophilic TMPs can be explained by its tendency to favor helical structures and has also been reported for the proteomes of other psychrophilic organisms [39,42]. Additionally, our results indicate that the amino acids K and R also play a role in adaptation of TMPs to low temperatures. Although the changes in frequency of these two amino acids lack an obvious interpretation, similar results have recently been reported for other psychrophilic bacteria, too [43,44]. Additionally, these findings are also confirmed by our investigation of the sub-group of intermediate and mesophilic Vibrio genomes. The differences in the results of the complete dataset in comparison to the sub-group of Vibrio genomes can be explained by multiple factors. First, psychrophilic representatives of different genera may show different strategies to adapt to low temperature. Therefore, the reported differences may be specific for the genus Vibrio and are not generally found for membrane proteins of the complete family of Vibrionaceae. Second, the group of Vibrio genomes in our dataset does not include true psychrophiles but organisms that are representatives of the intermediate temperature group and are described as mostly psychrotolerant. Thus, the reported compositional changes might be specific to this temperature group and might not be found in psychrophiles of this genus. Although speculative, we favor this explanation, as the results of the intermediate Vibrio group show similarities to the intermediate group of the complete dataset in comparison to the mesophilic group of the complete dataset.
Nevertheless, it has to be mentioned that the differences between the two analyses may also be based on GC-content and phylogenetic distance of the compared genomes and not on cold adaptation strategies.
The analysis of physicochemical properties of membrane proteins revealed a lower hydrophobicity in the psychrophilic sequences. This effect can be seen in Tables 5 and 6, where it appears as a detectable indicator, and points out that a less hydrophobic protein is an important factor in cold adaptation of membrane proteins. The sequence regions of membrane proteins that are located outside of the membrane are exposed to water in either the cytoplasm or the periplasmic space. Low temperature affects the dynamics of water and water networks and decreases the ability to form hydrogen bonds. A decreased hydrophobicity may compensate for these effects and was also recently reported for loop regions of cold-adapted enzymes [8,45].
In summary, our results show that cold-adaptation of membrane proteins almost exclusively affects parts of the proteins that are located outside of the membrane. For these sequence parts our analysis reveals similar cold-adaptation strategies as for loop regions of cold-adapted enzymes. This includes higher flexibility at the expense of stability, a decrease in the propensity to form helices as well as a decrease in hydrophobicity. Surprisingly, we were not able to identify characteristics of cold-adaptation of the sequence parts embedded in the membrane although low temperature has a strong effect on the lipid bilayer. Therefore, it might be reasonable to assume that psychrophilic organisms countervail the changes in membrane fluidity mostly by altering the composition of the membrane itself rather than by modifying the transmembrane helix parts of their membrane proteins. Future studies including more sequences may confirm these results or give deeper insights into the adaptation of membrane helices of psychrophilic bacteria.

Homolog Retrieval and the Comparative Analysis
We used the dataset of 64 Vibrionaceae genomes, including 20 different species from the genera Aliivibrio, Photobacterium and Vibrio. The core genome was identified using the software orthoMCL. A detailed description of the dataset and the homology clustering can be found in Kahlke et al. 2012 [20].
Identification of putative signal and membrane proteins was performed by applying three different prediction tools to the complete set of core-genes of all 64 Vibrionaceae genomes. The transmembrane proteins and their transmembrane helices were predicted using HMMTOP v2.1 [24,25]. These were used for statistical analysis of helix properties, e.g. length variations. Prediction of transmembrane beta-strands of outer membrane proteins was performed using TMBETA-NET [26] and signal peptides were identified using SignalP v3.0 [22,23]. Additionally, the secondary structure of all identified transmembrane proteins was predicted using PSIPRED v3.21 [27].
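Once helix positions have been predicted, each sequence can be divided into parts inside and outside the membrane, and helix lengths follow directly. The sketch below shows this split. The input format, a list of 1-based inclusive (start, end) positions, is an assumption for illustration and does not reproduce the output format of any particular prediction tool.

```python
def split_by_topology(sequence, helices):
    """Split a sequence into membrane-embedded and non-membrane segments.

    helices: (start, end) positions, 1-based and inclusive (assumed format).
    Returns (inside_segments, outside_segments) in sequence order.
    """
    inside, outside = [], []
    cursor = 0  # 0-based index of the first residue not yet assigned
    for start, end in sorted(helices):
        if start < 1 or end > len(sequence) or start > end:
            raise ValueError(f"helix ({start}, {end}) lies outside the sequence")
        if start - 1 < cursor:
            raise ValueError("helices must not overlap")
        if start - 1 > cursor:
            outside.append(sequence[cursor:start - 1])
        inside.append(sequence[start - 1:end])
        cursor = end
    if cursor < len(sequence):
        outside.append(sequence[cursor:])
    return inside, outside


inside, outside = split_by_topology("AAAALLLLGGGGIIIIKK", [(5, 8), (13, 16)])
print(inside)                     # ['LLLL', 'IIII']
print(outside)                    # ['AAAA', 'GGGG', 'KK']
print([len(h) for h in inside])   # [4, 4]
```

The same split applies to a signal peptide if the signal sequence is passed as a single interval starting at position 1.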
All sequences of each homolog cluster were manually curated and those proteins were excluded that contained sequences with missing starts or stops, i.e. truncated sequences from draft genomes.
Finally, we compiled a dataset of 4224 protein sequences distributed over 66 clusters of predicted transmembrane sequences. The predicted sequences were conserved among all genomes included in our dataset. Paralogs were excluded from our analysis, hence each of the clusters contained 64 protein sequences. The sequences of each cluster of homologs were subsequently aligned using T-Coffee v.6.7 [46] and then divided into three different temperature groups based on the optimal growth temperature or normal living condition for the species: psychrophile (4 sequences), intermediate (6 sequences) and mesophile (54 sequences). Table 1 shows a summary of our dataset.

Changes in Gene Composition and Properties
We have applied the new methods of comparative statistical bioinformatics as described in a recent paper by Thorvaldsen et al. (2010) [47], and amino acid composition and substitution were examined by the toolbox DeltaProt [47]. Since there are several sequences of the same species in each temperature group, each temperature group was treated as a statistically dependent sample of data, and hence the analysis used statistical tests for dependent data (Thorvaldsen et al. 2010, Table 2). The statistical tests were applied to the mean values of each group, which were used as representative observations. Our dataset is unbalanced since there is a different number of sequences in each temperature group, and the variances of the mean values will be unequal. Therefore, we used a non-parametric paired Wilcoxon signed-rank test as the appropriate test since it only assumes that the distributions of the mean values have symmetrical shapes and a common mean. The pairing is based on equal (homologous) proteins from the two temperature groups, i.e., mesophiles and psychrophiles. By this we examine the change in composition of amino acids in the different temperature groups. We also looked for amino acid directional biases in substitutions among the proteins from the different groups, and for general change in physicochemical properties of the involved amino acids. We applied the same physicochemical properties as in Thorvaldsen et al. (2007, 2009) [48,49].
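The paired test described above can be illustrated with a small stand-alone implementation. The sketch below computes the Wilcoxon signed-rank statistic with a two-sided p-value from the normal approximation, including the usual tie correction of the variance. It is not the DeltaProt code: the function name is hypothetical, zero differences are dropped, no continuity correction is applied, and for very small samples an exact test would be preferable.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided paired Wilcoxon signed-rank test (normal approximation).

    x, y: paired observations, e.g. per-cluster group means.
    Returns (w_plus, w_minus, p_value).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    mean = n * (n + 1) / 4.0
    variance = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    z = (min(w_plus, w_minus) - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return w_plus, w_minus, p_value
```

In the setting of this study, x and y would hold one value per cluster for the psychrophilic and the mesophilic group, respectively, for example the mean frequency of a single amino acid.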

Supporting Information
File S1 Species name, strain and temperature habitat of the studied bacteria. (DOC)