Insights into the Molecular Evolution of the PDZ/LIM Family and Identification of a Novel Conserved Protein Motif

The PDZ and LIM domain-containing protein family is encoded by a diverse group of genes whose phylogeny has currently not been analyzed. In mammals, ten genes are found that encode both a PDZ- and one or several LIM-domains. These genes are: ALP, RIL, Elfin (CLP36), Mystique, Enigma (LMP-1), Enigma homologue (ENH), ZASP (Cypher, Oracle), LMO7 and the two LIM domain kinases (LIMK1 and LIMK2). As conventional alignment and phylogenetic procedures of full-length sequences fell short of elucidating the evolutionary history of these genes, we started to analyze the PDZ and LIM domain sequences themselves. Using information from most sequenced eukaryotic lineages, our phylogenetic analysis is based on full-length cDNA-, EST-derived- and genomic- PDZ and LIM domain sequences of over 25 species, ranging from yeast to humans. Plant and protozoan homologs were not found. Our phylogenetic analysis identifies a number of domain duplication and rearrangement events, and shows a single convergent event during evolution of the PDZ/LIM family. Further, we describe the separation of the ALP and Enigma subfamilies in lower vertebrates and identify a novel consensus motif, which we call ‘ALP-like motif’ (AM). This motif is highly-conserved between ALP subfamily proteins of diverse organisms. We used here a combinatorial approach to define the relation of the PDZ and LIM domain encoding genes and to reconstruct their phylogeny. This analysis allowed us to classify the PDZ/LIM family and to suggest a meaningful model for the molecular evolution of the diverse gene architectures found in this multi-domain family.


INTRODUCTION
The sequencing and annotation of an increasing number of genomes has led to a huge amount of protein sequence data. The goal of functional genomics is to determine the function of these proteins. For this purpose, it is essential to construct a comprehensive evolutionary classification of proteins and their families, which can be especially useful if members of the same protein family have similar or identical biochemical functions [1]. The classification of protein families is based on homologous relationships and several methods are currently available for clustering proteins into families [2,3]. Most of those approaches rely on sequence similarity measures, such as those obtained with BLAST [4] or hidden Markov models [5]. Because many proteins contain multiple domains, many of these methods of protein clustering result in the establishment of incorrect families. This problem is complicated in metazoan proteomes, and the human proteome in particular, where multi-domain proteins are abundant.
Domains are the building blocks of all globular proteins and present one of the most useful levels at which protein function can be understood [3]. There is a limited repertoire of types of domains [6,7] and the domains from this set are duplicated and recombined in different ways to form the respective proteomes of various genomes in life. Although the presence of a shared domain (or more than one shared domain) can be an indicator of similar functions [8], it does not necessarily imply it [9]. The repertoire of different architectures present in the genomes has arisen by the duplication and recombination of the ancestral superfamily domains. Convergent evolution of gene architectures has been defined as more than one independent evolutionary event (recombination) leading to the same domain architecture [10].
PDZ and LIM domains are both interaction modules, present in proteins with diverse functions and assorted additional domains. Originally PDZ domains were recognized in the postsynaptic density protein PSD-95 [11], the septate junction protein Discslarge of Drosophila melanogaster [12] and the epithelial tight junction protein ZO-1 [13]. PDZ domains play important roles in organizing cell signaling assemblies [14] and are found in plants, yeast, bacteria and a variety of metazoans [15,16]. They recognize short C-terminal peptide motifs, internal sequences resembling a C-terminus and have further been shown to bind to phospholipids [reviewed in 14,17].
The predominance of PDZ domains in metazoans was proposed to indicate their co-evolution with multicellularity. Proportionately fewer PDZ domains are found in bacteria and yeast. However, a relatively low number of PDZ domains are found to be encoded in plant genomes. PDZ domains were found to be present in proteins from phylogenetically diverse groups of bacteria [18] and it was suggested that PDZ domains might have entered the bacterial and plant genomes by horizontal gene transfer. This hypothesis was based on the observation that human and bacterial htrA genes were significantly more similar to each other than either is to each of the yeast htra-like repeats [16]. Indeed, the yeast PDZ-like domains found in the four htrAs exhibit extremely low sequence homology to the metazoan consensus PDZ domains.
The LIM domain is a tandem zinc-finger structure that functions, like the PDZ, as a protein-protein interaction module [19][20][21]. LIM domains are found in proteins from a wide variety of eukaryotic organisms although fewer LIMs are found in yeast and plants compared to vertebrates for example [19,22] (this is similar to the PDZ domains). We have found only a single bacterial LIM domain from Chloroflexus aurantiacus in a database search (UniProt Q3E5J3). Dawid et al. [20] have classified LIM domains into five groups. More recently, Kadrmas and Beckerle described only four distinct LIM groups and showed that invertebrates, like Drosophila melanogaster and Caenorhabditis elegans express nearly the same complement of LIM protein groups but show decreased complexity within each of them [21]. Both the PDZ and the LIM domains in proteins are most frequently found in combination with other domains. Most multi-domain proteins are related from gene fusions, deletions and internal repetitions [23]. An investigation of these evolutionary events requires a method to find the domain architecture from which each protein originates. The techniques of molecular phylogenetics, developed to recover the nested hierarchy of taxa from character information in their gene and/or protein sequences, can reconstruct the evolutionary family history. However, the evolutionary diversification of protein families often leads to structural differences, which makes their phylogenetic characterization difficult. Differences in domain architecture among multi-domain proteins for example often have raised the question of whether these proteins are orthologous, even though they have clearly arisen, at least in part, from a common ancestor. Considering all these problems, it has been suggested that the concept of orthology is applicable only at the level of domains rather than at the level of proteins [24,25], except for proteins with identical domain architectures. Recently, the LAP family, which contain genes with both LRR and PDZ domains, has been classified by phylogenetic analysis showing the feasibility of an approach using domain sequences to obtain phylogenetic data [26].
In order to characterize this family, we used a combinatorial approach, analyzing phylogenies of intronic sequences, of full length sequences and of sequence information for structural domains. Our results show that it is possible to derive a meaningful model for the molecular evolution of the PDZ/LIM family and characterize the phylogeny of its members.

Genomic structures and gene architectures of the PDZ/LIM family
An overview of the gene architectures for the ten human genes encoding the PDZ/LIM protein family is shown in figure 1A. All genes contain a single central-or N-terminal-positioned PDZ domain. Single or multiple LIM domains are positioned either N-terminal or C-terminal from the PDZ domain (Fig. 1B). Besides the PDZ and LIM domains, different motifs and domains can be found. Another protein interaction domain found in LMO7, is the Calponin homology (CH) domain [see for a review e.g. 48]. One catalytic domain, a tyrosine kinase domain is present in both LIMK1 and LIMK2. Furthermore, a ZASP-like motif (ZM motif) is found in ZASP, ALP and Elfin [49]. The ZM motif has been described to function in concert with the PDZ domain to localize ZASP to aactinin, the major Z-disk cross linker in sarcomers [28,49].
A close examination of the gene architecture of the PDZ and LIM domain encoding genes found in metazoan taxons reveals different combinations for the assembly of these functional domains (Fig. 1B). Four groups of combinations, representing LMO7, the ALP subfamily, the Enigma subfamily and the LIMKs respectively, are found in vertebrates. Both the number of combinations as well as the total number of genes found increases from the invertebrates to the vertebrates (see Fig. 1B). Only two different combinations can be found in Caenorhabditis elegans: eat-1, the previously described single gene ''ALP/Enigma'' homolog [50] and tag204 (temporarily assigned gene 204), the Caenorhabditis elegans LMO7 homolog. In Drosophila melanogaster, an eat-1 homolog (tungus), a LMO7 homolog (CG31534) and a LIMK1 (AB042816) homolog is found. In contrast to both the Caenorhabditis elegans and the Drosophila melanogaster LMO7 homologs, LMO7 of Ciona intestinalis appears to have a CH domain. No combination of PDZ and LIM domain(s) was found for taxons in the Plantae or Fungi lineages with the BLAST algorithm [4] in the databases used (see Material and Methods).

Evolution of the ALP/Enigma subfamilies
As an initial starting point to study the molecular evolution of the PDZ/LIM family, we performed the phylogenetic analysis of the ALP and Enigma subfamilies, using full length amino acid sequences for the different groups (see table S1 and S2 for accession numbers). Figure 2A shows the rooted phylogenetic tree inferred for the Enigma subfamily using the Ciona intestinalis ZASP sequences as an outgroup. It shows a topology of the form (ZASP (Enigma, Enigma Homolog)). The ZASP homolog, with one PDZ and three LIM domains, was present in both urochordates and vertebrates (see Chordata in Fig. 1B). The phylogenetic tree in figure 2A clearly shows the split between urochordates and vertebrates.
The analysis of the ALP subfamily shows a topology of the form (ALP (RIL (Elfin, Mystique))) ( Fig. 2B and Figure S1).
A combined phylogenetic analysis for the PDZ domain sequences of the two subfamilies is shown in figure 2C. The results here indicate that both the ALP subfamily and the Enigma subfamily evolved from a ''one PDZ four LIM'' ancestral gene like eat-1/tungus (tungus being the Drosophila melanogaster ortholog). The data suggest that loss of three LIM domains (LIM 2-4) from the common 4 LIM domain-containing ancestral gene leads to the ALP subfamily; whereas loss of one LIM domain (LIM1) leads to the three LIM domain containing Enigma subfamily. To further investigate the separation between the ALP and Enigma groups, we have searched for PDZ/LIM genes in Ciona intestinalis (Urochordata), in Amphioxus (Euchordata) in hagfish (Hyperotreti), in the lamprey (Hyperoartia) and in sharks and rays (Chondrichthyes). The results of the phylogenetic analysis are shown in figure 2C.

Evolution of the LIM kinases and LMO-7
Similar to the limited approach, using full length sequences for only the subfamilies, we analyzed the LIM kinases and LMO7 ( Fig. 2D and 2E, respectively). Figure 2D indicates a gene duplication event in the common ancestor for both LIM kinases, but also suggests that the separation leading to LIMK1 and LIMK2 did not occur before the chordates split from the other deuterostomes, since it appears that the ancestral LIM kinase gene duplicated after Ciona intestinalis. We further observed that the LIMK2 homologs of both Xenopus laevis (Q8QHM0) and Xenopus tropicalis (EN-SXETP00000009075) encode only one LIM domain (in contrast The tungus gene, found in the two arthropod species investigated and the nematode homolog Eat-1 both encode one N-terminal PDZ and four C-terminal LIM domains. Eat-1 has been described earlier as the Caenorhabditis elegans ALP/Enigma gene [50]. Only a single LIMK gene was found per invertebrate species examined, and the LMO7 homolog lacks the CH domain (CG31534). The LMO7 gene of Drosophila melanogaster lacks not only the CH domain but also the PDZ domain (not shown, see Supplemental table S1). As not all ALP and Enigma subfamily members share the ZM domain (ZASP and ALP contain 2 and Elfin one ZM motif) we have excluded the ZM motif from these groups and show only a ZM motif for eat-1/ tungus in this figure. doi:10.1371/journal.pone.0000189.g001 to the Xenopus LIMK 1, which contains two LIM domains), suggesting that the frogs lost a LIM (see table S1).

Identification of novel ZASP-like genes containing a PDZ domain and a ZM motif
A short zebrafish ZASP-like gene containing only the PDZ domain and a ZM-motif had been described in GenBank (NM_201505). We described earlier similar structured short splice forms of the zebrafish ZASP gene, which could point to important functional similarities. In this study, we have now identified short ZASP-like genes in several species (Caenorhabditis elegans, D. melanogaster, and Ciona intestinalis) but were not able to find another vertebrate homolog in addition to the zebrafish gene. In an analysis based on the PDZ domain sequences (Fig. 2C) all invertebrate ZASP-like homologs segregated close to eat-1 and tungus whereas the zebrafish ZASP-like gene is closely related to zebrafish ZASP, indicating that a partial duplication event (or a complete duplication followed by partial truncation) occurred twice during evolution (Fig. 2C) and generated the ZASP-like structure. The zebrafish also contains an extra gene that resembles the normal members of the ALP subfamily. We have previously shown via a basic phylogenetic analysis on the full length amino acid sequence that the alp-like gene, clusters very closely to the normal alp gene of zebrafish [51]. In contrast to alp, alp-like lacks one of the ZM-motifs and is only found in zebrafish not in any other species including other fishes (Fig. 2C).

Identification of a novel highly-conserved ALP-like motif (AM)
Examination of the multiple amino acid sequence alignments, between PDZ/LIM family proteins of different species revealed a novel motif, specific for the ALP subfamily members, which we denoted ALP-like motif (AM) (Fig. 3). The motif was not present in any known motif/structure databases. The primary 34 amino acid long sequence of the ALP-like motif contains a putative consensus PKC phosphorylation site and secondary structure prediction suggests two a-helices, one in the beginning and one at the end (Fig. 3). A closer look into the genomic structures of the ALP subfamily genes indicated that this domain was always encoded by the fifth exon. Together these findings suggest that the ALP-like motif must have evolved after the separation of the ALP and Enigma subfamilies. This is in contrast to the ZASP-like motif (ZM) which is found in both subfamilies as well as in the ancestral Eat-1 gene. In figure 3 we show alignments for both motifs, however it is important to note that we have newly discovered the ALP-like motif and denoted it AM in analogy to the earlier denoted ZASP-like motif (ZM), but no structural or functional similarity is apparent between both motifs. Future studies are warranted to shed light on the function of the Alp-like motif (AM).

Evolution of the PDZ and LIM domains
The phylogenetic analysis of specific individual domains, or their combinations, obtained from multiple domain-containing proteins can give insights into the mechanisms of protein evolution. Any phylogenetic analysis on full length sequence alignments may fail as robust structural variations, which are often present among protein groups or families, can prevent such an approach. The PDZ/LIM family illustrates complex domain arrangements in a multi-domain protein family ( Fig. 1A and 1B).
We analyzed full length cDNA-, EST-derived-and genomic-PDZ and LIM domain sequences from over 25 species, ranging from yeast to humans (see table S1). All results obtained were supported by high Bayesian and Maximum likelihood support values.
The initial dendogram derived from amino acid sequences of LIM domains (Fig. 4) illustrates the different clusters for the subfamilies and groups and individual LIM domains. The dendogram shows that three LIM domains of the Enigma subfamily individually cluster together (Fig. 4). It further demonstrates that all LIM domains found in the Protist Dictyostelium discoideum all cluster together (marked in red) to the exclusion of LIM domains in other species. Therefore, we used a Dictyostelium discoideum LIM domain as an outgroup for our phylogenetic analysis of the LIM domains.
The evolutionary trees of the PDZ-and the LIM-domains, derived from this analysis are depicted in figure. 5A and 5B, respectively. Rooted evolutionary trees are shown (using yeast (Fig. 5A) and Dictyostelium discoideum (Fig. 5B) sequences as outgroups). For a better overview of the complete analysis, only ALP and ZASP are shown here as examples for the ALP-and Enigma-subfamilies.
The phylogenetic tree for the different PDZ domains shows that the most ancestral PDZ domain found in the family is LMO-7, with tag-204 being closest to the root and in front of all later clades. In a BLAST search, we identified the PDZ of CASK as the best aligning sequence to the LIMK PDZ and included sequences for the Caenorhabditis elegans (Lin2) and Drosophila melanogaster (Camguk) CASK homologs in the analysis. The PDZ phylogeny shows that both CASK homologs also relate to the ''LMO7 common ancestor'' and further show that their PDZ domain is closely related to the PDZ domain of the LIM kinases (Fig. 5A). Other MAGUK PDZ domains also showed that they originated from a ''LMO7 common ancestor'' (data not shown).
This suggests that the common ancestor of the LIM kinases and Lin2 and CASK lost the C-terminal LIM domain. Consequently, the PDZ domain of the ''LMO7 common ancestor'' is of central importance for the molecular evolution of the PDZ/LIM family.
The analysis of the LIM domains shows an early split between the LMO7 group and the LIM domain 2 of the LIMKs on one hand and all other LIM domains on the other hand. A PDZ single LIM structured common ancestor (as also seen for the PDZ phylogeny) is suggested by these results. We further examined  a conserved intronic-sequence we have discovered in the LIM domains of tag204 and eat-1 (Fig. 5C).
The most parsimonious explanation derived from both the analysis of the PDZ domain tree and from the LIM domain tree is summarized in our model for the molecular evolution of the PDZ/ LIM family (Fig. 6). The LIM domains of the LIMKs are related to the other PDZ/LIM family members (as is the PDZ domain); however gene rearrangements were necessary to generate the gene architecture of the LIMKs. A convergent event in respect to the combination of PDZ and LIM domains is indicated by our analysis, signifying that the PDZ has combined with the LIMs more than once during evolution. A single LIMK gene was identified in Drosophila melanogaster. No LIMK gene was found in Caenorhabditis elegans and lower taxons and only two PDZ/LIM genes are present in Caenorhabditis elegans: tag204 and eat-1. We performed a BLAST search for the LIM domains closest to the LIMK LIMs in Caenorhabditis elegans and found UNC-115 which was also found in Drosophila melanogaster. Interestingly, the first two LIM domains of UNC-115 from Caenorhabditis elegans and Drosophila melanogaster cluster with the LIM kinase LIM 1 and 2 domains, suggesting that the two LIM domains were first assembled and then met the PDZ (see table S3 for sequence homologies). Taken together, our findings led us to propose a model which gives a plausible scenario for the molecular evolution of the PDZ/LIM genes and their diverse gene architecture (Fig. 6).

Chromosomal location of ALP/Enigma subfamily genes
As previously mentioned [50], the human ALP/Enigma genes cluster specifically in a way that an ALP-like gene pairs with an Enigma-like gene in inverse orientation, analogous to the evenskipped genes, which both (EVX1 and EVX2) are transcribed in an opposite orientation as compared to adjacent HOX genes [52]. This intuitively suggests a form of genome duplication as a plausible mechanism for their evolution. To further investigate this we looked at the chromosomal location of the ALP/Enigma genes in different species. ALP/Enigma clusters are observed in four different species indicating that this clustering appears more than random (Fig. 7). However, several species (mouse, dog) do not show any clustering, making these results difficult to interpret in an evolutionary context.

DISCUSSION
In this study, we have approached two fundamental problems in the study of molecular evolution. One is the ''problem of comparison'', or how to compare very differently structured elements (many multi-domain proteins are only homologous along parts of their sequences); the other is the ''problem of origin'', or how to determine which members in a complex family of proteins share a common ancestral gene. We have focused our analysis on the coevolution of two conserved functional domains and have chosen to study the PDZ/LIM family for several reasons: the members of this family play important biological roles in development and for actin cytoskeleton organization, the entire family has never been analyzed before (and could not, in a conventional approach) and last, the diverse combinations of PDZ and LIM domains present in this family already suggested differences in their evolution.
We analyzed the phylogeny of the PDZ and the LIM domains separately, using a full length approach to analyze the individual subfamilies for a better resolution of later chains of events and looked at the distribution of PDZ/LIM genes in most sequenced eukaryotic lineages. The combined analysis and interpretation, and the merger of these results allowed us to classify the PDZ/ LIM family and draw a plausible phylogenetic model.
According to our analysis, the PDZ/LIM gene family in mammals, as defined by a common ancestral gene, has 8 members, including LMO7 and the 7 ALP/Enigma genes. This is supported by a comparison of a conserved intronic sequence situated in the LIM domains of the two Caenorhabditis elegans family members (Fig. 5C). The PDZ and LIM domains of both the PDZ/LIM genes and the LIM Kinases have most likely evolved from the same ancestral domains. However, the fact that these domains have been combined twice during evolution is a convergent phenomenon. This convergent event, which we describe here for the domain combinations, might be of functional relevance. Indeed, all 10 PDZ/LIM genes have been shown to be able to associate with the actin cytoskeleton [27,28,31]. It is possible, that the combination of LIM and PDZ domains in a single functional module is necessary for specific interactions with both the actin cytoskeleton and other proteins. Thus, this could indicate a functional convergence for all 10 genes in organizing protein complexes associated with the actin cytoskeleton.
The Caenorhabditis elegans eat-1 gene is related to the Drosophila melanogaster tungus gene and both share the same gene architecture, with one PDZ domain, a single ZM motif and four LIM domains. Both, PDZ domain and LIM domain phylogenies show that the ALP as well as the Enigma subfamily genes originate from eat-1/ tungus like ancestor and separated late in evolution using the same PDZ encoding exons, but losing either the last three LIM encoding exons or the first, respectively (Fig. 2C and 5B). Here the LIM domains of the genes in the ALP family all group together with the first LIM domain of eat-1 (we named it LIM1 in the tree), while the three LIM domains of the Enigma family segregate together with the last three LIM domains encoded by eat-1. This was already postulated by McKeown and colleagues when they functionally described the eat-1 gene [50] and we confirmed this here in our analysis with high support values. Looking at a smaller evolutionary window, we have evidence that the separation occurred between the Euchordata and the early vertebrata with a ZASP homolog found in amphioxus and lamprey, and RIL and Elfin homologs found in an early vertebrate, like the ray (Chondrichthyes) (Fig. 2C).
The basic model for the evolution of a multi-domain protein family and the original definition of this gene family (defined by a common ancestral gene) is suggested with high likelihood by our findings. Our phylogenetic interpretation of the evolution of the PDZ/LIM family shows that the LIMKs are the most distantly related genes, whereas all others, including LMO7, appear to have a common ancestral gene and thus constitute a classical gene family.

Creating a functional domain database
First a dataset of all functional domains of all PDZ and LIM domain encoding genes was established. For this purpose human protein sequences were BLASTed [4] against the genome and EST databases of Ensembl (http://www.ensembl.org/index.html), National Centre for Biotechnology Information (http://www.ncbi. nlm.nih.gov/) and the Joint Genome Institute (http://www.jgi. doe.gov/). Blast hits with high enough E-values were further investigated with the protein domain prediction program SMART [53][54][55]. Homologs were searched for in the following species: Anopheles gambiae, Arabidopsis thaliana, Bos taurus, Danio rerio (Dr), Caenorhabditis elegans (Ce), Canis familiaris, Ciona intestinalis (Ci), Ciona savignyi, Drosophila melanogaster (Dm), Gallus gallus (Gg), Homo sapiens (Hs), Mus musculus (Mm), Pan troglodytes, Phytophtora sojae, Populus trichocarpa, Saccharomyses cerevisiae (Sc), Strongylocentrotus purpuratus (Sp), Takifugu rubripes, Tetraodon nigroviridis, Xenopus laevis and Xenopus tropicalis (Xt). In our phylogenetic analysis however, we have excluded several species (e.g. some of the mammals) as their inclusion did not add any significant value for inferring an evolutionary model. Further not shown are the results for Pan troglodytes and Macaca mulatta which show exactly the same chromosomal distribution as observed for the humans. Not shown is also the data for Rattus norwegicus, where LIMK1 was co localized with LMO7, but no other combination was found. No combinations whatsoever were found in Mus musculus and in Canis familaris. It must be noted that for some of the species investigated the genomes are not completely sequenced and/or fully assembled. doi:10.1371/journal.pone.0000189.g007

Alignment and phylogenetic analysis
Alignments were performed using ClustalX [56] with default parameter values, and manually refined in GeneDoc where necessary. Reliably aligned regions in full length alignments were selected with Gblocks [57]. The minimum length for conserved blocks was set to five residues, while we decided to keep gap containing positions if the gap was present in 50% of the sequences examined. It has to be noted here that even under less restricted conditions, Gblocks selected only the domains for analysis, which makes the number of characters used in the phylogenetic analysis in the same order of magnitude as our domain specific analysis. The edited alignments were used for phylogenetic analysis employing both Bayesian analysis and maximum likelihood (ML). Bayesian trees were generated with MrBayes [58], with amino acid substitution set to mixed (hence reducing assumption prior to analysis). Rate variation across sites was modeled with a four rate gamma distribution and invariant sites, while the MCMC search itself was continued for 1.000.000 generations, sampled every 100 generations, and 2500 trees were discarded as burnin. For ML, alignments were bootstrapped 1,000 times with the program SEQBOOT from the PHYLIP package [59]. Subsequently, phylogenetic trees were generated with the ML algorithm implemented in PHYML [60], whereas a consensus tree was calculated with Consense from the PHYLIP package [59]. Parameters for PHYML were set at Jones-Taylor-Thornton for amino acid substitution and gamma distribution with four classes for across-site rate variation. The alpha parameter of the gamma distribution was estimated by PHYML. At last, phylogenetic trees were visualized with either NJplot [61] or MEGA 3.1 [62]. In the tree figures shown, the topology support values are labeled on the Bayesian consensus tree in the order % Bayesian posterior probability/ % bootstrap ML to reduce and standardize the characters and figures used. Figure S1 Phylogeny by structures consisting of one PDZ and one LIM domain. Numbers indicate % Bayesian posterior probability. Sequences used are specified in supplemental table S1, with the exception of the EAT splice form 1A (consisting of 1 PDZ and 1 LIM domain) for which the acc. number CAE52906 was used. Found at: doi:10.1371/journal.pone.0000189.s001 (0.68 MB TIF)