An Intriguing Shift Occurs in the Novel Protein Phosphatase 1 Binding Partner, TCTEX1D4: Evidence of Positive Selection in a Pika Model

T-complex testis expressed protein 1 domain containing 4 (TCTEX1D4) contains the canonical phosphoprotein phosphatase 1 (PPP1) binding motif, composed of the amino acid sequence RVSF. We identified and validated the binding of TCTEX1D4 to PPP1 and demonstrated that this protein is indeed a novel PPP1 interacting protein. Analyses of twenty-one mammalian species available in public databases and seven Lagomorpha sequences obtained in this work showed that the PPP1 binding motif 90RVSF93 is present in all of them and is flanked by a palindromic sequence, PLGS, except in three species of pikas (Ochotona princeps, O. dauurica and O. pusilla). Furthermore, for the Ochotona species an extra glycosylation site, motif 96NLS98, and the loss of the palindromic sequence were observed. Comparison with other lagomorphs suggests that this event happened before the Ochotona radiation. The dN/dS for the sequence region comprising the PPP1 binding motif and the flanking palindrome strongly supports the hypothesis that for Ochotona species this region has been evolving under positive selection. In addition, mutational screening shows that the ability of pika TCTEX1D4 to bind to PPP1 is maintained, although the PPP1 binding motif is disrupted and the N- and C-terminal surrounding residues are also altered. These observations suggest pika as an ideal model to study novel regulatory mechanisms of PPP1 complexes.


Introduction
Phosphoprotein phosphatase 1 (PPP1), one of the major eukaryotic serine/threonine protein phosphatases, has exquisite specificities in vivo, both in terms of substrates and cellular localization. Over the past two decades, it has become apparent that PPP1 versatility is achieved by its ability to interact with multiple targeting/regulatory subunits known as PPP1 interacting proteins [1,2]. To date, more than 200 interacting proteins have been identified, most of them having the consensus PPP1 binding motif (RVxF), which binds to the catalytic subunit of PPP1 (PPP1C), determining its targeting and thus specifying cellular location and ultimately function [3,4]. The RVxF motif is present in about 70% of all PPP1 interacting proteins [3]. This motif is usually flanked by basic residues on the N-terminal side and by acidic residues on the C-terminal side. The binding of this motif to a hydrophobic groove in PPP1C does not alter PPP1C conformation, but anchors the interacting proteins to PPP1C [5][6][7][8]. Nevertheless, the initial binding of this motif to PPP1C is essential to bring the PPP1 interacting proteins into its proximity, allowing for secondary interactions that strengthen holoenzyme binding and determine substrate specificity, enzyme activity and PPP1 isoform selectivity [9]. Therefore, the key to characterizing the diverse roles of PPP1 is to identify novel interacting proteins and to understand the specific functions of the PPP1 complexes. Accordingly, several novel PPP1 interacting proteins have been identified through a yeast two-hybrid system using PPP1 as bait [10][11][12][13][14].
A novel partner of PPP1 was identified recently and described as a novel Tctex1 dynein light chain family member, the t-complex testis expressed protein 1 domain containing 4, TCTEX1D4 (Tctex2β) [14,15]. Cytoplasmic dyneins are protein complexes responsible for retrograde transport, that is, minus-end directed trafficking along cytoskeletal microtubules [16]. More specifically, the light chains can confer specificity to cargo binding [17,18], regulate other molecules [19] or stabilize the assembly of the dynein motor complex [20]. It has already been shown that TCTEX1D4 interacts with membrane receptors, inhibiting TGFβ signaling [15], and its involvement in the brain response to peripheral inflammation has been suggested [21]. Previous results indicate that TCTEX1D4 is evolutionarily conserved among mammals and ubiquitously expressed, particularly in ovary, spleen, lung and placenta, where PPP1 is also present [22,23]. Moreover, TCTEX1D4 interacts directly with PPP1C [22] and possesses a canonical PPP1 binding motif [5,8,24,25]. We have also demonstrated that TCTEX1D4 and PPP1C colocalize in the microtubule organizing center and in microtubules, suggesting a probable role in cytoplasmic transport [22].
In this work we compared TCTEX1D4 from different lagomorphs: Oryctolagus, Lepus, Sylvilagus and Ochotona. Our main goal was to validate the observation that the TCTEX1D4 PPP1 binding motif is absent across Ochotona species and to evaluate the evolution of this protein in lagomorphs. In addition, different mutants mimicking the pika PPP1 binding motif and surrounding amino acids were produced, and their binding efficiency was determined by the overlay technique. These findings were applied to understand the evolutionary mechanisms behind these dramatic amino acid changes.
[...] generating a PCR fragment of 657 bp. A touchdown PCR was performed with the following thermal profile: initial denaturation (95°C for 15 min); 5 cycles of denaturation (95°C for 30 s), annealing (66°C for 30 s, 1°C decrease/cycle) and extension (72°C for 45 s); 30 cycles of denaturation (95°C for 30 s), annealing (62°C for 30 s) and extension (72°C for 45 s); and a final extension (72°C for 20 min). Sequencing was performed on an ABI PRISM 310 Genetic Analyzer (Perkin-Elmer, Applied Biosystems, Barcelona, Spain), following the ABI PRISM BigDye Terminator Cycle sequencing protocols.
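As an illustration only, the touchdown thermal profile described above can be written out step by step. This is a minimal sketch: the function name and the tuple layout are hypothetical, the annealing temperature is assumed to start at 66°C and drop by 1°C in each of the five touchdown cycles, and ramp times between steps are ignored.

```python
def touchdown_profile():
    """Return the PCR programme as (step name, temperature in C, seconds)."""
    steps = [("initial denaturation", 95, 15 * 60)]
    # Touchdown phase: annealing assumed to start at 66 C and drop 1 C per cycle.
    for cycle in range(5):
        steps += [("denaturation", 95, 30),
                  ("annealing", 66 - cycle, 30),
                  ("extension", 72, 45)]
    # Main amplification phase at a constant annealing temperature of 62 C.
    for _ in range(30):
        steps += [("denaturation", 95, 30),
                  ("annealing", 62, 30),
                  ("extension", 72, 45)]
    steps.append(("final extension", 72, 20 * 60))
    return steps


profile = touchdown_profile()
annealing = [temp for name, temp, _ in profile if name == "annealing"]
# Programmed hold time only; ramping between temperatures is not included.
total_minutes = sum(seconds for _, _, seconds in profile) / 60
print(annealing[:6], total_minutes)
```

Under these assumptions the programme holds for a little over an hour and a half in total, which can be useful when planning instrument time.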
The nucleotide sequences were translated and aligned using ClustalW [46] and adjusted by visual examination (data not shown). The sequences obtained in this work have been deposited into NCBI GenBank under the accession numbers KF360247-253 (7 sequences). Maximum Likelihood phylogenetic reconstruction was performed for the whole TCTEX1D4 gene alignment and for the specific twelve amino acid region (four upstream and four downstream of the motif 90 RVSF 93). As indicated by the Akaike information criterion (AIC) implemented in jModelTest v0.1.1 [47], the nucleotide substitution model TVM+G was used for the whole gene tree estimation, while the TIM2+I+G model was selected as the best-fit nucleotide substitution model for the twelve amino acid region. For the Maximum Likelihood phylogenetic analyses we used GARLI v2.0 (Genetic Algorithm for Rapid Likelihood Inference) [48], applying 1,000,000 generations and 1,000 bootstrap searches. Maximum Likelihood trees were displayed using FigTree v1.3.1 (http://tree.bio.ed.ac.uk/).

Signature of selection and sliding-window analysis
Under neutrality, the expected ratio of non-synonymous (dN) to synonymous (dS) substitutions in a gene is one (Ka/Ks = dN/dS = ω = 1), and significant deviations from this value can be interpreted as evidence of either positive selection (ω >> 1) or purifying/negative selection (ω << 1). To consider a specific pattern of nucleotide substitution, synonymous and non-synonymous substitution rates were estimated using the Nei-Gojobori method [49] and ω was calculated. To determine how the nucleotide substitution rate varies among different nucleotide regions, the differences can be plotted as averages by sliding a window along a sequence alignment [50]. A 234 nucleotide region (between nucleotides 151 and 384 of the coding sequence of Ochotona princeps) encompassing the palindromic region (between nucleotides 250 and 285) was selected and the alignment was performed using the software package MEGA 4.1 [51]. The sliding-window analysis was performed using DnaSP version 5.10 [52]. A window length of 9 nucleotides and a step size of 3 were chosen for this analysis. The ratio of non-synonymous to synonymous substitutions between Rabbit/Pika, Rabbit/Mouse and Rabbit/Rat was then analyzed. Final plots were obtained using SigmaPlot (SigmaPlot v.11, Systat Software, San Jose, California, USA).
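The calculation described above can be sketched in a few lines. The following is a simplified, illustrative implementation of the Nei-Gojobori counting approach with a Jukes-Cantor correction and a sliding window of 9 nucleotides moved in steps of 3; it is not the DnaSP or MEGA code, and all function names are hypothetical. It assumes gap-free, in-frame sequences of equal length, treats changes to stop codons as non-synonymous when counting sites, and discards mutational pathways that pass through a stop codon.

```python
from itertools import permutations, product
from math import log

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order.
CODON = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in one codon (a value between 0 and 3)."""
    s = 0.0
    for i in range(3):
        for base in BASES:
            if base != codon[i]:
                mutant = codon[:i] + base + codon[i + 1:]
                s += (CODON[mutant] == CODON[codon]) / 3
    return s


def codon_diffs(c1, c2):
    """Synonymous and non-synonymous differences, averaged over pathways."""
    positions = [i for i in range(3) if c1[i] != c2[i]]
    syn_total = non_total = n_paths = 0
    for order in permutations(positions):
        current, syn, non = c1, 0, 0
        for i in order:
            nxt = current[:i] + c2[i] + current[i + 1:]
            if CODON[nxt] == "*":      # simplification: skip paths via stops
                break
            if CODON[nxt] == CODON[current]:
                syn += 1
            else:
                non += 1
            current = nxt
        else:
            syn_total, non_total, n_paths = syn_total + syn, non_total + non, n_paths + 1
    if n_paths == 0:                   # assumption: count all as non-synonymous
        return 0.0, float(len(positions))
    return syn_total / n_paths, non_total / n_paths


def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits."""
    return 0.75 * log(1 / (1 - 4 * p / 3)) if p < 0.75 else float("nan")


def dn_ds(seq1, seq2):
    """Return (dN, dS, omega); omega is None when dS is zero or undefined."""
    codons = [(seq1[k:k + 3], seq2[k:k + 3]) for k in range(0, len(seq1) - 2, 3)]
    S = sum((syn_sites(a) + syn_sites(b)) / 2 for a, b in codons)
    N = 3 * len(codons) - S
    sd = sum(codon_diffs(a, b)[0] for a, b in codons)
    nd = sum(codon_diffs(a, b)[1] for a, b in codons)
    dS = jukes_cantor(sd / S) if S else float("nan")
    dN = jukes_cantor(nd / N) if N else float("nan")
    return dN, dS, (dN / dS if dS > 0 else None)


def sliding_omega(seq1, seq2, window=9, step=3):
    """(1-based window start, (dN, dS, omega)) for each window position."""
    return [(start + 1, dn_ds(seq1[start:start + window], seq2[start:start + window]))
            for start in range(0, len(seq1) - window + 1, step)]
```

With a window of only three codons, many windows contain no synonymous differences, so ω is frequently undefined; the sketch returns `None` in that case rather than dividing by zero. Plotting dN and dS separately, or their difference, avoids the problem.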

Site-directed mutagenesis
Mutagenic primers were designed according to the sequence of human TCTEX1D4 (NCBI: NM_001013632.2) and used to obtain the desired mutations (Table 1). Starting with the pET-TCTEX1D4 plasmid as template, along with the appropriate mutagenic primers, the mutants HA+INL+WS, HA+WS, HA+INL, INL+WS, HA, INL and WS were created using the QuikChange® Site-Directed Mutagenesis Kit (Stratagene, Agilent Technologies UK Ltd, Edinburgh, UK). PCR conditions for site-directed mutagenesis were as follows: initial denaturation (95°C for 1 min); 18 cycles of denaturation (95°C for 30 s), annealing (55°C for 1 min) and extension (68°C for 7 min), using KOD polymerase (Novagen, Madison, Wisconsin, USA). DNA was then digested with the DpnI restriction enzyme and transformed into the E. coli XL1-Blue strain (Stratagene, Agilent Technologies UK Ltd, Edinburgh, UK). Sequencing was performed on an ABI PRISM 310 Genetic Analyzer (Perkin-Elmer, Applied Biosystems, Barcelona, Spain), following the ABI PRISM BigDye Terminator Cycle sequencing protocols. Positive clones were sequenced using universal T7 promoter and T7 terminator primers.

Protein expression and overlay assay
Each His-tagged mutant was transformed into the E. coli Rosetta strain (Novagen, Madison, Wisconsin, USA). A single colony was selected and grown overnight at 37°C in the appropriate media until an optical density of 0.6-0.7 was reached. Expression was induced using 1 M IPTG (isopropyl-β-D-thiogalactopyranoside) at 37°C with shaking for 3 h. Cells were recovered by centrifugation and treated as described elsewhere [12]. Lysates were then mass normalized using a BCA® assay (Fisher Scientific, Loures, Portugal) and 10 μg of each extract was loaded onto a 12% SDS-PAGE gel. The proteins were subsequently transferred to a nitrocellulose membrane and then overlaid with 25 pmol/mL of purified PPP1C gamma 1 isoform (PPP1CC1) for 1 h. Membranes were incubated with either mouse anti-His monoclonal (1:1000, Novagen, Madison, Wisconsin, USA) or rabbit CBC3C (anti-PPP1CC, 1:1000) antibodies, followed by the respective anti-mouse and anti-rabbit infrared secondary antibodies (1:5000, Li-Cor Biosciences UK Ltd, Cambridge, UK). Immunoreactive bands were then detected with the Odyssey infrared imaging system and quantified using Odyssey v1.2 software (Li-Cor Biosciences UK Ltd, Cambridge, UK). The same procedure was performed for pET-TCTEX1D4 (positive control) and pET vector (negative control).

Statistical analysis
The SigmaPlot statistical package (SigmaPlot v.11, Systat Software, San Jose, California, USA) was used for statistical analysis. Data were tested for normal distribution and homogeneity of variances. Student's t-test (α = 0.05) was used to compare each mutant with the control, pET-TCTEX1D4.
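For readers who wish to reproduce this type of comparison outside SigmaPlot, a two-sample Student's t statistic with pooled variance can be computed with the Python standard library alone. The sketch below is illustrative: the replicate values are hypothetical (three replicates per group are assumed), the function name is not from the original analysis, and 2.776 is the standard two-sided 5% critical value of the t distribution with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Two-sample Student's t statistic (equal variances assumed) and its df."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical normalized binding values (arbitrary units), three replicates each.
control = [1.00, 0.92, 1.08]   # pET-TCTEX1D4
mutant = [0.97, 1.05, 0.89]    # e.g. the HA+INL+WS mutant

t, df = student_t(control, mutant)
T_CRITICAL_DF4 = 2.776         # two-sided, alpha = 0.05, df = 4
significant = abs(t) > T_CRITICAL_DF4
print(f"t = {t:.3f}, df = {df}, significant at alpha 0.05: {significant}")
```

The critical value applies only to df = 4; other replicate numbers require the corresponding value from a t table or a statistics package.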

Analyses of TCTEX1D4 evolution
When comparing the TCTEX1D4 PPP1 binding motif, 90 RVSF 93, in twenty-one mammalian species, we observed that it was present in all except Ochotona princeps, for which an extra glycosylation site (motif 90 NLS 92) appears. To confirm that this was not a database artifact, Ochotona dauurica and Ochotona pusilla TCTEX1D4 were sequenced. Additionally, for five other lagomorph species (Lepus and Sylvilagus genera), the TCTEX1D4 coding region was partially sequenced. These sequences were compared with other mammalian sequences and translated into amino acids (Figure 1). The nucleotide substitutions in Ochotona species generated amino acid changes that confirmed the elimination of the consensus PPP1 binding motif, 90 RVSF 93, and the appearance of a glycosylation site. The Lepus and Sylvilagus genera maintained the canonical PPP1 binding motif.
The alignment between the sequences acquired in this work and the twenty-one sequences available in the databases for the different mammals allowed the construction of a Maximum Likelihood phylogenetic tree (Figure 2A). The topology obtained was in accordance with the currently accepted mammalian taxonomy [53], suggesting that TCTEX1D4 has been evolving under neutral selection. A new Maximum Likelihood tree (Figure 2B) was constructed using only a twelve amino acid region, corresponding to four amino acids upstream and four amino acids downstream of the motif 90 RVSF 93. This region was chosen because the PPP1 binding motif is flanked by an unusual palindromic sequence, 86 PLGS 89, according to the Homo sapiens sequence. As expected, the obtained tree revealed that the three Ochotona species formed an independent cluster, highly supported by a bootstrap value of 97 (Figure 2B).
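The features discussed here (the RVxF motif, its palindromic flanks and the putative glycosylation site) can be checked programmatically. The sketch below uses the human (residues 85-97) and Ochotona princeps (residues 83-95) fragments reported in this paper. The regular expressions are simplifying assumptions: `RV.F` is a literal reading of RVxF rather than a full degenerate consensus, and `N[^P][ST]` is the usual N-x-S/T sequon for N-linked glycosylation. The function name and dictionary keys are hypothetical.

```python
import re

HUMAN = "PPLGSRVSFSGLP"   # human TCTEX1D4, residues 85-97
PIKA = "HALGSRINLSGWS"    # Ochotona princeps TCTEX1D4, residues 83-95

RVXF = re.compile(r"RV.F")          # literal RVxF, a simplification
SEQUON = re.compile(r"N[^P][ST]")   # N-x-S/T with x not proline


def describe(fragment, motif_start=5, flank=4):
    """Split a 13-residue fragment into flanks and motif and test each feature."""
    motif = fragment[motif_start:motif_start + 4]
    upstream = fragment[motif_start - flank:motif_start]
    downstream = fragment[motif_start + 4:motif_start + 4 + flank]
    sequon = SEQUON.search(fragment)
    return {
        "motif": motif,
        "has_rvxf": RVXF.fullmatch(motif) is not None,
        # Palindromic flanks: upstream equals the reversed downstream residues.
        "palindromic_flanks": upstream == downstream[::-1],
        "sequon": sequon.group() if sequon else None,
    }


human = describe(HUMAN)
pika = describe(PIKA)
print(human)
print(pika)
```

For the human fragment the flanks PLGS and SGLP mirror each other around RVSF, whereas in the pika fragment the flanks ALGS and SGWS do not, and the motif position carries RINL with an NLS sequon instead.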

Pika TCTEX1D4: positive selection of the palindromic region
The non-synonymous to synonymous substitution ratio was calculated for the twelve amino acids referred to above. When comparing the ratios between all analyzed mammals, excluding the three Ochotona species, the values were on average lower than 0.3, suggesting strong purifying selection. However, when comparing ratios between the three Ochotona species and each of the mammalian sequences, the value obtained was on average 1.6, suggesting that for Ochotona species this fragment lost the constraints on protein mutations imposed by purifying selection and/or evolved under Darwinian (positive) selection. When focusing the analysis on the Superorder Glires (Order Rodentia and Order Lagomorpha), the main representatives of rodents (mouse and rat) showed a ratio of zero, meaning that TCTEX1D4 was under purifying selection in this group. When comparing the rodents' nucleotide sequences corresponding to the twelve amino acid region with that of human, a total of ten substitutions causing no amino acid changes was observed (Figure 1). On the other hand, for the Ochotona species, a total of twelve substitutions caused six amino acid alterations. Furthermore, when comparing all species from the three Leporidae genera, Lepus, Oryctolagus and Sylvilagus, with the three Ochotona species, the ratio ranged between 1.7 and 7.0.
These observations were visually reinforced by the sliding-window analysis of the TCTEX1D4 region up- and downstream of the palindrome (234 nucleotides, positions 151 to 384 according to Ochotona princeps). When comparing the Oryctolagus and Ochotona genera, the plot clearly shows a peak of positive selection in the palindromic region (positions 250 to 285, Figure 3A). However, when comparing Oryctolagus with rodents, no peaks were observed in the palindromic region, which indicates that this region is under purifying selection, as in other mammals (Figure 3B & 3C).

TCTEX1D4 RVSF-palindrome studies
To further study the significance of the bioinformatic studies, mutants based on the Ochotona princeps sequence (83 HALGSRINLSGWS 95) corresponding to the human TCTEX1D4 PPP1 binding motif and flanking regions (85 PPLGSRVSFSGLP 97) were generated by site-directed mutagenesis, followed by bacterial expression of those mutants and PPP1C binding screening by overlay (Figure 4A & 4B). The band intensities indicate the amount of PPP1CC1 that is bound to the bacterially expressed TCTEX1D4 recombinant mutant proteins. Since all proteins were expressed with an N-terminal His-tag, an anti-His antibody was used to normalize the amount of recombinant protein loaded in each lane. Subsequently, band intensities were compared to the pET-TCTEX1D4 control. A Rosetta cell extract expressing the pET vector alone was used as negative control. Results show that the HA+INL+WS mutant has a binding profile similar to that of wild type human TCTEX1D4, since there was no statistical difference in the binding capacity. The results for the other mutants, single or double, also show no statistical difference when compared to the control, pET-TCTEX1D4 (Figure 4C).
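The normalization described above amounts to dividing the PPP1CC overlay signal by the anti-His signal for each lane and expressing the result relative to the pET-TCTEX1D4 control. A minimal sketch follows; the band intensities are hypothetical placeholders rather than measured values, and the function name is illustrative.

```python
def relative_binding(lanes, control="pET-TCTEX1D4"):
    """Return PPP1CC binding per lane as a percentage of the control lane.

    lanes maps a lane name to (PPP1CC overlay intensity, anti-His intensity).
    """
    # Correct each overlay signal for the amount of His-tagged protein loaded.
    normalized = {name: ppp1 / his for name, (ppp1, his) in lanes.items()}
    return {name: 100 * value / normalized[control]
            for name, value in normalized.items()}


# Hypothetical band intensities in arbitrary units.
lanes = {
    "pET-TCTEX1D4": (1000.0, 1000.0),
    "HA+INL+WS": (900.0, 1000.0),
    "INL": (1100.0, 1250.0),
}
binding = relative_binding(lanes)
for name, percent in binding.items():
    print(f"{name}: {percent:.1f}% of control")
```

The negative control (pET vector alone) expresses no His-tagged protein, so its anti-His signal would be near zero; it is better assessed by inspecting the raw overlay signal than by this ratio.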

Discussion
TCTEX1D4 has already been described as a new PPP1 interacting protein [22]. This new interaction was supported by the yeast two-hybrid approach, co-immunoprecipitation and overlay techniques. Previous results showed that the TCTEX1D4 N-terminal domain, where the PPP1 binding motif is present, is essential for the binding. Furthermore, in vitro studies with TCTEX1D4 PPP1 binding mutants strengthen the importance of the PPP1 binding motif to the TCTEX1D4/PPP1C interaction [22]. Indeed, the mutation of the motif RVSF to AAAA decreases binding by 35% [22], which is surprising since the mutation of the PPP1 binding motif either to AAxA [54], RAxA [8,55] or to RVxA [56] usually abrogates PPP1 complex interaction. Nevertheless, some cases exist where interaction still occurs but to a lesser extent [57][58][59][60]. Also, there are some interacting proteins that still bind PPP1C in the presence of an excess of a synthetic RVxF peptide [59] that usually disrupts the PPP1 complex [12,61]. Other motifs besides RVxF that are present in these proteins and also important for the binding may at least partly explain these observations. The sequence surrounding the TCTEX1D4 RVSF motif is unusual in that it contains a palindrome, PLGSRVSFSGLP. The PPP1 binding motif binds to PPP1 in a hydrophobic pocket [62]. This palindrome may form a structured arm forcing the RVSF motif, even when it is mutated into AAAA, to enter the PPP1 pocket, since the palindrome contains rigid prolines. Perhaps, if the RVSF is completely removed and the arm destroyed, TCTEX1D4 will no longer bind PPP1C.
When the twelve amino acid Maximum Likelihood tree was constructed, the three Ochotona species formed an independent cluster completely apart from all the other mammals. This observation shows that this fragment is unique to pika, since the palindrome and the RVSF motif are highly conserved among mammals but completely lost in pika sequences (Figure 1 and Figure 2). This could be explained by two different hypotheses: either the new motif present in the Ochotona species resulted from gene conversion with adjacent genes, or a pattern of nucleotide substitution arose in this specific motif. Gene conversion has been reported in other mammalian genes. For example, in leporids a gene conversion event was observed between the two chromosomally adjacent genes CCR2 and CCR5, where the sequence motif 194 QTLKMT 199 of the CCR5 protein was replaced by the HTIMRN motif, which is characteristic of CCR2 [43,63]. In the present study, none of the chromosomally adjacent genes showed clear evidence of gene conversion with TCTEX1D4, making this event an unlikely hypothesis. Furthermore, no significant BLAST hit was obtained when this fragment was compared against the mammalian NCBI database.
The dN/dS ratio for the twelve amino acid region between the three Ochotona species and each of the mammalian sequences is on average 1.6. When restricting the comparison to the three Leporidae genera, the ratio increases further, to values between 1.7 and 7.0. Furthermore, when comparing the rodents' nucleotide sequences corresponding to the twelve amino acid region with that of human, a total of ten substitutions caused no amino acid changes, while for the Ochotona species a total of twelve substitutions caused six amino acid alterations (Figure 1). These results were visually reinforced by the sliding-window analysis, which clearly showed the palindromic region under positive selection in pika when compared with the rest of the mammals (Figure 3). The obtained dN/dS ratio, clearly higher than 1, and the fact that the amino acid alterations created a new putative glycosylation site strongly support the hypothesis that for Ochotona sp. this sequence fragment has been evolving under positive selection. The occurrence of this nucleotide pattern in the three Ochotona species studied in this work and its absence in the other lagomorphs suggest that this evolutionary event happened before the radiation of the Ochotona genus (between 6 and 20 mya) [39] and after the split of the Ochotonidae and Leporidae families (between 31 and 65 mya) [36][37][38].
The creation of a novel putative N-glycosylation site (90 NLS 92) [64,65] in the three Ochotona species by positive selection suggests a physiologically important function. The likelihood of pika TCTEX1D4 being glycosylated is increased by the fact that this motif is located more than sixty amino acids upstream of the C-terminus [66]. The remaining unsolved question is the acquired function of TCTEX1D4 in Ochotona species. This new putative glycosylation site may increase the half-life of the protein, which in turn would stay longer in the membrane attached to endoglin [15], making it a stronger inhibitor of TGFβ in Ochotona sp. than in other mammals.
Furthermore, TCTEX1D4 in Ochotona sp. lost the PPP1 binding motif and the palindromic sequence, PLGS, which is probably important for the binding of TCTEX1D4 to PPP1. Thus, it would be expected that it would no longer bind to PPP1 directly. Evolutionarily, it is not clear which happened first: the loss of the palindrome with subsequent mutation of the PPP1 binding motif to a glycosylation site, or the acquisition of a glycosylation site by positive selection followed by loss of the palindrome. These evolutionary analyses suggest that an alternative mechanism may exist in Ochotona for TCTEX1D4 binding to PPP1. Thus, we employed an overlay screening with different binding mutants to test this hypothesis. Results show that the pika TCTEX1D4 aberrant RVxF motif and its non-palindromic surrounding region, 83 HALGSRINLSGWS 95, sustain the binding of the TCTEX1D4 mutant to PPP1CC at levels similar to those of wild type human TCTEX1D4 (Figure 4). Moreover, in single and double mutants the binding capacity was also maintained, which clearly shows that although substantial differences were found in the pika RVxF and surrounding regions, these do not disrupt the binding. Earlier results have shown that a mutation of the RVSF motif to AAAA only decreases the overall binding efficiency by 35% [22]. Furthermore, we have also shown that important regions for this binding are concentrated in the N-terminal, where the RVxF is also present. This means that either the RVxF motif is not the only point of contact, or the RVxF surrounding region is also important for this binding. Here, using the pika aberrant motif, we clearly show that the second hypothesis does not explain why the binding is not abolished when we mutate the RVxF motif.
The PPP1 binding motif RVxF is usually surrounded by basic residues (arginine, lysine and histidine) on the N-terminal side and by acidic residues (aspartate and glutamate) on the C-terminal side [8,25]. Analysis of 143 RVxF motifs in known and novel PPP1 interacting proteins revealed that five to six of these flanking basic and acidic residues are relatively common among PPP1 interacting proteins [3]. The human TCTEX1D4 RVSF motif is a strong motif according to this analysis, but the palindromic region that surrounds it does not follow this pattern, since no basic or acidic amino acids are present. Even so, all the flanking residues are present to some extent in other PPP1 interacting proteins. Comparing the above results with ours, the PP to HA mutation would not be expected to make any difference, because some PPP1 interacting proteins also have these amino acids in these positions (P 11%, P 4% compared with H 4%, A 10%). Regarding the VSF to INL mutation (V 94%, S 21%, F 83% compared with I 6%, N 5% and L 0%), we could infer that the binding would potentially be abrogated, but our results show that it is maintained. Finally, regarding the LP to WS mutation (L 3%, P 7% compared with W 0%, S 7%), our results show that this mutation does not alter the binding. Taken together, our results show that the evolutionarily conserved palindromic sequence appears to be irrelevant for the binding, since the HA and WS mutations resulted in the same binding capacity, and that the unique RVxF motif, RINL, seems to sustain the binding. The results show that even with the motif and flanking regions evolving under positive selection, both regions still seem to sustain the binding capacity. The hypothesis of another N-terminal region important for the binding arises and might suggest that the RVxF motif is just a point of contact that helps to stabilize the complex.
In conclusion, TCTEX1D4 evolutionary analysis revealed that in pika the PPP1 binding motif was lost and replaced by a new putative glycosylation site. Additionally, we observed in Ochotona the loss of a palindrome that is highly conserved among mammals. The presence of the HA, INL and WS substitutions in Ochotona does not alter the binding capacity. The combination of these factors makes pika species a perfect model to study the biology of the PPP1/TCTEX1D4 complex, and the approach can be expanded to understand other PPP1 complexes, increasing the number of interacting proteins beyond those previously expected to exist based on the consensus RVxF motif.
Table S1. List of mammalian species used in this study in which the coding sequence of TCTEX1D4 was retrieved from NCBI or ENSEMBL. (DOCX)