Abstract
Signal peptides, present at the N-terminus of many proteins, guide these proteins into cell membranes. In some proteins, the signal peptide has an extended N-terminal region. It was previously demonstrated that the N-terminally extended signal peptide of human PTPRJ contains a cluster of arginine residues that attenuates translation. Analysis of the mammalian orthologous sequences revealed that this sequence is highly conserved. The PTPRJ transcripts in placentals, marsupials, and monotremes encode a stretch of 10–14 arginine residues, positioned 11–12 codons downstream of the initiating AUG. The remarkable conservation of the repeated arginine residues in the PTPRJ signal peptides points to their key role. Further, the presence of an arginine cluster in the extended signal peptides of other proteins (E3 ubiquitin-protein ligase ZNRF3, NOTCH3) is noted and indicates a more general importance of this cis-acting mechanism of translational suppression.
Citation: Karagyozov L, Grozdanov PN, Böhmer F-D (2020) The translation attenuating arginine-rich sequence in the extended signal peptide of the protein-tyrosine phosphatase PTPRJ/DEP1 is conserved in mammals. PLoS ONE 15(12): e0240498. https://doi.org/10.1371/journal.pone.0240498
Editor: Maria Gasset, Consejo Superior de Investigaciones Cientificas, SPAIN
Received: October 2, 2020; Accepted: November 23, 2020; Published: December 9, 2020
Copyright: © 2020 Karagyozov et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
After initiation, translation can be inhibited by interactions between the nascent polypeptide chain and the ribosome [1, 2]. In eukaryotes, transient elongation arrest may be caused by consecutive prolines [3] or by an array of positively charged amino acids [4]. In a limited number of proteins, the signal peptides are longer than the canonical 20–25 residues; they consist of more than forty amino acid residues and contain an N-terminal extension and a hydrophobic sequence (h-region) far from the N-terminus [5]. This extension provides a convenient space for positioning a translation-attenuating amino acid sequence.
The human receptor-like protein tyrosine phosphatase, type J (PTPRJ, PTPReta, or DEP1) provides an example of the presence of a down-regulating sequence within the signal peptide. The human PTPRJ is a receptor-like protein tyrosine phosphatase of the R3 subtype characterized by an extracellular region, containing several tandem fibronectin type III (FNIII) domains, a single transmembrane region, and a single cytoplasmic catalytic domain [6]. In human embryonic lung fibroblasts, the PTPRJ expression and activity were dramatically increased when cells approached confluence in comparison to sparse cells, suggesting a possible role in cell-density-dependent inhibition of proliferation. Thus, the name high cell density-enhanced phosphatase-1 or DEP1 was proposed [7]. PTPRJ is expressed in a variety of normal tissues, notably in hematopoietic cells (CD148 antigen), in epithelial tissues, including those of the digestive tract, and in the vascular endothelium [8]. Data suggest that PTPRJ is a tumor suppressor in different tissues. In mice, the gene encoding PTPRJ was mapped to a colon cancer susceptibility locus (Scc1) [9]. Negative regulation of the signaling of several receptor-tyrosine kinases (RTKs), including the epidermal growth factor receptor (EGFR) [10], the platelet-derived growth factor receptor [11], and Fms-like tyrosine kinase 3 [12] may be important in this context. A metabolic function of PTPRJ is indicated by the negative regulation of insulin receptor and leptin receptor signaling [13–15]. PTPRJ was also identified as an effective activator of Src-kinase in different cell types. This function of PTPRJ is important for platelet activation and thrombosis [16, 17], for efficient angiogenesis [18], and for regulating airway hyper-responsiveness [19].
Previous experiments [20] showed that: (1) the human PTPRJ transcripts predominantly initiate translation at the first AUG in a favorable context, numbered AUG190. This results in a PTPRJ pre-protein with an N-terminally extended signal peptide and a hydrophobic signal sequence, which is far from the N-terminus. (2) The N-terminal extension contains an unusual arginine-rich cluster (RRTGWRRRRRRRR); its translation inhibits the overall PTPRJ expression.
To elucidate the importance of these findings it was of interest to examine the PTPRJ transcripts in mammals for sequences encoding the attenuating arginine-rich cluster. In the present paper, PTPRJ orthologs from placental mammals, marsupials, and monotremes were compared. The results revealed a similarity in the architecture of the PTPRJ transcripts. Several conserved features were noted: (1) uORFs are not present in the transcripts; (2) the PTPRJ transcripts encode signal peptides with N-terminal extension and a hydrophobic signal sequence which is rather distant from the N-terminus; (3) the extended signal peptides contain the attenuating arginine cluster. The remarkable evolutionary conservation of the attenuating sequence emphasizes the importance of suppressing the PTPRJ translation by the nascent peptide chain.
Materials and methods
Orthologous genes and data evaluation
The NCBI gene database (https://www.ncbi.nlm.nih.gov/gene) entry for the human protein tyrosine phosphatase receptor type J (PTPRJ, GeneID: 5795) was used to search for PTPRJ orthologs. The NCBI default routine designated ‘NCBI's Eukaryotic Genome Annotation pipeline’ was used, which employs the NCBI Gene dataset and a combination of protein sequence similarity and local synteny information (https://www.ncbi.nlm.nih.gov/kis/info/how-are-orthologs-calculated/). The search for orthologs was limited to mammals.
The 5′ end regions of the orthologous PTPRJ transcripts were examined for the presence of initiator AUG codons to determine the N-terminal end of the encoded proteins. The translation start in the mammalian PTPRJ transcript sequences was also predicted by the NetStart-1.0 server at https://services.healthtech.dtu.dk/service.php?NetStart-1.0 [21]. The presence of a signal hydrophobic amino acid sequence (h-region) and the signal-peptidase cleavage sites were determined in silico using the SignalP 5.0 web server at https://services.healthtech.dtu.dk/service.php?SignalP-5.0 [22]. Because the SignalP algorithm limits the input sequence to 70 residues, the N-terminal amino acids were examined first, followed by the adjoining downstream region. The distribution of the elongating and initiating ribosomes in transcripts encoded by exon 1 (450 bp) of the human PTPRJ on chromosome 11 was visualized with the genome browser at https://gwips.ucc.ie/ [23]. BLAST, PHI-BLAST (NCBI) and Clone Manager Suite 8 (Scientific and Educational Software) were used to search the database and to compare and analyze the nucleotide and amino acid sequences.
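The two-step examination of long N-terminal regions can be illustrated with a minimal sketch. It assumes that the sequence is simply cut into consecutive, non-overlapping segments of at most 70 residues; the text does not state whether the segments overlapped, and the function name `split_into_windows` and the placeholder sequence are hypothetical.

```python
def split_into_windows(protein: str, limit: int = 70) -> list[str]:
    """Cut a protein sequence into consecutive windows of at most `limit` residues.

    Assumption: windows do not overlap; the source only states the 70-residue limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [protein[i:i + limit] for i in range(0, len(protein), limit)]


# Hypothetical 150-residue placeholder (not a real PTPRJ sequence).
toy_protein = "M" + "A" * 149
windows = split_into_windows(toy_protein)
print([len(w) for w in windows])
```

The first window corresponds to the N-terminal amino acids, and the second to the adjoining downstream region.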
Protein-tyrosine phosphatase PTPRJ/DEP1 sequences
The analyzed 5′ end regions of the PTPRJ transcripts were from ten species: five placental mammals, four marsupials, and one monotreme.
Placentals.
Primates: Homo sapiens (human), mRNA transcript variant 1: NM_002843.4, protein: NP_002834.3
Rodents: Mus musculus (house mouse), mRNA transcript variant 1: NM_008982.6, protein: NP_033008.4
Cetaceans: Delphinapterus leucas (beluga whale), mRNA: XM_030764039.1, protein: XP_030619899.1
Ruminants: Bos taurus (cattle), mRNA: XM_024975918.1, protein: XP_024831686.1
Carnivores: Enhydra lutris (sea otter), mRNA: XM_022506988.1, protein: XP_022362696.1
Marsupials.
Monodelphis domestica (gray short-tailed opossum), mRNA: XM_016422845.1, protein: XP_016278331.1
Phascolarctos cinereus (koala), mRNA transcript variant X1: XM_021009664.1, protein: XP_020865323.1
Sarcophilus harrisii (Tasmanian devil), mRNA transcript variant X1: XM_031941343.1, protein: XP_031797203.1
Vombatus ursinus (common wombat), mRNA transcript variant X1: XM_027837689.1, protein: XP_027693490.1
Results and discussion
The PTPRJ orthologs in mammals
The NCBI gene database lists 123 mammalian orthologs of the human PTPRJ: 118 in placental mammals, 4 in marsupials, and 1 in monotremes. We analyzed five orthologs from major groups of placentals and all orthologs from marsupials and monotremes.
In some mammalian species, different splice variants encoding different PTPRJ isoforms are listed. We examined all isoforms for the presence of a hydrophobic region near the N-terminus. Only isoforms with a predicted cleavable signal peptide were analyzed further.
The AUG initiator codons in mammalian PTPRJ transcripts
The arrangement of the AUG triplets at the 5′ end of the mammalian PTPRJ mRNA is shown in Fig 1. In all transcripts, the AUG codons are in-frame with the mature protein, with no intervening stop codons between them (see also S1, S3 and S5 Figs). Thus, no uORFs exist in the mammalian PTPRJ transcripts.
Fig 1. The AUG codons are indicated. The position of the transcripts is adjusted to align the first initiating AUGs. Arg-cluster: repeated Arg residues; h-region: region of nonpolar amino acids; PTPRJ: the mature protein.
The number of initiating AUGs differs between the mammalian subdivisions: placentals have three AUGs, monotremes have two, and marsupials have only one. Remarkably, the context of the single marsupial AUG is identical to that of the preferred translation start site in humans (AUG190), which was identified previously [20].
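The reasoning that in-frame AUGs without intervening stop codons rule out uORFs can be sketched in a few lines. This is a simplified illustration, not the procedure used in the study: the helper names and the two toy transcripts are hypothetical, and the last line assumes that the subscripts of the human initiators (AUG13, AUG190, AUG355) denote nucleotide positions in the transcript.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_augs(mrna: str) -> list[int]:
    """Return the 1-based position of every AUG triplet in the sequence."""
    return [i + 1 for i in range(len(mrna) - 2) if mrna[i:i + 3] == "AUG"]


def in_frame(pos_a: int, pos_b: int) -> bool:
    """Two positions share a reading frame if they are a multiple of 3 apart."""
    return (pos_b - pos_a) % 3 == 0


def stop_between(mrna: str, start: int, end: int) -> bool:
    """True if a stop codon lies in the frame of `start`, before `end` (1-based)."""
    return any(mrna[i:i + 3] in STOP_CODONS for i in range(start - 1, end - 1, 3))


def opens_uorf(mrna: str, upstream_aug: int, main_aug: int) -> bool:
    """Simplified test: an out-of-frame AUG, or an in-frame AUG followed by a stop."""
    return (not in_frame(upstream_aug, main_aug)
            or stop_between(mrna, upstream_aug, main_aug))


# Hypothetical toy transcripts (not PTPRJ sequences).
clean = "GGAUGGCUGCUGCUAUGGCCGCCUAA"   # two in-frame AUGs, no stop between them
with_stop = "GGAUGGCUUAAGCUAUGGCC"     # in-frame stop after the first AUG

print(find_augs(clean), opens_uorf(clean, 3, 15))
print(find_augs(with_stop), opens_uorf(with_stop, 3, 15))
print(in_frame(13, 190), in_frame(190, 355))
```

Under the stated assumption, the human initiators are 177 and 165 nucleotides apart, both multiples of three, which agrees with the statement that the AUG codons are in-frame.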
It is firmly established that the nucleotides surrounding the AUG codons strongly affect initiation efficiency. Mutagenesis experiments with transfected COS cells [24] established an optimal context sequence for initiation (RCCAUGG, R is A or G). The nucleotides at positions -3 and +4 (the A in the AUG codon is +1) are of particular importance. The AUG context is categorized as strong (both crucial positions match the optimal sequence), favorable (only one match), or weak (no match at positions -3 and +4).
In mammalian PTPRJ transcripts, the nucleotide context of the AUG codons varies according to their scanning order (Table 1). In placentals, the AUG close to the 5′ cap (6–15 nucleotides) is scanned first; however, it is in a weak context (CGCAUGA). The next AUG is in a favorable context, with G at -3 and U at +4 (GCCAUGU); the context of the third AUG is also favorable, but with A at +4 (GCCAUGA). In monotremes, both AUGs are in a favorable context (GCCAUGU and GCCAUGA, respectively). Marsupials possess a single AUG, which is in a favorable context (GCCAUGU).
Table 1. The AUG codons are arranged in the 5′ to 3′ direction, following the movement of the scanning complexes.
The optimal initiator sequence is RCCAUGG, R is A or G [24]. Critical positions are R at -3 and G at +4 (the A in the AUG codon is +1). The nucleotides surrounding the AUGs are from S1, S3 and S5 Figs.
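The context categories used above (strong, favorable, weak) follow directly from the two critical positions, and can be expressed as a short sketch. The function name `classify_aug_context` is hypothetical; the input is assumed to be the seven nucleotides from position -3 to +4.

```python
def classify_aug_context(context: str) -> str:
    """Classify a 7-nt AUG context (positions -3 to +4), e.g. 'GCCAUGU'.

    strong: R (A or G) at -3 and G at +4; favorable: one match; weak: none.
    """
    context = context.upper().replace("T", "U")
    if len(context) != 7 or context[3:6] != "AUG":
        raise ValueError("expected 7 nucleotides with AUG at positions +1 to +3")
    matches = int(context[0] in "AG") + int(context[6] == "G")
    return {2: "strong", 1: "favorable", 0: "weak"}[matches]


# Contexts quoted in the text, plus the optimal sequence with R = G.
for ctx in ("CGCAUGA", "GCCAUGU", "GCCAUGA", "GCCAUGG"):
    print(ctx, classify_aug_context(ctx))
```

Applied to the contexts quoted in the text, the sketch reproduces the stated assignments: CGCAUGA is weak, while GCCAUGU and GCCAUGA are favorable.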
To summarize, in all mammals, the first AUG in a favorable context that the scanning 40S subunits encounter has U at position +4. In placentals, the transcripts contain a preceding AUG in a weak context. In monotremes and placentals, the transcripts possess a downstream AUG in a favorable context, but with A at position +4. The functional significance of these differences is unknown. One may speculate that they reflect subtle differences in the regulation of expression.
To further estimate the potential of the different AUG codons to initiate translation, the PTPRJ nucleotide sequences from placental mammals and platypus were submitted to the NetStart-1.0 web server. To predict the translation start, this server takes into account a combination of local start codon context and global sequence information. Invariably, the in silico evaluation gave the highest score to the AUG preceding the Arg-rich cluster.
The efficiency of the AUG initiator codons in the human PTPRJ transcripts
In humans, there are three in-frame AUG codons positioned upstream of the hydrophobic region (Fig 1). In previous experiments, the efficiency of each of the human initiator codons was tested in reporter constructs [20]. Briefly, the outcomes were (1) when all three AUGs were mutated no reporter activity was detected; (2) the efficiency of the first initiator (AUG13) was the weakest, and (3) the efficiency of the next two initiators (AUG190 and AUG355) appeared similar. Translation of the PTPRJ mRNA started predominantly at the first initiator in a favorable context (AUG190).
These results are in agreement with published RiboSeq data (Fig 2). The sequences encoded by the first exon of the human PTPRJ do not support appreciable non-canonical translation initiation. The first initiator (AUG13), which is in a weak context, is not “tight”; it leaks scanning complexes downstream towards the second initiator. According to the profiles of the initiating ribosomes, the third initiator (AUG355) is not active.
Fig 2. Top: the three reading frames with initiating codons (green) and stop codons (red). The AUGs in reading frame 2 are marked. Middle: RiboSeq data, initiating ribosomes. The positions of the Arg-cluster (yellow) and the region of nonpolar amino acids (h-region, grey) are indicated. Bottom: the RefSeq gene, exon 1.
The mammalian PTPRJ transcripts code for signal peptides with N-terminal extension
The N-termini of proteins targeted for secretion or membrane integration usually harbor a short amino acid sequence, the signal peptide, which is instrumental for the translocation of the proteins into the ER membrane. In most cases, the signal peptide is 20–25 amino acids long. It contains a positively charged N-terminal region (1–5 residues), a hydrophobic region (h-region) (7–15 residues), and a signal-peptidase recognition site (1–5 residues) [25].
In several proteins, however, the signal peptide is N-terminally extended and the hydrophobic region is far from the N-terminus [5]. Recent experiments showed that in Plasmodium falciparum the N-terminal extension of the signal peptides is irrelevant for their function [26].
The signal peptides of the mammalian PTPRJ proteins have an N-terminal extension (Fig 1). In marsupials, the signal peptides are 94–103 residues long (S4 Fig). In the platypus (monotremes), the PTPRJ mRNA encodes two possible signal peptides. Translation launched from the first AUG results in a long signal peptide of 76 amino acids (S6 Fig). In placentals, there are three initiating AUG codons. In humans, according to the profiles of the initiating ribosomes, the third initiator (AUG355) is not active (Fig 2). Correspondingly, the first and the second (presumably preferred) initiators in placental mammals produce N-terminally extended signal peptides composed of 147–150 and 87–92 amino acid residues, respectively (S1 and S2 Figs).
The arginine-rich sequence in the N-terminally extended signal peptides is highly conserved
The human PTPRJ transcripts encode an extended signal peptide, which contains repeated arginine residues (RRTGWRRRRRRRR). Earlier it was demonstrated that the arginine cluster attenuates translation [20]. A comparison between mammalian orthologs revealed the presence of a similar sequence in the extended signal peptides of all mammals (Fig 3). Differences are minor. In placentals, the run of arginine residues is interrupted by three amino acids (TGW/G). In marsupials, it is interrupted by two amino acids (S/TW). In platypus, the arginine-rich cluster spans 15 amino acids with one interruption (G). The arginine repeats in mammals are positioned 11–12 residues downstream of the initiating methionine.
Fig 3. Placentals and marsupials: consensus sequences; monotremes: platypus (see S2, S4 and S6 Figs). The initiating Met residues (green), the conserved Arg residues (yellow) and the intervening amino acids (grey) are marked.
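A cluster of the kind described above (runs of arginine separated by a short interruption) can be located with a simple pattern search. The sketch below is only an illustration: the thresholds `min_arg` and `max_gap` are assumptions chosen to fit the clusters described in the text, and the flanking alanine residues in the toy peptide are placeholders, not PTPRJ sequence. Only the cluster RRTGWRRRRRRRR is taken from the source.

```python
import re


def find_arg_clusters(peptide: str, min_arg: int = 8, max_gap: int = 3):
    """Find Arg runs joined by gaps of at most `max_gap` non-Arg residues.

    Returns (0-based start, cluster sequence, number of Arg) for every
    cluster with at least `min_arg` arginines.
    """
    pattern = re.compile(r"R+(?:[^R]{1,%d}R+)*" % max_gap)
    clusters = []
    for match in pattern.finditer(peptide):
        segment = match.group()
        n_arg = segment.count("R")
        if n_arg >= min_arg:
            clusters.append((match.start(), segment, n_arg))
    return clusters


# Toy peptide: initiating Met, 11 placeholder residues, the human cluster,
# and 5 more placeholder residues.
toy_peptide = "M" + "A" * 11 + "RRTGWRRRRRRRR" + "A" * 5
print(find_arg_clusters(toy_peptide))
```

With the initiating Met at index 0, the reported start index gives the distance of the first arginine from the methionine, and the count for the human cluster is 10 arginines with a three-residue interruption.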
The conservation of the composition and location of the arginine-rich cluster emphasizes the functional importance of these features. In the absence of uORFs, this seems to be a necessary mechanism to down-regulate PTPRJ expression.
The mechanism of translation attenuation and potential biological significance
Previous experiments showed that: (1) the translation attenuation is not due to the presence of rare codons; (2) elimination of the repeated arginine residues by frame-shift mutations (plus and minus) is sufficient to increase PTPRJ expression [20].
Most likely, the inhibition of expression is due to the positive charge of the arginine residues. Stalling of ribosomes at positively charged residues, due to electrostatic interactions with the negatively charged exit tunnel was described in model experiments [4]. More recently RiboSeq data were interpreted to show that ribosomes in yeast and mammals stall at positively charged amino acids [27, 28]. The ribosome exit tunnel accommodates 30–40 amino acid residues [1, 2]. Therefore, it is reasonable to assume that in mammalian PTPRJ the translating ribosomes stall when the arginine residues of the nascent chain are in the exit tunnel. The result is translation inhibition. The high degree of conservation of this inhibitory sequence is a strong indication of its functional significance.
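The argument that the arginine residues accumulate inside the exit tunnel can be illustrated with a crude charge count. The sketch below is not a model of ribosome stalling: it assumes a tunnel holding the last 30 residues of the nascent chain (the lower bound quoted above), counts Arg and Lys as +1 and Asp and Glu as -1, ignores histidine, and uses placeholder alanines around the human cluster. The function name `tunnel_charge_profile` is hypothetical.

```python
def tunnel_charge_profile(peptide: str, tunnel_len: int = 30) -> list[int]:
    """Net charge of the residues inside the tunnel after each elongation step.

    Element n-1 is the charge of the last `tunnel_len` residues once n
    residues have been synthesized.
    """
    charge = {"R": 1, "K": 1, "D": -1, "E": -1}
    profile = []
    for n in range(1, len(peptide) + 1):
        window = peptide[max(0, n - tunnel_len):n]
        profile.append(sum(charge.get(aa, 0) for aa in window))
    return profile


# Toy nascent chain: Met, 11 placeholders, the human cluster, 40 placeholders.
toy_chain = "M" + "A" * 11 + "RRTGWRRRRRRRR" + "A" * 40
profile = tunnel_charge_profile(toy_chain)
print(max(profile), profile.index(max(profile)) + 1)
```

In this toy chain the count peaks once the whole cluster has been synthesized and falls back to zero only after the cluster has left the assumed 30-residue window.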
PTPRJ has numerous cellular substrates, such as RTKs, Src-family kinases, and others. It has been implicated in the regulation of a wide range of cellular functions. Enzymatic studies of the PTPRJ phosphatase domain revealed promiscuity with respect to substrate specificity and a very high intrinsic activity [29]. It therefore appears plausible that high levels of PTPRJ protein may disturb cellular functions or may even be toxic. Attenuation of PTPRJ translation by the signal peptide features described here may serve to prevent toxic effects and to allow fine-tuning of expression at the transcriptional level.
Other transcripts encoding N-terminally extended signal peptides may use a similar cis-acting mechanism to attenuate expression. An analysis of the extended signal peptides in the precursors of the human E3 ubiquitin-protein ligase ZNRF3 and of NOTCH3 revealed the presence of arginine clusters and stretches of proline residues. Both structures may cause ribosome pausing as the nascent chain is synthesized (Fig 4). In these cases, the capacity of the nascent chains to throttle translation remains to be elucidated.
Fig 4. The initiating Met residues (green), the Arg-cluster (yellow), the consecutive Pro residues (underlined) and the nonpolar signal sequences (grey) are shown. The signal peptidase cleavage site (▼) was predicted in silico by the SignalP 5.0 Server.
Supporting information
S1 Fig. Alignment of the 5′ end of the PTPRJ transcripts encoding the extended signal peptides in placental mammals.
https://doi.org/10.1371/journal.pone.0240498.s001
(PDF)
S2 Fig. Alignment of the extended signal peptides of PTPRJ in placental mammals.
https://doi.org/10.1371/journal.pone.0240498.s002
(PDF)
S3 Fig. Alignment of the 5′ end of the PTPRJ transcripts encoding the extended signal peptides in marsupials.
https://doi.org/10.1371/journal.pone.0240498.s003
(PDF)
S4 Fig. Alignment of the extended signal peptides of PTPRJ in marsupials.
https://doi.org/10.1371/journal.pone.0240498.s004
(PDF)
S5 Fig. The 5′ end region of the PTPRJ mRNA in platypus encoding the extended signal peptide.
https://doi.org/10.1371/journal.pone.0240498.s005
(PDF)
S6 Fig. The extended signal peptide of PTPRJ in platypus.
https://doi.org/10.1371/journal.pone.0240498.s006
(PDF)
Acknowledgments
One of the authors (LK) highly appreciates the help of I. Stancheva for critically reading the manuscript and helpful suggestions.
References
- 1. Ito K, Chiba S. Arrest peptides: cis-acting modulators of translation. Ann Rev Biochem. 2013;82:171–202. pmid:23746254
- 2. Wilson DN, Arenz S, Beckmann R. Translation regulation via nascent polypeptide-mediated ribosome stalling. Curr Opinion Struct Biol. 2016;37:123–33.
- 3. Artieri CG, Fraser HB. Accounting for biases in riboprofiling data indicates a major role for proline in stalling translation. Genome Res. 2014;24(12):2011–21.
- 4. Lu J, Deutsch C. Electrostatics in the ribosomal tunnel modulate chain elongation rates. J Mol Biol. 2008;384(1):73–86. pmid:18822297
- 5. Hiss JA, Resch E, Schreiner A, Meissner M, Starzinski-Powitz A, Schneider G. Domain organization of long signal peptides of single-pass integral membrane proteins reveals multiple functional capacity. PLoS One. 2008;3(7):e2767.
- 6. Andersen JN, Del Vecchio RL, Kannan N, Gergel J, Neuwald AF, Tonks NK. Computational analysis of protein tyrosine phosphatases: practical guide to bioinformatics and data resources. Methods. 2005;35(1):90–114. pmid:15588990
- 7. Östman A, Yang Q, Tonks NK. Expression of DEP-1, a receptor-like protein-tyrosine-phosphatase, is enhanced with increasing cell density. Proc Nat Acad Sci USA. 1994;91(21):9680–4.
- 8. Autschbach F, Palou E, Mechtersheimer G, Rohr C, Pirotto F, Gassler N, et al. Expression of the membrane protein tyrosine phosphatase CD148 in human tissues. Tissue Antigens. 1999;54(5):485–98. pmid:10599888
- 9. Ruivenkamp CA, van Wezel T, Zanon C, Stassen AP, Vlcek C, Csikós T, et al. Ptprj is a candidate for the mouse colon-cancer susceptibility locus Scc1 and is frequently deleted in human cancers. Nat Gen. 2002;31(3):295–300.
- 10. Tarcic G, Boguslavsky SK, Wakim J, Kiuchi T, Liu A, Reinitz F, et al. An unbiased screen identifies DEP-1 tumor suppressor as a phosphatase controlling EGFR endocytosis. Curr Biol. 2009;19(21):1788–98.
- 11. Petermann A, Haase D, Wetzel A, Balavenkatraman KK, Tenev T, Gührs KH, et al. Loss of the protein‐tyrosine phosphatase DEP‐1/PTPRJ drives meningioma cell motility. Brain Pathol. 2011;21(4):405–18.
- 12. Arora D, Stopp S, Böhmer SA, Schons J, Godfrey R, Masson K, et al. Protein-tyrosine phosphatase DEP-1 controls receptor tyrosine kinase FLT3 signaling. J Biol Chem. 2011;286(13):10918–29. pmid:21262971
- 13. Shintani T, Higashi S, Takeuchi Y, Gaudio E, Trapasso F, Fusco A, et al. The R3 receptor-like protein tyrosine phosphatase subfamily inhibits insulin signalling by dephosphorylating the insulin receptor at specific sites. J Biochem. 2015;158(3):235–43.
- 14. Krüger J, Brachs S, Trappiel M, Kintscher U, Meyborg H, Wellnhofer E, et al. Enhanced insulin signaling in density-enhanced phosphatase-1 (DEP-1) knockout mice. Mol Metabolism. 2015;4(4):325–36.
- 15. Shintani T, Higashi S, Suzuki R, Takeuchi Y, Ikaga R, Yamazaki T, et al. PTPRJ inhibits leptin signaling, and induction of PTPRJ in the hypothalamus is a cause of the development of leptin resistance. Sci Rep. 2017;7(1):1–4.
- 16. Senis YA, Tomlinson MG, Ellison S, Mazharian A, Lim J, Zhao Y, et al. The tyrosine phosphatase CD148 is an essential positive regulator of platelet activation and thrombosis. Blood. 2009;113(20):4942–54.
- 17. Nagy Z, Mori J, Ivanova VS, Mazharian A, Senis YA. Interplay between the tyrosine kinases Chk and Csk and phosphatase PTPRJ is critical for regulating platelets in mice. Blood. 2020;135(18):1574–87.
- 18. Fournier P, Dussault S, Fusco A, Rivard A, Royal I. Tyrosine phosphatase PTPRJ/DEP-1 is an essential promoter of vascular permeability, angiogenesis, and tumor progression. Cancer Res. 2016;76(17):5080–91.
- 19. Katsumoto TR, Kudo M, Chen C, Sundaram A, Callahan EC, Zhu JW, et al. The phosphatase CD148 promotes airway hyperresponsiveness through SRC family kinases. J Clin Invest. 2013;123(5):2037–48. pmid:23543053
- 20. Karagyozov L, Godfrey R, Böhmer SA, Petermann A, Hölters S, Östman A, et al. The structure of the 5′-end of the protein-tyrosine phosphatase PTPRJ mRNA reveals a novel mechanism for translation attenuation. Nucl Acids Res. 2008;36(13):4443–53.
- 21. Pedersen AG, Nielsen H. Neural network prediction of translation initiation sites in eukaryotes: perspectives for EST and genome analysis. Proc Int Conf Intell Syst Mol Biol. 1997;5:226–233.
- 22. Armenteros JJ, Tsirigos KD, Sønderby CK, Petersen TN, Winther O, Brunak S, et al. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat Biotechnol. 2019;37(4):420–3. pmid:30778233
- 23. Michel AM, Fox G, Kiran AM, De Bo C, O’Connor PB, Heaphy SM, et al. GWIPS-viz: development of a ribo-seq genome browser. Nucl Acids Res. 2014;42(D1):D859–64.
- 24. Kozak M. Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell. 1986;44(2):283–92.
- 25. von Heijne G. The signal peptide. J Membrane Biol. 1990;115(3):195–201.
- 26. Meyer C, Barniol L, Hiss JA, Przyborski JM. The N-terminal extension of the P. falciparum GBP130 signal peptide is irrelevant for signal sequence function. Int J Med Microbiol. 2018;308(1):3–12.
- 27. Charneski CA, Hurst LD. Positively charged residues are the major determinants of ribosomal velocity. PLoS Biol. 2013;11(3):e1001508.
- 28. Sabi R, Tuller T. A comparative genomics study on the effect of individual amino acids on ribosome stalling. BMC Genomics. 2015;16(S10):S5.
- 29. Barr AJ, Ugochukwu E, Lee WH, King ON, Filippakopoulos P, Alfano I, et al. Large-scale structural analysis of the classical human protein tyrosine phosphatome. Cell. 2009;136(2):352–63.