Alternative Transcripts and 3′UTR Elements Govern the Incorporation of Selenocysteine into Selenoprotein S

Selenoprotein S (SelS) is a 189 amino acid trans-membrane protein that plays an important yet undefined role in the unfolded protein response. It has been proposed that SelS may function as a reductase, with the penultimate selenocysteine (Sec188) residue participating in a selenosulfide bond with cysteine (Cys174). Cotranslational incorporation of Sec into SelS depends on the recoding of the UGA codon, which requires a Selenocysteine Insertion Sequence (SECIS) element in the 3′UTR of the transcript. Here we identify multiple mechanisms that regulate the expression of SelS. The human SelS gene encodes two transcripts (variants 1 and 2), which differ in their 3′UTR sequences due to an alternative splicing event that removes the SECIS element from the variant 1 transcript. Both transcripts are widely expressed in human cell lines, with the SECIS-containing variant 2 mRNA being more abundant. In vitro experiments demonstrate that the variant 1 3′UTR does not allow readthrough of the UGA/Sec codon. Thus, this transcript would produce a truncated protein that does not contain Sec and cannot make the selenosulfide bond. While the variant 2 3′UTR does support Sec insertion, its activity is weak. Bioinformatic analysis revealed two highly conserved stem-loop structures, one in the proximal part of the variant 2 3′UTR and the other immediately downstream of the SECIS element. The proximal stem-loop promotes Sec insertion in the native context but not when positioned far from the UGA/Sec codon in a heterologous mRNA. In contrast, the 140 nucleotides downstream of the SECIS element inhibit Sec insertion. We also show that endogenous SelS is enriched at perinuclear speckles, in addition to its known localization in the endoplasmic reticulum. Our results suggest the expression of endogenous SelS is more complex than previously appreciated, which has implications for past and future studies on the function of this protein.


Introduction
Selenoproteins are a diverse family of proteins characterized by the presence of selenocysteine (Sec), the 21st amino acid. The incorporation of Sec into a growing peptide chain is unusual, as Sec is encoded by the UGA stop codon. Given the dual nature of this codon, specialized machinery is necessary to recode the UGA as Sec. Within the selenoprotein mRNA, a stem-loop structure called the Sec Insertion Sequence (SECIS) is required for recoding. In eukaryotes, the SECIS is found within the 3′ untranslated region (UTR) [1]. Several dedicated protein factors are also necessary for Sec insertion. SECIS-binding protein 2 (SBP2) interacts with a core motif in the SECIS element and is believed to facilitate interactions between the selenoprotein mRNA and the recoding machinery [2,3,4]. The binding of SBP2 to the SECIS is required for Sec insertion to occur, and mutations that disrupt this interaction can lead to human disease. Many proteins are involved in producing the Sec-tRNA(Sec), which is non-canonical in both its synthesis and final structure [5,6]. Sec insertion also requires a dedicated elongation factor, EFSec [7,8], that recognizes the Sec-charged tRNA. Additional proteins have been shown to promote recoding or to regulate synthesis of specific selenoproteins, including ribosomal protein L30 [9], nucleolin [10] and eIF4a3 [11]. For a more thorough explanation, refer to reviews of selenoprotein synthesis [12,13].
While the Sec incorporation machinery is widely expressed, the types of selenoproteins produced only partially overlap between species [14,15]. The human selenoproteome consists of 25 family members [16]. Many selenoproteins are oxidoreductases that contain Sec at the active site. However, approximately half of the human selenoproteins are without a known function and unanticipated roles for selenoproteins are continually being discovered, as studies into the selenoproteome expand. One such example is Selenoprotein S (SelS). SelS was first identified in a screen to find genes that were differentially expressed in a diabetic animal model [17], although it was not yet recognized as a selenoprotein. It was shown to be a glucose-regulated protein, with its expression inversely proportional to circulating glucose and insulin levels [17,18]. Recently, SelS was identified as one of the most widespread eukaryotic selenoproteins based on comparative genomics [19]. It was grouped in a protein family with Selenoprotein K, based on protein localization, domain organization and placement of Sec near the carboxy-terminus. The combination of the prevalence and conservation of SelS suggests that this protein performs an important biological function. The ability of SelS to act as a reductase was demonstrated in vitro [20], but an enzymatic activity for this protein has not been identified in cells. However, SelS was discovered to play a role in the unfolded protein response (UPR) [21]. The UPR refers to a group of conserved signaling pathways that are activated in response to the accumulation of unfolded proteins within the ER. The purpose of the UPR is to restore the ability of the ER to process its client proteins, both through the upregulation of molecular chaperones to increase folding capacity and the removal of misfolded proteins to reduce demand (ER-associated degradation, ERAD). 
SelS is involved in ERAD as part of a multiprotein complex that removes misfolded proteins from the ER to the cytoplasm for degradation [21]. SelS is also known as Valosin-containing protein (VCP)-Interacting Membrane Protein (VIMP) due to its interaction with VCP in this ERAD complex. The expression of SelS is upregulated under conditions of ER stress [18], presumably to help increase the capacity of a cell to manage misfolded proteins. The UPR is a crucial cellular pathway as failure to resolve ER stress will cause the cell to undergo apoptosis. Studies in multiple systems have shown that overexpression of SelS has protective effects against ER stress [22,23,24], while knockdown of SelS sensitizes cells to ER stress and apoptosis [22,23,25,26].
Endogenously, this increase in SelS expression is facilitated by the presence of an ER-stress element (ERSE) in its promoter [27]. A naturally occurring point mutation within the ERSE of SelS led to the discovery of a second physiological function. Patients with this mutation were unable to upregulate SelS expression under ER stress conditions [28]. These patients had increased inflammation as determined by plasma levels of IL-6, IL-1β and TNF-α, three acute phase cytokines [28]. This inverse relationship between the expression of SelS and acute phase cytokines suggests that SelS has a role in the negative regulation of inflammation. Furthermore, siRNA knockdown of SelS in macrophage cells led to increased release of IL-6 and TNF-α [28], while treatment of HepG2 cells with cytokines increased SelS expression [27]. This suggests the existence of a regulatory feedback loop to control inflammatory processes. An additional line of evidence linking SelS to inflammation is its direct interaction with serum amyloid A (SAA) [17], an acute-phase inflammatory response protein, though the significance of this interaction is unknown.
ER stress and inflammation are now known to underlie many human diseases with examples that include diabetes, metabolic syndrome disorders, atherosclerosis, Alzheimer's Disease, Parkinson's Disease and non-alcoholic fatty liver disease [29,30,31]. Understanding the molecular mechanisms that contribute to the development and resolution of ER stress and inflammatory processes will have wide ranging contributions to human health. Given its intriguing position at the crossroads of these two processes, we were interested in investigating the expression and regulation of SelS.
In this study we show that only one of the human SelS mRNA variants can encode a selenoprotein of 189 amino acids. The other transcript encodes a truncated protein of 187 amino acids that lacks selenocysteine. Additionally, elements in the 3′UTR of the selenoprotein-encoding mRNA positively and negatively influence Sec insertion into SelS, and provide another mechanism to regulate the production of these two protein isoforms. The ability of 3′UTR elements to influence the incorporation of Sec underscores the importance of context when examining functional RNA elements such as the SECIS. We also show that, in addition to being an ER-resident protein, endogenous SelS is enriched at perinuclear speckles adjacent to the Golgi, a localization that was previously unknown.

RNA and protein sequences
All sequences were obtained using NCBI and Ensembl databases. The accession numbers for all sequences are listed in Table S1. For the RNAs, only sequences with complete 3′UTR reads were included. The presence of a SECIS element within the 3′UTR was detected with SECISearch (http://genomics.unl.edu/SECISearch.html). Most of the SelS protein sequences did not include the Sec residue. After confirming the presence of the SECIS element, the protein sequences were manually curated to include the last two residues.
The SelS constructs with V5 epitope tags used for in vitro translation/immunoprecipitation were generated by PCR amplifying the ORF of SelS without the stop codon using the common forward primer listed above and the SelS minus stop reverse primer 5′-CACTTCGAAGCCTCATCCGCCAGATGA. The PCR product was digested and subcloned into the KpnI/SfuI sites of pcDNA3.1mycHISA (Invitrogen), generating SelSmycHIS. This was subsequently digested with SfuI and AgeI and ligated with the SfuI/AgeI insert from pcDNA3.1V5His (Invitrogen), which effectively switched the epitope tag from myc to V5. The 3′UTR sequences were added between the AgeI and PmeI sites, replacing the HIS tag. Sec-V5-v2 WT contains the full-length 3′UTR, while Sec-V5-v2ΔStem lacks the first 60 nucleotides of the 3′UTR. The forward primers used for generating the 3′UTR PCR products were V2-AgeI 5′-CGGACCGGTTAAGAATCTTGTTAGTGT and ΔSTEM-AgeI 5′-GGCACCGGTTAAGCCTTACGCACGCTTTTC, and the reverse primer was 5′-CGCGTTTAAACGTAATAAAAAGCTAT. The cysteine mutant version Cys-V5-v2 was generated using DpnI site-directed mutagenesis with the following primers: 5′-CCCGTCATCTGGCGGATGTGGCTTCGAAGGTAAGCC and 5′-GGCTTACCTTCGAAGCCACATCCGCCAGATGACGGG, where the underlined nucleotide is the altered nucleotide.
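Site-directed mutagenesis of this kind uses a pair of primers that are exact reverse complements of each other. The following minimal sketch checks that property for the two mutagenesis primers listed above. It assumes that the hyphen inside the first primer as printed is a line-break artifact rather than part of the sequence; the function name is illustrative only.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Mutagenesis primers from the text (hyphen removed; assumed to be an artifact).
forward = "CCCGTCATCTGGCGGATGTGGCTTCGAAGGTAAGCC"
reverse = "GGCTTACCTTCGAAGCCACATCCGCCAGATGACGGG"

# True when the two primers anneal to the same site on opposite strands.
primers_are_complementary = reverse_complement(forward) == reverse
print(primers_are_complementary)
```

A check like this is a quick way to catch transcription errors in primer lists before ordering oligonucleotides.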

Cell culture
HepG2 (human hepatoma), HEK293 (human embryonic kidney), and U251 (human glioma) were obtained from ATCC. All cells were cultured in a monolayer in DMEM with 1 g/L glucose and 10% FBS, in 5% CO2 at 37 °C. Cell pellets from T47D, SW480, HT29, HCT116 and HCT8 cell lines used for RNA extraction were a gift from A. Chaudhury and were all originally obtained from ATCC.
Cells were seeded in 6-well cell culture plates at 2.5 × 10^5 (HEK293) or 4 × 10^5 (HepG2). HEK293 treatments were performed 16 hours later with 50 nM siRNA and Dharmafect 1 transfection reagent, according to manufacturer's instructions (Dharmacon). HepG2 cells were treated with 20 nM siRNA and Dharmafect 4 transfection reagent. After 72 hours the cells were harvested for protein or were fixed for immunofluorescence (see below). Total protein lysates were obtained by washing the cells twice with phosphate buffered saline (PBS), scraping the wells, and collecting the samples in a microfuge tube. After centrifugation, the pellets were resuspended in 20 mM Tris, pH 7.5, 1% NP-40, 150 mM NaCl, 5 mM EDTA, 1 mM phenylmethylsulfonyl fluoride and HALT protease inhibitor (Pierce). The lysates were incubated for 30 minutes on ice with occasional mixing, and then centrifuged for 15 minutes at 21000 rpm in a refrigerated centrifuge. Lysates were stored at −20 °C until analyzed.

qRT-PCR
Cell pellets were obtained for each of the listed cell lines and RNA was extracted using Trizol (Invitrogen), according to manufacturer's instructions. The RNA was checked for quantity and quality using spectrophotometry and agarose gel electrophoresis. For every sample, 2 μg of RNA and random hexamer priming were used for reverse transcription using the Taqman Reverse Transcription Reagents kit (Applied Biosystems). To obtain an optimized cDNA template concentration for use in quantitative Real-Time PCR (qRT-PCR), cDNA was tested in a standard curve experiment using a 10-fold dilution series over 5 points, starting from the most concentrated cDNA sample. Based on these results, 2 μl of a 1:10 dilution of cDNA template was used for qRT-PCR.
Primers in the open reading frame (ORF) were used to detect the total amount of SelS (forward: 5′-CGG TCA TGG AAC GCC AAG-3′, and reverse: 5′-GCG GAA AGC TTC TGA AAG AC-3′). Variant-specific products were detected using a common forward primer in the ORF and a 3′UTR-specific reverse primer for each variant. For primer efficiency testing, a standard curve experiment consisting of 3 replicates of cDNA in a 10-fold dilution series using identical primer concentrations (250 nM/reaction) was performed. The primer efficiency for each set was calculated from the slope of the standard curve's linear regression line using the formula $E = 10^{-1/\mathrm{slope}} - 1$.
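The efficiency calculation can be sketched as follows. This is a minimal illustration, not the StepOne software's implementation: the Ct values are hypothetical, and the slope is obtained by ordinary least squares of Ct against log10 of the relative template amount.

```python
def standard_curve_slope(log10_amounts, ct_values):
    """Least-squares slope of Ct versus log10(relative template amount)."""
    n = len(log10_amounts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_amounts, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    return sxy / sxx


def primer_efficiency(slope):
    """E = 10^(-1/slope) - 1; E = 1.0 means the product doubles every cycle."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical 5-point, 10-fold dilution series (values are illustrative only).
log10_amounts = [0, -1, -2, -3, -4]
ct_values = [18.0, 21.4, 24.8, 28.2, 31.6]

slope = standard_curve_slope(log10_amounts, ct_values)
efficiency = primer_efficiency(slope)
print(f"slope = {slope:.2f}, efficiency = {efficiency:.1%}")
```

A slope of about −3.32 corresponds to perfect doubling (E = 1), so shallower or steeper slopes indicate over- or under-estimated efficiency, respectively.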
qRT-PCR reactions were performed in triplicate with 2X Fast SYBR Green Master Mix (Applied Biosystems) and set up in MicroAmp Fast Optical 96-well reaction plates with optical caps (Applied Biosystems). Control reactions included no-reverse transcriptase controls for each cDNA template and no template controls (NTCs) for each primer set on each plate. Plates were run in a StepOnePlus Real-Time PCR System (Applied Biosystems), using conditions suggested by the Fast SYBR Green protocol (enzyme activation step: 95 °C for 20 sec for 1 cycle; denature step: 95 °C for 3 sec; anneal/extend step: 60 °C for 30 sec; denature and anneal/extend steps repeated for 40 cycles). Data were analyzed using StepOne Software v2.1 (Applied Biosystems).

Luciferase-based in vitro Sec Insertion Assay
Luciferase reporter plasmid DNAs were linearized and used as templates for in vitro transcription using T7 RNA polymerase (Ribomax T7; Promega). In vitro translation reactions were assembled for a total volume of 25 μl, including 70% rabbit reticulocyte lysate (Promega), complete amino acid mixture, RNase Inhibitor and 100 ng of luc/UGA258 mRNA in the presence or absence of purified recombinant SBP2 CT [11]. The reactions were incubated at 30 °C for 30 min. Each reaction was tested in triplicate by adding 2.5 μl of the translation mixture to 50 μl of luciferase substrate, using 5 second measurements in a 1420 Perkin Elmer Victor3 multi-label counter. The results are displayed as the mean from triplicate experiments with error bars that indicate one standard deviation, as calculated in Excel.
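For readers who want to reproduce the summary statistics outside a spreadsheet, the sketch below computes the mean and one standard deviation for a set of triplicate readings. It assumes the sample standard deviation (n − 1 denominator) was used, which the text does not specify, and the readings are hypothetical.

```python
from statistics import mean, stdev


def summarize(readings):
    """Return (mean, sample standard deviation) for replicate readings."""
    return mean(readings), stdev(readings)


# Hypothetical luciferase readings (relative light units) for one reaction.
triplicate = [10.0, 12.0, 14.0]
average, sd = summarize(triplicate)
print(f"mean = {average:.1f}, SD = {sd:.1f}")
```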

V5-surrogate Sec insertion assay
The Sec-V5-v2 and Cys-V5 plasmids were linearized and used as templates for in vitro transcription using T7 RNA polymerase (Ribomax T7; Promega). In vitro translation reactions were assembled for a total volume of μl, as described above. The reactions were incubated at 30 °C for 45 min. The reactions were stopped on ice and diluted to 250 μl with 50 mM Tris-HCl, 150 mM NaCl, and 1 μl of anti-SelS Prestige antibody (HPA010025, SIGMA) was added to the reactions. The mixture was incubated for 1 hour at 4 °C on a rotating mixer. Protein G coupled Dynabeads (Life Technologies) were used to immunoprecipitate the protein-Ab complexes. The samples were washed 3 times with 50 mM Tris-HCl, 150 mM NaCl and eluted into 2x SDS-PAGE buffer by heating to 95 °C for 10 minutes. The samples were separated by SDS-PAGE and transferred to PVDF membranes. Western blotting was performed against the V5 tag using a 1:4000 dilution of the anti-V5 antibody (R96025, Life Technologies) and a 1:10,000 dilution of anti-mouse HRP (Jackson Immunochemicals). Blots were developed as described above.

Immunofluorescence
Cells grown on coverslips were fixed with ice-cold methanol for 5 minutes or 4% paraformaldehyde for 15 minutes. After washing with PBS, cells were permeabilized with 0.2% Triton X-100 in PBS for 5 minutes at room temperature with gentle mixing. Cells were washed twice with PBS and then incubated with ImageIt Signal Enhancer (Life Technologies) for 30 minutes. The primary antibodies were added for one hour at room temperature, and the cells were then washed twice in PBS and incubated with the secondary antibody for one hour. After final washing, the samples were mounted onto slides using Prolong Gold Antifade with DAPI (Life Technologies) for standard immunofluorescence or Vectashield (Vector Labs) for confocal microscopy. The primary antibodies were α-SelS Prestige (Sigma, HPA010025) and α-golgin p97 clone CDF4 (Life Technologies, A21270). The secondary antibodies were Alexa Fluor 488 goat α-rabbit IgG and Alexa Fluor 568 goat α-mouse IgG (Life Technologies, A11034 and A11031, respectively). Images for standard immunofluorescence were collected on a Leica DM5500B upright microscope (Leica Microsystems, GmbH) using ImagePro Plus software (MediaCybernetics). Confocal images were captured with a Leica TCS-SP2 Spectral Laser Scanning Confocal Microscope using Leica Confocal Software (Leica Microsystems, GmbH).

SelS has two mRNA variants in humans
SelS is a highly conserved, single-pass transmembrane protein of 189 aa that is primarily found in the endoplasmic reticulum (ER) but is also located on the cell surface. The transmembrane domain is oriented such that the small amino-terminal domain is within the ER lumen, while the larger carboxy-terminal domain is in the cytoplasm. Sec is the penultimate residue within the protein, at position 188 (Figure 1A).
Database analysis revealed that human SelS is encoded by two mRNA transcripts: variant 1 (NM_203472.1) and variant 2 (NM_018445.4). These transcripts differ in their 3′UTR sequences due to a splicing event in variant 1 that occurs eight nucleotides into the 3′UTR (Figure 1B). Despite this difference, the two transcripts are often annotated as producing the same protein, as there are no apparent alterations to their coding regions. However, the splicing event in variant 1 excises the SECIS element, which is absolutely required for Sec insertion. Thus, these two transcripts should not be capable of producing the same protein. The variant 1 transcript would encode a 187 aa protein (without Sec), due to premature termination at the UGA codon, while the variant 2 transcript can produce the 189 aa Sec-containing protein.
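The consequence of losing the SECIS element can be illustrated with a toy translation model. The sketch below is a deliberate simplification: it uses a short illustrative ORF rather than the real SelS sequence, a codon table restricted to the codons in that ORF, and it treats recoding as all-or-nothing, whereas actual Sec insertion is only partially efficient.

```python
# Minimal codon table covering only the codons in the toy ORF below.
CODON_TABLE = {"AUG": "M", "CCG": "P", "UCA": "S", "UCU": "S",
               "GGC": "G", "GGA": "G"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna: str, secis_present: bool) -> str:
    """Translate an ORF; read UGA as Sec ('U') only if a SECIS is present."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and secis_present:
            peptide.append("U")  # recoded as selenocysteine
            continue
        if codon in STOP_CODONS:
            break  # termination
        peptide.append(CODON_TABLE[codon])
    return "".join(peptide)


# Toy ORF with an in-frame UGA two codons before the UAA stop.
toy_orf = "AUGCCGUCAUCUGGCGGAUGAGGCUAA"

with_secis = translate(toy_orf, secis_present=True)      # variant 2-like
without_secis = translate(toy_orf, secis_present=False)  # variant 1-like
print(with_secis, without_secis)
```

In this model the SECIS-lacking message yields a product two residues shorter, mirroring the predicted 187 aa versus 189 aa proteins.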
We were interested in determining whether both transcripts were expressed in different cell lines. RNA samples were isolated from human cell lines derived from liver (HepG2), kidney (HEK293), colon (SW480, HT29, HCT116, HCT8), breast (T47D) and glioma (U251MG). Quantitative RT-PCR was used to examine total SelS levels using primers in the coding region, while a common forward primer and a 3′UTR-specific reverse primer were used to quantify the individual variants. Each RNA sample was tested for the total SelS transcript levels, as well as the relative levels of the variant 1 and variant 2 transcripts. The SECIS-containing variant 2 transcript was predominant in all samples tested (data not shown). However, as shown in Figure 1C, the variant 1 mRNA was identified in every sample, representing 5-16% of the population of SelS transcripts across the various cell lines. In addition, the variant without the SECIS element has been detected in other primates including chimps, macaques and gibbons (Figure S1). The placement of the splice donor is preserved in other mammalian sequences (Figure S2); however, there are not sufficient EST or transcriptome data to determine whether two SelS mRNA variants are expressed in other species.
We also examined the contribution of the two variants to SelS protein production using siRNA knockdown in HEK293 cells. Total SelS mRNA was targeted using two different siRNAs against the coding region of SelS, while the transcript variants were individually targeted with siRNAs designed against the 3′UTRs of each mRNA. A robust knockdown of SelS protein levels was achieved with both coding region siRNAs (80-85%), as well as the variant 2-specific siRNA (90%) when compared to treatment with a non-targeting siRNA (Figure 1D). Only a modest reduction in SelS protein was observed with the variant 1-specific siRNA (12%). These results are in good agreement with the quantitative RT-PCR results with respect to the relative abundance of the mRNA variants. Similar siRNA knockdown experiments in U251 and HepG2 cells confirmed that variant 2 is the predominant transcript in these cell lines (unpublished observations).

Only the SelS variant 2 transcript encodes a selenoprotein
As previously mentioned, the 3′UTR of the variant 1 mRNA does not contain an identifiable SECIS element. This implies that the SelS variant 1 transcript does not encode a selenoprotein, unless a highly unusual SECIS element is present. Both transcript variants are capable of producing a SelS protein at similar levels, whether expression was examined by in vitro translation or transient transfection into cells (Figure S3). This implies that the two UTRs do not differentially affect mRNA stability or normal protein translation. As the two predicted proteins differ by 2 amino acids, they cannot be distinguished by size. We initially wished to utilize a mass spectrometry approach to discriminate between these two SelS proteins. Protein samples from untransfected cells and from cells transiently transfected to overexpress SelS were examined. However, while several SelS peptides were successfully detected, the carboxy-terminal peptide was never included in the set of identified peptides, precluding mass spectrometry as a viable option. Given the technical difficulties with this approach, the ability of the two different 3′UTRs to support Sec insertion was examined using an established in vitro recoding assay. This system has been previously validated to be SECIS-dependent and codon-specific (i.e. not generalized read-through) [32]. Briefly, the assay uses a luciferase reporter construct that has been modified to contain a UGA codon at position 258, rendering expression of the luciferase protein dependent on Sec insertion (lucUGA258). In order to compare the ability of the two 3′UTRs to support UGA recoding, the complete 3′UTR from the SelS variant 1 mRNA (615 nucleotides) or variant 2 mRNA (573 nucleotides) was appended to the modified luciferase reporter. The SelS SECIS element (120 nucleotides) was used as a positive control. The three reporters were in vitro transcribed and translated in a rabbit reticulocyte lysate (RRL) system.
Equal femtomoles of RNA were used in the reactions to account for differences in transcript size. As RRL lacks sufficient SBP2 to promote Sec incorporation, each reaction was supplemented with recombinant protein corresponding to the carboxy-terminal half of SBP2 (SBP2 CT), which contains all currently known activities of the protein and supports Sec insertion. The translation products were then assayed for luciferase activity.
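Matching reactions by moles rather than by mass means that longer transcripts need proportionally more nanograms of RNA. The sketch below shows the conversion using the common approximation that the molecular weight of single-stranded RNA is about 320.5 g/mol per nucleotide plus 159. The reporter body length and the target amount are hypothetical placeholders, since the text does not state them; only the three 3′ sequence lengths come from the text.

```python
def rna_molecular_weight(length_nt: int) -> float:
    """Approximate molecular weight (g/mol) of a single-stranded RNA."""
    return length_nt * 320.5 + 159.0


def ng_for_fmol(length_nt: int, fmol: float) -> float:
    """Mass in ng that contains `fmol` femtomoles of an RNA of given length."""
    # 1 fmol = 1e-15 mol and 1 g = 1e9 ng, hence the factor of 1e-6.
    return fmol * rna_molecular_weight(length_nt) * 1e-6


REPORTER_BODY_NT = 1700   # hypothetical length of the reporter without a 3'UTR
TARGET_FMOL = 100.0       # hypothetical amount per reaction

utr_lengths = {"variant 1 3'UTR": 615, "variant 2 3'UTR": 573, "SECIS only": 120}
for name, utr_nt in utr_lengths.items():
    mass = ng_for_fmol(REPORTER_BODY_NT + utr_nt, TARGET_FMOL)
    print(f"{name}: {mass:.1f} ng")
```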
In the absence of SBP2, very little luciferase is detected, which is due to low levels of non-specific readthrough. As shown in Figure 2, addition of recombinant SBP2 results in the induction of luciferase activity for the variant 2 and SECIS-only constructs (7-fold and 21-fold, respectively). In contrast, the variant 1 construct does not respond to SBP2, confirming that this 3′UTR does not support UGA recoding activity. This is not due to general effects of the variant 1 3′UTR on mRNA translation, as cysteine-containing versions of all the reporters were expressed at equivalent levels (data not shown). Thus, only one of the SelS mRNA variants is capable of producing a selenoprotein.

Conserved elements in the 3′UTR of SelS
Our results show that the SelS SECIS element functions more efficiently in isolation than when found in the context of its natural 3′UTR. This suggests that other sequences are influencing the SECIS activity. We examined the sequence of the human SelS variant 2 3′UTR to look for known sequence motifs as well as potential RNA structures. Initial scanning of the sequence revealed an AU-rich region immediately downstream of the SelS SECIS element, as well as an A-rich region further downstream. No other RNA motifs were identified based on primary sequence. AU-rich elements (AREs) are well known to function in post-transcriptional gene regulation and have varied transcript-specific effects on mRNA stability and/or translational control. When SelS was used to query the AU-rich element-containing mRNA database (ARED: brp.kfshrc.edu.sa/ARED) [33], the region we identified in SelS was categorized as an ARE.
After the sequence-based searches, RNA-folding prediction programs were used to identify potential structural elements in the 3′UTR of variant 2 mRNA. First, the position of the SECIS element was determined using SECISearch 2.19 (http://genome.unl.edu/SECISearch.html) [16]. RNA-folding analysis of the entire human SelS variant 2 3′UTR using the RNAfold program from the Vienna RNA Websuite (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi) [34] revealed the likelihood of two stem-loop structures within the 3′UTR (Figure 3) in addition to the SECIS element. The first stem-loop structure is located at the very beginning of the 3′UTR, and will be referred to as stem-loop 1 (SL1). This stem-loop begins three nucleotides into the 3′UTR and is situated tantalizingly close to the site of Sec insertion, in a position likely to influence recoding. The second predicted stem-loop (SL2) corresponds to the ARE identified by primary sequence analysis. This structure is predicted to form immediately downstream of the SECIS element, with the AU-rich sequence largely displayed in the loop region. AREs are often platforms for RNA-protein interactions [35]. The location of this ARE adjacent to the SECIS element makes it well placed to interfere with the SECIS.
In order to determine whether these predicted structures are conserved, a collection of available SelS sequences was assembled from NCBI and Ensembl databases. The SelS mRNA and protein sequences were obtained for as many species as possible, resulting in 32 mammalian sequences and 4 non-mammalian sequences (Table S1). Only those sequences with complete 3′UTRs were included and the presence of a SECIS element in each 3′UTR was confirmed using SECISearch. Notably, the corresponding SelS proteins were often mis-annotated in the databases, with the Sec residue absent in 22 of the 36 protein sequences. The 3′UTR sequences were then analyzed for conservation of SL1 and SL2.
Within the 3′UTR, there is very little conservation based on primary sequence outside of the SECIS element itself, even if only the mammalian sequences are analyzed. The AU-rich character of the sequence immediately downstream of the SECIS is preserved but is not identical. However, the results are different when the sequences are examined based on structural predictions instead of primary sequence. We performed an analysis of the first 50 nucleotides from each of the 3′UTRs in our collection to examine the potential structural conservation of SL1. First, the LocARNA server (http://rna.tbi.univie.ac.at/cgi-bin/LocARNA.cgi) was used to create a structural alignment of multiple RNA sequences. This output was then analyzed using the RNAalifold server (http://rna.tbi.univie.ac.at/cgi-bin/RNAalifold.cgi) to predict a consensus secondary structure for these aligned sequences. While there is some similarity across the mammalian sequences, inclusion of the non-mammalian sequences largely removes the primary sequence conservation without impacting the structural conservation of this region. Figure 4A shows the structure-annotated alignment generated by the RNAalifold program [36]. The color coding of the alignment reflects the sequence covariation of this region. A mutation on one side of an RNA helix will require a matching mutation on the other side of the helix to retain the structure. Therefore, the sequence is analyzed for the six typical base pair combinations that are found in RNA helices: GC, CG, AU, UA, GU and UG. The color indicates how many of the six base pair types occur at a given position across the set of sequences. A pale version of the color denotes that not all sequences in the set can make a certain base pair. SL1 displays many examples of compensatory mutations across the predicted stem region, with several positions using multiple different base pair types.
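The covariation idea behind the color coding can be sketched in a few lines. This is a simplified illustration of counting base pair types at a pair of aligned columns, not RNAalifold's actual scoring; the toy alignment and the function name are hypothetical.

```python
# The six base pair combinations typically found in RNA helices.
CANONICAL_PAIRS = {"GC", "CG", "AU", "UA", "GU", "UG"}


def pair_types_at(alignment, i, j):
    """For aligned columns i and j, return the set of canonical pair types
    observed and the number of sequences that cannot pair there."""
    observed = set()
    incompatible = 0
    for seq in alignment:
        pair = seq[i] + seq[j]
        if pair in CANONICAL_PAIRS:
            observed.add(pair)
        else:
            incompatible += 1
    return observed, incompatible


# Toy alignment of a 3-bp stem with a 3-nt loop (illustrative sequences only).
alignment = [
    "GGCAAAGCC",
    "GACAAAGUC",  # G-C replaced by A-U at columns 1 and 7
    "GGUAAAACC",  # C-G replaced by U-A at columns 2 and 6
]

for i, j in [(0, 8), (1, 7), (2, 6)]:
    types, bad = pair_types_at(alignment, i, j)
    print(i, j, sorted(types), bad)
```

Columns that show more than one pair type with no incompatible sequences are the compensatory mutations that support a conserved helix; a non-zero incompatible count corresponds to the pale coloring described above.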
Figure 4B is the predicted consensus secondary structure for SL1 generated by the RNAalifold program. The color of the base indicates the likelihood of its involvement in a base pairing interaction. The probability scale runs from blue (low probability) to red (high probability). In addition, positions where compensatory mutations occur in the sequence set are indicated on the structure with black circles around the nucleotides. SL1 displays a high probability of forming a stem-loop structure, as the majority of the structure registers in the red range. The only exception is the base pair at the top of the stem, which likely reflects a tolerance for the helix to breathe at this position.
For SL2, the 50 nucleotides immediately downstream from the SECIS element were used to generate the alignment. The location of the SECIS in each sequence was defined using SECISearch. Figure 5A shows the RNAalifold structure-annotated sequence alignment for this region. This region of the SelS 3′UTR retains its AU-rich character across the sequence set, but it is more difficult to discover sequence covariance in the region, particularly with the inclusion of non-mammalian species. Despite the sequence noise, Figure 5B shows the high-probability formation of a stem-loop structure in this region. The likelihood of the base pair interactions across the predicted stem is reinforced by the detection of compensatory mutations for each position, as indicated by black circles around the nucleotides involved. As the set of sequences is heavily weighted to mammals, we also conducted a pairwise analysis using the Ciona and Xenopus sequences in the combined LocARNA/RNAalifold analysis. This analysis also predicts the formation of a stem-loop of similar size and length (data not shown). Thus, the ability to form a stem-loop structure downstream of the SECIS element is not restricted to mammals.

Elements in the distal 3′UTR impair Sec insertion
Most of the studies examining Sec insertion have focused on identifying minimal SECIS elements and then examining them outside of their native context. In particular, Sec incorporation assays have often been done using minimal SECIS elements on the order of 50-200 nucleotides, but the 3′UTRs of human selenoprotein mRNAs range from 200-5000 nucleotides. Given the influence of the 3′UTR context even in this heterologous luciferase assay, we wanted to identify cis regions in the 3′UTR of SelS variant 2 that affect recoding. Therefore, we designed lucUGA258 constructs containing portions of the 3′UTR of SelS variant 2. The 3′UTR (nt 1-573) was divided into two parts based on the position of the SECIS element. The Start-SECIS construct contains nucleotides 1-441 of the UTR and ends immediately after the SECIS element. The SECIS-end construct starts just before the SECIS element and includes nucleotides 320-573 of the UTR. The complete 3′UTR (1-573) and SECIS alone (nt 320-441) were used for comparison. As shown in Figure 6, the Start-SECIS construct functions similarly to the SECIS alone. In contrast, the SECIS-end construct is severely impaired for Sec insertion, indicating that inhibitory sequences are found downstream of the SECIS element. This inhibition is not due to a change in distance of the SECIS element to the recoding event as

The ORF-proximal SL1 promotes selenocysteine insertion
While the dampening effect of the 39UTR on the SelS SECIS is from downstream sequences, we were still interested in examining the upstream element SL1 for possible effects on Sec insertion. One could envision SL1 exerting a positive effect on Sec insertion by promoting ribosome pausing during translation. Conversely, SL1 could have a negative impact on selenoprotein synthesis by preventing the recoding machinery from accessing the UGA codon. The relative distance between this stem-loop and the UGA codon is very different in the endogenous and heterologous contexts. In its native context, SL1 is 9 nucleotides downstream of the UGA codon, whereas in the luciferase reporter there are several hundred nucleotides between them. Thus, effects caused by either steric inhibition of the Sec insertion machinery, or ribosomal pausing may not be observable in the luciferase system.
As there is no simple way to individually detect both the Sec-containing full-length SelS protein and a two amino acid truncated form, a V5 epitope tag was introduced between the UGA codon and the UAA stop codon (SelS-UGA-V5). The V5 tag is easily detectable, and in these constructs the expression of the V5 tag is dependent on Sec insertion, as termination at the UGA codon would prevent inclusion of the tag.
Two SelS-UGA-V5 constructs were made that contained either the wildtype 3′UTR of SelS variant 2, or the 3′UTR with SL1 removed (Figure 7A). In addition, a third construct containing the wildtype 3′UTR was mutated to change the UGA codon to a UGU cysteine (Cys) codon (SelS-UGU-V5). The constitutive inclusion of a Cys residue instead of Sec makes the expression of the V5 tag in this construct independent of Sec insertion. This serves as a positive control for V5 expression in the assay.
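The logic of the reporter design can be illustrated with a minimal sketch. The toy translator below is an assumption made purely for illustration: the mini open reading frame, the partial codon table and the two-residue stand-in for the tag (`TAG_CODONS`) are hypothetical and do not correspond to the actual SelS or V5 sequences. It shows only that a tag placed between an in-frame UGA and the stop codon is translated when UGA is recoded as Sec, whereas a UGU (Cys) control yields the tag regardless of recoding.

```python
# Toy model of a UGA-dependent epitope-tag reporter.
# Hypothetical sequences and a deliberately partial codon table;
# this is NOT the SelS open reading frame or the real V5 tag.
CODONS = {
    "AUG": "M", "GGU": "G", "AAA": "K", "CCU": "P", "UGU": "C",
    "UAA": "*",  # stop codon
}

def translate(mrna, recode_uga):
    """Translate codon by codon.

    UGA is read as selenocysteine ('U') only when recode_uga is True;
    otherwise it terminates translation like an ordinary stop codon.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA":
            if not recode_uga:
                break  # termination at UGA: nothing downstream is made
            protein.append("U")
            continue
        residue = CODONS[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

TAG_CODONS = "AAACCU"  # hypothetical stand-in for the tag, encodes "KP"
uga_reporter = "AUGGGU" + "UGA" + TAG_CODONS + "UAA"  # Sec-dependent tag
ugu_control = "AUGGGU" + "UGU" + TAG_CODONS + "UAA"   # Cys control

def has_tag(protein):
    """True if the translated product ends with the stand-in tag."""
    return protein.endswith("KP")

for label, mrna in [("UGA reporter", uga_reporter), ("UGU control", ugu_control)]:
    for recode in (False, True):
        product = translate(mrna, recode)
        print(label, "recoding" if recode else "no recoding", product, has_tag(product))
```

In this sketch the `recode_uga` flag plays the role that the presence of SBP2-CT plays in the in vitro translation reactions; it does not model SECIS strength, SL1, or any other quantitative aspect of recoding efficiency.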
The three constructs were in vitro translated using RRL in the presence or absence of SBP2-CT. As there is no detectable level of endogenous SelS in RRL, the translation products were immunoprecipitated using an anti-SelS antibody. The reactions were resolved using SDS-PAGE, transferred to PVDF membranes and immunoblotted for the V5 epitope. In order to be able to probe the samples under the same conditions, only 10% of the cysteine reaction was loaded onto the gel. As shown in Figure 7B, the SelS signal is dependent on the addition of RNA to the reactions. The SelS-UGU-V5 construct shows strong V5 signal and no dependence on SBP2-CT (left panel, lanes 2&3). As expected, both of the SelS-UGA-V5 constructs only show V5 signal in the presence of SBP2-CT. Interestingly, the removal of SL1 greatly decreases the V5 signal. This is not due to a decrease in SelS production, as reprobing the membrane with an antibody directed against SelS shows that nearly equivalent SelS signals are found in both lanes (Figure 7B, compare lanes 5&7, right panel). Thus, SL1 is a positive element that appears to facilitate Sec insertion, but only when positioned in the vicinity of the recoding event.

Endogenous SelS is found in the ER and perinuclear speckles
The above results demonstrate that the potential exists to produce two different SelS protein isoforms, a full-length protein containing a penultimate Sec residue and a truncated protein that does not contain Sec. We wondered whether the different carboxy-terminal ends would affect the subcellular localization of the protein. There are several examples where exposed thiols have been shown to be important for ER localization of proteins by mediating intramolecular bonds [37,38,39,40,41]. In addition, a precedent exists for a penultimate cysteine being required for the ER retention of the secreted immunoglobulin M heavy chain [42].
Interestingly, one study has found that SelS was secreted from HepG2 cells and appeared to be full-length based on size and the presence of an intact amino-terminal epitope tag, although the secretion of SelS was specific to HepG2 cells [43]. Given that Sec is the penultimate residue of the full-length SelS, we were interested in whether an analogous mechanism might regulate the subcellular localization for the two isoforms.
SelS is a membrane protein and was previously shown to localize to the ER and plasma membrane by overexpression of epitope-tagged SelS constructs [21] or fractionation experiments [44]. Given the availability of a suitable antibody for immunofluorescence, we examined endogenous SelS localization. SelS is predominantly found in the ER, with some weak staining of the plasma membrane in some cells. More strikingly, there is an accumulation of SelS in a perinuclear region (Figure 8A). This localization is not cell type specific, as we observed similar results in U251MG (glial) and HepG2 (liver) cells (data not shown). It is also not an artifact generated during the fixation step, as acetone, methanol and 4% paraformaldehyde methods all showed this accumulation (Figure S4). Previous studies would not have observed this localization as the overexpressed SelS obscures this perinuclear signal. Given that the Golgi apparatus often shows a similar staining pattern, we concurrently stained the cells for endogenous SelS and a Golgi marker (golgin p97). As shown in Figure 8B, colocalization of these two proteins was detected next to the nucleus. In order to examine this potential colocalization more carefully, the cells were examined by confocal microscopy. A series of focal planes that spanned the depth of the cell were examined for SelS and golgin p97 localization. As shown in the image gallery, there is some spatial overlap between the two proteins, but it is not a complete colocalization (Figure 9).
In order to address whether these ER and perinuclear localizations might represent the two different SelS proteins (with and without Sec), we treated HepG2 cells with siRNAs directed against both SelS isoforms, as well as variant 1 and variant 2-specific siRNAs. Localization of endogenous SelS protein was examined by immunofluorescence after siRNA treatment (Figure 10). When treated with siRNAs that target both SelS mRNA variants, the punctate perinuclear signal persists after the ER localization is no longer detectable. A similar staining pattern was observed using siRNA directed solely against the variant 2 transcript. In contrast, cells treated with the siRNA against transcript variant 1 looked similar to cells treated with a nontargeting control siRNA. Similar results were obtained with U251 cells (unpublished observation). Thus, the ER and perinuclear localizations are not simply due to two different protein isoforms from the variant mRNA transcripts. The functional significance of the perinuclear localization of the residual SelS protein is unknown, but it is possible that this represents a pool of SelS protein that undergoes slower turnover than the ER population at large.
Figure 7. SL1 promotes Sec insertion when located in proximity to the recoding site. A, Schematic representation of the constructs used in this assay. The V5 epitope tag was inserted between the Sec (U) and the stop codon of the SelS open reading frame to allow detection of Sec insertion. Either the complete 3′UTR (WT) or the 3′UTR with SL1 deleted (ΔSL1) were included in the Sec constructs. A third construct that replaces the Sec (U) with a Cys (C) was included as a positive control for V5 detection in this assay. B&C, The SelS-Cys-V5 and SelS-Sec-V5 (WT and ΔSL1) constructs were in vitro transcribed and translated, and used for immunoprecipitation (IP) against SelS. The IP reaction was resolved by SDS-PAGE and immunoblotted against the V5 epitope tag. The blot for the SelS-Sec-V5 series was stripped and reprobed for SelS. The experiment was repeated five times with similar results and a representative gel is shown. doi:10.1371/journal.pone.0062102.g007

Discussion
SelS expression has been shown to be regulated in response to cellular cues such as glucose and insulin levels, ER stress and inflammatory cytokines [17,18,22,23,27]. However, the intricacies of SelS expression have been underappreciated. In this study we identify multiple mechanisms of regulation that could affect SelS expression. First, human SelS is encoded by two variant mRNAs. Only one of the transcripts encodes a selenoprotein of 189 amino acids, while the other produces a truncated 187 amino acid protein. Additionally, cis sequences within the 3′UTR of SelS strongly influence the activity of its SECIS, providing a second mechanism to produce SelS protein isoforms with and without Sec, even in the absence of a second mRNA variant. In addition to their effects on Sec incorporation, these cis sequences may also influence other aspects of SelS mRNA metabolism through interactions with miRNAs or other RNA-binding proteins. This underscores the importance of context when studying RNA elements, such as the SECIS. While the incorporation of Sec did not affect the subcellular localization of the two protein isoforms, the loss of Sec is likely to impact protein function given that most selenoproteins are enzymes with the Sec residue at the active site. In the absence of Sec, the truncated protein may be inactive, act like a dominant negative, or have a completely different function.
It has been previously noted that there are two mRNA variants for SelS in the GenBank database. However, our study is the first to show that only one encodes a selenoprotein, contrary to the annotated comments in the database. An alternative splicing event occurs in the 3′UTR of the variant 1 transcript that removes the SECIS element, which is required for Sec insertion, while the 3′UTR of variant 2 is not spliced. Experimentally, the selenoprotein-encoding transcript is the predominant SelS mRNA, but the non-selenoprotein mRNA variant was detected in each cell line tested, representing 5-16% of the SelS transcript population. The SelS mRNA variants that lack a SECIS element have only been found in primates thus far, even though the 5′ donor splice site required for their production is conserved (Figure S2). This may be due to issues with bias or coverage in sequence sampling, or may accurately reflect the biological differences between species with respect to alternative splicing. A recent study documented that levels of alternative splicing are highest in primates [45]. In addition, alternative splicing events are more likely to be conserved between different tissues of the same species than between the same tissues of different species [45].
Even in organisms with a single SelS mRNA transcript, the ability to produce two protein isoforms of SelS remains. While many RNA elements (including SECIS elements) are often considered as independent functional units, we have identified regions within the 3′UTR of SelS mRNA that can act as positive and negative regulators of Sec insertion. First, SL1, located at the beginning of the 3′UTR, is predicted to be highly conserved across SelS sequences. We have shown that SL1 enhances Sec insertion when located proximal to the site of recoding. Previously, a similar stem-loop structure called the Sec Redefinition Element (SRE) was found 6 nucleotides downstream of the UGA codon within the coding region of Selenoprotein N [46,47]. While a small subset of other selenoproteins are predicted to form stem-loops in locations that might act as an SRE [46], the SL1 within SelS is the second functional SRE to be identified. In addition to its function as a putative SRE, the formation of SL1 has an additional consequence in primates. RNA structures are well known to influence mRNA splicing [48]. The 5′ splice site responsible for generating the SelS variant 1 mRNA is sequestered in the double-stranded stem of SL1, preventing the splicing event. Thus, factors that influence the formation of SL1 have the potential to regulate the production of SelS variant 1 mRNAs, which cannot produce the Sec-containing SelS protein.
The 140-nucleotide region downstream of the SelS SECIS element harbors sequences that strongly inhibit Sec insertion. Within this region, one candidate is SL2, which is predicted to form immediately downstream of the SECIS element. There are different mechanisms one can envision for how the presence of this conserved element might influence Sec insertion. The presence of a stable stem-loop immediately adjacent to the SECIS element may weaken the interactions at the base of the SECIS element, interfering with its ability to form or causing destabilization. SL2 also displays an ARE, a class of elements known to modulate transcript stability and translational control, both positively and negatively. The selenoprotein Thioredoxin reductase 1 (TrxR1) contains AREs in its 3′UTR that destabilize that mRNA [49]. However, the effects of AREs are transcript-specific, as are the protein factors that often mediate their effects [35]. The ARE in SelS does not affect the stability of the mRNA, and further studies will be required to determine the mechanism by which the SelS ARE inhibits Sec insertion.
Given our findings, many of the results from previous studies on SelS need to be reinterpreted. With respect to RNA-based experiments, several studies used RT-PCR to examine SelS mRNA levels in human cell lines under various conditions [17,18,24,27,50,51,52,53]. However, the majority of these studies were published before the two RNA variants were annotated. Most use primer pairs in the 3′UTR of the variant 2 mRNA to examine SelS levels. In some cases this results in an underrepresentation of SelS mRNA levels. It is also not clear that both variants will respond similarly to stresses. In the case of SelS protein studies, similar caveats exist. Standard cell culture conditions are selenium deficient and hyperglycemic, which both inhibit SelS expression. Under conditions of limiting selenium, the cell prioritizes its use for the expression of essential selenoproteins, at the expense of non-essential selenoproteins, a phenomenon known as the selenoprotein hierarchy. For interpreting overexpression studies, it is often not clear that the 3′UTR or an intact SECIS element was included in the construct, which is necessary for Sec insertion. When SelS was first discovered to be a selenoprotein, it was shown that radiolabeled 75Se could be incorporated into a GFP-SelS fusion protein in cells, albeit relatively poorly [16]. This is likely because the SelS SECIS appears to be a poor SECIS element. In a comprehensive study that examined the minimal SECIS elements from all human selenoprotein mRNAs, the SelS SECIS was consistently among the weakest SECIS elements when tested for UGA-recoding activity in two cell lines, as well as in a cell-free system [54]. This was done using SECIS elements of ~100 nucleotides, and our study demonstrates that the activity of the SelS SECIS is further suppressed in the context of its 3′UTR. SBP2 binding is also a prerequisite for UGA recoding, and the interaction of SBP2 with the SelS SECIS is also weak [55].
In contrast, overexpression of SelS appears robust by immunofluorescence and western blot and can reach levels that distort the architecture of the ER itself [21]. The discrepancy between these observations could be explained by a mixed population of protein isoforms. Overexpression of a SelS construct that can produce a selenoprotein in cell culture would need to overcome the obstacles of a poor SECIS element, deficient selenium supply and competition for limiting SBP2 in order to be expressed in the selenoprotein form. Thus, it is likely that a truncated SelS protein that does not contain Sec would be expressed under standard cell culture conditions.
Further support for an important role of Sec in SelS function is that the penultimate Sec participates in an intramolecular interaction that appears to affect the local conformation of the carboxy-terminus of SelS [20]. A recent study was performed to examine the structure of the cytoplasmic domain of SelS where the Sec residue was replaced by a cysteine to facilitate expression [20]. It was found that while the beginning of this protein is helical in nature, the C-terminal region is intrinsically disordered. Most interestingly, they show that a disulfide bond exists between Cys174 and Cys188, which suggests the existence of a stable selenosulfide in the native protein. Conformational changes in response to the redox state of the protein were restricted to residues 173-189. We propose that the regulation of this selenosulfide bond formation would be controlled not only in response to oxidation state, but also by the presence or absence of the Sec residue. The amino acid sequence between Cys174 and Sec188 is also extremely conserved across SelS proteins (Figure 11A), despite the observation that disordered regions often mutate at higher rates [56], suggesting an important function for this region. It was proposed that the selenosulfide bond in SelS could function to reduce bonds in misfolded proteins that were resistant to unfolding within the ER lumen due to its higher reduction potential [20], although the conservation of the intervening sequence may reflect an additional function for this region. This suggests a model where this conserved region is caged when the selenoprotein is in its oxidized state, whereas it is flexible and available for interaction in its reduced state (Figure 11B). In contrast, the non-Sec containing protein would be constantly in an open, available state due to its inability to form a selenosulfide bond.
Thus, the location of Sec within SelS may make it accessible such that its incorporation can be regulated, serving as a redox rheostat to control the function of the protein.
One quarter of all human selenoproteins share a similar placement of their Sec residue in the C-terminus (SelS, SelK, SelO, TrxR1, TrxR2 and TrxR3) [57]. Of this subset, SelS, SelO and TrxR3 are in the group of six SECIS elements identified as weak with respect to their UGA-recoding activity [54]. TrxR1 SECIS activity was on the lowest end of the moderate class. These observations make it tantalizing to wonder if Sec inclusion in this subset of selenoproteins can be regulated to control protein function. Studies on TrxR1, an essential selenoprotein that contains a penultimate Sec, reveal that the loss of Sec at the C-terminus can have profound effects on function. Substitution of the Sec residue with Cys resulted in a greatly diminished enzymatic activity, while truncation abolished its activity [49]. In cell culture, introduction of truncated TrxR1 protein without Sec results in a pro-apoptotic phenotype that is not observed with the full-length Sec-containing protein [58,59]. This dramatic alteration in activity may underlie the finding that the cell protects against the production of truncated TrxR1 by allowing the insertion of Cys at the UGA codon under selenium-deficient conditions [60,61]. Although it cannot be ruled out that conditions may exist where the production of the truncated TrxR1 may be induced, SelS represents the first natural example of a selenoprotein with two mRNA variants where one transcript cannot produce a selenoprotein.
The information and molecular tools developed in this study will provide a strong foundation for dissecting out the functional roles for these two protein isoforms. Future studies on SelS will be directed at discriminating between Sec-dependent and -independent functions and elucidating the mechanism by which sequences in the 3′UTR affect SelS function.
Figure S1 The non-SECIS containing mRNA variant is found in multiple primates. Clustal Omega multiple sequence alignment of the 3′UTRs from the non-SECIS containing SelS mRNA variants of macaque (ENSMMUT00000016561), chimp (GABE01007426.1), human (NM_203472.1) and gibbon (XM_003281584.2). (DOCX)
Figure S2 The 5′ splice donor site for the 3′UTR splicing event is conserved. Multiple sequence alignment of the first 22 nucleotides of the 3′UTRs from the SelS mRNAs listed in Table 1
Figure S4 The perinuclear staining of SelS is not an artifact of the fixation method. U251 cells were fixed either by cold acetone for 5 minutes at −20°C, cold methanol for 5 minutes at −20°C, or 4% paraformaldehyde for 15 minutes at room temperature and the effect on SelS localization was compared.