Small Toxic Protein Encoded on Chromosome VII of Saccharomyces cerevisiae

In a previous study, we found an unknown element that caused growth inhibition after its copy number increased in the 3′ region of DIE2 in Saccharomyces cerevisiae. In this study, we further identified this element and observed that overexpression of a small protein (sORF2) of 57 amino acids encoded in this region caused growth inhibition. The transcriptional response and multicopy suppression of the growth inhibition caused by sORF2 overexpression suggest that sORF2 overexpression inhibits the ergosterol biosynthetic pathway. sORF2 was not required for the normal growth of S. cerevisiae, and was not conserved in related yeast species including S. paradoxus. Thus, sORF2 (designated as OTO1) is an orphan ORF that determines the specificity of this species.


Introduction
We previously analyzed the copy number limits of most of the protein-coding genes in the budding yeast Saccharomyces cerevisiae using the genetic tug-of-war (gTOW) method [1]. In the gTOW method, the copy number of a plasmid containing a target gene (with its native promoter and terminator region) is increased on the basis of the selection bias of the leu2d gene [2,3]. The copy number of the empty plasmid exceeds 100 in the leucine-negative condition. If the target gene has a copy number limit of <100, the plasmid copy number reflects the copy number limit.
When a target gene has a low copy number limit, we consider that overexpression of the protein encoded by the target gene (i.e., the annotated open reading frame (ORF)) results in growth inhibition. However, elements other than the target gene in the DNA fragment could determine the low copy number limit. For example, increasing the copy number of a DNA element, overexpression of an RNA element, or overexpression of an unannotated protein could result in growth inhibition.
To test this possibility, we previously analyzed the low limit genes by introducing a frameshift mutation to disrupt each annotated ORF, and we isolated 10 DNA fragments in which the frameshift-mutated annotated ORFs still retained low copy number limits [1]. We also dissected the fragments and isolated four DNA fragments with unknown elements that determined the low copy number limits. Among them, we isolated a 600-base pair (bp) DNA fragment that contained the 3′ region of DIE2, which resulted in a low copy number limit (Frag5 in Fig. 1A) [1].
In this study, we further analyzed this region and showed that expression of a small ORF encoding 58 codons caused growth inhibition.

Results and Discussion
Isolation of the element responsible for low copy number limits in the DIE2 region
To isolate the specific element responsible for the low copy number limit in the 3′ region of DIE2, we introduced a series of 10-bp deletions at 100-bp intervals along Frag5 and measured their copy number limits. As shown in Fig. 1B, deletions of two sites in the downstream region of DIE2 increased the copy number limit to >100. As shown in Fig. 1C, two small ORFs of >100 bp are encoded in Frag5 (denoted as sORF1 and sORF2). Both of these 10-bp deletions disrupted sORF2, which indicates that sORF2 might be responsible for the low copy number limit of Frag5.
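The enumeration of small ORFs in a fragment such as Frag5 can be illustrated with a minimal sketch that scans both strands for ATG-to-stop ORFs longer than a length threshold. The function name `find_small_orfs`, the default threshold of 100 bp, and the toy sequence are assumptions made for illustration; the Frag5 sequence and the tool actually used to identify sORF1 and sORF2 are not given in this text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_small_orfs(seq, min_len=100):
    """List ORFs (first in-frame ATG to the next in-frame stop) longer than min_len bp.

    Returns tuples (strand, start, end, orf_sequence). Coordinates are 0-based,
    end-exclusive, and refer to the strand that was scanned.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop codon
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3
                    if end - start > min_len:
                        hits.append((strand, start, end, s[start:end]))
                    start = None
    return hits


# Hypothetical toy fragment: one 126-bp ORF flanked by short spacers.
toy_fragment = "CC" + "ATG" + "GCT" * 40 + "TAA" + "GG"
small_orfs = find_small_orfs(toy_fragment)
```

In this sketch an ORF that lacks an in-frame stop codon before the end of the fragment is ignored, which is a simplification that may not match how truncated ORFs were treated in the original analysis.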
To disrupt sORF2 alone, we introduced mutations to change the potential start codons (ATG) of sORF2 into ATC. The results obtained are shown in Fig. 1D. Frag5 with a mutation that changed the first ATG codon of sORF2 into ATC possessed a copy number limit of >100. Frag5 with a mutation in the second ATG had a higher limit than the original Frag5, but the limit was still low (28.6 ± 3.5). This result strongly suggests that overexpression of the protein encoded by sORF2 causes growth inhibition when its copy number is increased in the 3′ region of DIE2. Fig. 1E shows the amino acid sequence of sORF2.

High level expression of sORF2 driven by the GAL1 promoter inhibits cellular growth
To confirm whether sORF2 overexpression alone caused growth inhibition, we expressed sORF2 from the GAL1 promoter (PGAL1). As shown in Fig. 2A, yeast cells that harbored the PGAL1-sORF2 plasmid did not grow on galactose plates. Next, we observed the growth inhibition process using time-lapse microscopic imaging. As shown in Fig. 2B, at the time point when the induction of PGAL1-GFP was observed, each cell that expressed sORF2 ceased proliferating and a large void structure appeared in the cell. These results indicate that the high level expression of sORF2 inhibited cellular growth.

sORF2 is not required for the normal growth of S. cerevisiae
To test whether sORF2 is required for the growth of S. cerevisiae, we disrupted sORF2 by replacing it with a kanamycin resistance gene cassette (KanMX), as shown in Fig. 2C. The ΔsORF2::KanMX cells exhibited the same growth as the wild-type cells in normal growth conditions (YPD, 30°C; Fig. 2C).

Increasing the copy number of the sORF2-containing DNA fragment induces the expression of ergosterol synthesis genes
We performed transcriptome analysis (RNAseq) to analyze the cellular response after the overexpression of sORF2. We compared the mRNA expression profiles of cells that harbored the vector plasmid with those of cells that harbored the plasmid containing the DIE2 3′ fragment (Rear2, Fig. 1A). Tables 1 and 2 show the genes with significantly different expression levels.
We analyzed the enriched genes based on gene ontology (GO) terms. The genes with higher expression levels in the cells that harbored the pTOW-Rear2 plasmid were significantly enriched in terms of genes involved in the ergosterol biosynthesis pathway (p = 2.2e-4). Eight genes (DAN1, DAN4, ERG1, ERG3, ERG11, ERG25, TIR3, and TIR4) with higher expression levels were identified as genes that could be induced by treatment with ketoconazole [4]. Ketoconazole is known to inhibit the ergosterol biosynthetic pathway [5]; thus, sORF2 overexpression appeared to affect this pathway. The genes with lower expression levels were not significantly enriched with respect to GO terms. They however contained many genes encoding transporters and membrane proteins, such as ADY2, ENA1, FMP43, FMP45, HXT6, HXT7, JEN1, PHO89, SMA1, and YNL194C, suggesting that sORF2 overexpression modulates the expression of membrane proteins.

Fig. 1 legend (panels A and E only).
A. Copy number limits of DNA fragments from the DIE2 region. The data were obtained from our previous study [1].
E. Amino acid sequence of sORF2. The substituted methionines (ATG codons) in C are shown in red. A potential NLS sequence is underlined, and an amino acid sequence predicted to construct a helical structure is shown in bold letters.

Fig. 2 legend.
A. Overexpression of sORF2 from the GAL1 promoter (PGAL1). The construct used in this experiment is shown. Cells with pTOW-PGAL1-sORF2 (PGAL1-sORF2) were streaked onto SC-glucose and SC-galactose plates. Two independent plasmid clones were analyzed. pTOW40836 (Vector) was used as an empty vector control and pTOW-PGAL1-GFP (PGAL1-GFP) was used to monitor the PGAL1 induction.
B. Time-lapse imaging of cells after the induction of sORF2. The cells with pTOW-PGAL1-sORF2 (PGAL1-ORF2) and pTOW-PGAL1-GFP (PGAL1-GFP) were cultured in SC-glucose mixed at a ratio of 10:1 and then cultivated in SC-galactose medium. PGAL1-GFP was used to monitor the induction of PGAL1. The cellular images shown were obtained every 5 min. A movie is available as S1 Movie.
C. Deletion of sORF2. The construct used to delete sORF2 from the chromosome is shown. The strain with sORF2 deleted was streaked onto a YPD agar plate. The strain BY4741 was used as a wild-type control.
doi:10.1371/journal.pone.0120678.g002

Expression analysis of sORF2
We analyzed the RNAseq data to determine whether sORF2 is transcribed. As shown in Fig. 3A, transcript reads containing sORF2 were not detected in the mRNAs from BY4741 that harbored the empty vector pTOWug2-836, whereas a large number of transcript reads were detected in the mRNAs from cells that harbored pTOW-Rear2. To test whether sORF2 was translated, we attached the tandem affinity purification (TAP) tag to sORF2 and attempted to detect the TAP-tagged sORF2 by Western blotting. As shown in Fig. 3B, sORF2-TAP expressed from its genomic region was detected, and the expression of sORF2-TAP from the plasmid was highly increased. The expression of sORF2 from its genomic region was detected in the log phase cell lysate, but not in the post-log phase lysate (S1 Fig.). The expression was not increased under mating conditions (S1 Fig.). We further estimated the expression level of sORF2-TAP in comparison to the expression level of a reference protein, Pop5-TAP, whose protein copy number was previously determined (2230 copies/cell) [6]. As a result, the expression level of sORF2-TAP from its genomic region was estimated to be 45 copies/cell, which corresponds to the level of lowly expressed proteins [6]. The estimated expression level of sORF2-TAP from the plasmid was 1938 copies/cell. It should be noted that the copy number limit of the plasmid that contained the sORF2-TAP DNA fragment was >100 (data not shown). This suggests that the small size of sORF2 itself is required to inhibit growth. We currently do not know why we could not detect the mRNA of sORF2 expressed from its genomic region in the RNAseq analysis above. Although it is possible that integrating the TAP-tag sequence and a marker gene stimulated the expression of sORF2, the result still suggests that there is an expression potential from the sORF2 locus.
Supporting this idea, there is a TA repeat in the upstream region of sORF2, which provides potential binding sites for transcription factors such as the TATA-binding protein Spt15 (S2 Fig.). Notably, the TA repeat is far shorter in the corresponding genomic region of S. paradoxus, which lacks sORF2 (Fig. 4A). These binding sites might function as promoters for sORF2.
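The estimate of protein copy number relative to Pop5-TAP described above can be sketched as a simple proportion. The sketch assumes that band intensity scales linearly with protein amount and that each lysate's fold dilution is corrected by multiplication; the function name, its parameters, and the example intensities are hypothetical, and only the reference value of 2230 copies/cell comes from the text.

```python
POP5_TAP_COPIES_PER_CELL = 2230  # reference protein copy number cited in the text [6]


def estimate_copies_per_cell(target_intensity, reference_intensity,
                             target_fold_dilution=1.0, reference_fold_dilution=1.0,
                             reference_copies=POP5_TAP_COPIES_PER_CELL):
    """Scale the reference copy number by the ratio of dilution-corrected band intensities.

    Assumes a linear relationship between band intensity and protein amount,
    and that both lysates were prepared from the same amount of cells.
    """
    if reference_intensity <= 0 or reference_fold_dilution <= 0:
        raise ValueError("reference intensity and dilution must be positive")
    target_signal = target_intensity * target_fold_dilution
    reference_signal = reference_intensity * reference_fold_dilution
    return reference_copies * target_signal / reference_signal


# Hypothetical band intensities (arbitrary units), not values from this study.
undiluted_estimate = estimate_copies_per_cell(100.0, 1000.0)
diluted_estimate = estimate_copies_per_cell(50.0, 1000.0, target_fold_dilution=10.0)
```

A linear range for the chemiluminescent signal would need to be confirmed experimentally, for example with a dilution series, before an estimate of this kind is trusted.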

Multicopy UBP7 and PRM1 suppress the growth inhibition caused by the high copy number sORF2-containing DNA fragment
To further elucidate the molecular mechanism responsible for growth inhibition by sORF2, we attempted to isolate multicopy suppressors of the growth inhibition caused by high copy

Fig. 4 legend (panel B only).
B. Overexpression of sORF2 without the potential NLS (sORF2 ΔKKRK). The construct used in this experiment is shown. Cells with pTOW-PGAL1-sORF2 (PGAL1-sORF2) or pTOW-PGAL1-sORF2 ΔKKRK (PGAL1-sORF2 ΔKKRK) were streaked onto SC-glucose and SC-galactose plates and incubated for the indicated number of days. pTOW40836 (Vector) was used as an empty vector control and pTOW-PGAL1-GFP (PGAL1-GFP) was used to monitor the PGAL1 induction.

Structural analysis of sORF2
To infer the molecular function of sORF2, we performed several bioinformatics analyses. We first performed a BLAST search against the protein sequences stored in the NCBI database (http://blast.ncbi.nlm.nih.gov/), but we could not find any significantly similar protein.
The corresponding ORF was not conserved in any closely-related yeast species (S. paradoxus, S. bayanus, S. mikatae, S. castellii, and S. kudriavzevii). As an example, Fig. 4A shows a comparison of the corresponding genomic locus from S. cerevisiae and S. paradoxus (the species most closely related to S. cerevisiae).
During the structural analysis, we noticed that sORF2 contained a consensus sequence of nuclear localization signals (K-K/R-X-K/R) [11] at its C-terminus (underlined in Fig. 1E). To test whether this potential nuclear localization signal (NLS) is important for the toxicity of sORF2, we overexpressed sORF2 without the sequence (sORF2 ΔKKRK). As shown in Fig. 4B, yeast cells that harbored the PGAL1-sORF2 ΔKKRK plasmid grew on galactose plates, but much more slowly than the cells with the empty vector or PGAL1-GFP plasmids. This result indicates that the potential NLS contributes to, but is not essential for, the toxicity of sORF2.
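A search for the K-K/R-X-K/R consensus can be expressed as a short regular-expression scan. The following is a minimal sketch only: the function name and the toy peptide are hypothetical (the peptide is not the sORF2 sequence), and a match to this pattern indicates a candidate motif, not a functional NLS.

```python
import re

# K-K/R-X-K/R, where X is any residue. The lookahead makes overlapping matches visible.
NLS_PATTERN = re.compile(r"(?=(K[KR].[KR]))")


def find_nls_motifs(protein_seq):
    """Return (0-based start, motif) for every occurrence, including overlapping ones."""
    return [(m.start(), m.group(1)) for m in NLS_PATTERN.finditer(protein_seq.upper())]


# Hypothetical toy peptide carrying a KKRK stretch near its C-terminus.
toy_peptide = "MASKKRKL"
motifs = find_nls_motifs(toy_peptide)
```

Because the consensus is only four residues long, matches are expected by chance in many proteins, which is consistent with the text treating the signal as "potential" and testing it by deletion.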
We next tried to predict the secondary and tertiary structure of sORF2 using a protein homology/analogy recognition engine, Phyre2 (http://www.sbg.bio.ic.ac.uk/phyre2/). The analysis predicted a helical structure in the middle of the protein (shown in bold letters in Fig. 1E) based on its similarity with two template proteins (d1k78a1 and d6paxa1) with confidence scores >70 (the prediction results are summarized in S3 Fig.). Because both template proteins are structurally classified into DNA/RNA-binding 3-helical bundle (fold), homeodomain-like (superfamily), and paired domain (family), sORF2 might have DNA/RNA binding activity.

sORF2 (OTO1/YGR228C-A) as an orphan ORF
In this study, we obtained evidence that overexpression of a small ORF of 58 codons (sORF2) encoded within the 3′ region of DIE2 causes growth inhibition. Our results also suggest that sORF2 overexpression affects the ergosterol synthetic pathway. Because sORF2 has a potential NLS and a predicted helical structure similar to DNA/RNA-binding domains, sORF2 might act in the nucleus, for example in transcriptional regulation. sORF2 was not identified in previous studies that aimed to detect small ORFs based on their expression and evolutionary conservation [12][13][14][15]. In fact, sORF2 is not conserved in the corresponding genomic region of the most closely-related yeast species S. paradoxus (Fig. 4A). We thus think that sORF2 is an orphan ORF (ORFan) [16,17], which distinguishes species by functioning in species-specific cellular situations, and propose the name OTO1 (ORFan toxic when overexpressed) with the locus name YGR228C-A.
Our gTOW approach might be useful for isolating other ORFans. In fact, we have isolated three more genomic loci that potentially contain unannotated elements that are toxic when their copy numbers are increased [1].

Materials and Methods
Strains and growth conditions
BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) [18] was used as the host yeast strain to test the toxicity of DIE2 fragments and sORF2. The sORF2 deletion strain was created as follows: the genomic region of sORF2 (from ATG to stop) in BY4743 (MATa/α his3Δ1/his3Δ1 leu2Δ0/leu2Δ0 LYS2/lys2Δ0 met15Δ0/MET15 ura3Δ0/ura3Δ0) [18] was replaced by the KanMX6 cassette using a DNA fragment that was amplified by PCR with the primers OHM0969 and OHM0970 using pKT127 [19] as a template. The resulting strains were sporulated, and the tetrads were dissected. After genotypic analysis of the tetrads, haploid deletion strains were isolated. The sORF2-TAP strain was created as follows: a sORF2-TAP fragment was amplified by PCR with the primers OHM1030 and OHM1032 using pTOW-sORF2-TAP as a template. A hphMX4 fragment was amplified by PCR with the primers OHM1031 and OH1033 using pAG34 [20] as a template. Both fragments were introduced into BY4741 to integrate sORF2-TAP-hphMX4 into the genomic region of sORF2. BY4742 (MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0) [18] was used as a mating partner of BY4741 with sORF2-TAP-hphMX4.
Yeast cells were grown in standard growth conditions [21]. The PCR primers used to amplify the DNA fragments employed in strain construction are listed in S1 Table.

Plasmids used in this study
The plasmids used in this study are listed in Table 3. The plasmids were constructed on the basis of the homologous recombination activity of yeast cells [22]. The PCR primers used to amplify the DNA fragments employed in plasmid construction are listed in S1 Table.

Measurement of the plasmid copy number limit
The copy number limits of plasmids were measured as described in our previous study [1]. Briefly, DNA was extracted from yeast cells grown in SC-Ura, SC-Ura-Leu, or SC-Ura-His medium, and the plasmid copy number relative to the genomic DNA in the DNA solution was measured using real-time PCR. The HIS3, LEU2, and LEU3 genes were detected as indicators of the copy number of pRS423ks, pTOWug2-836/40836, and genomic DNA, respectively. More than two independent experiments were performed for each measurement unless otherwise stated.
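One common way to convert real-time PCR readouts into a relative copy number is the comparative Ct calculation, sketched below for a plasmid marker (for example LEU2) against a single-copy genomic marker (for example LEU3). This is an assumption for illustration: the text states only that the measurement followed the previous study [1], so the formula, the function names, the perfect-doubling efficiency, and the Ct values are not taken from the authors' protocol.

```python
import statistics


def relative_copy_number(ct_plasmid, ct_genomic, efficiency=2.0):
    """Plasmid copies per genome from threshold cycles.

    Assumes both amplicons amplify with the same efficiency
    (2.0 means perfect doubling in every cycle).
    """
    return efficiency ** (ct_genomic - ct_plasmid)


def summarize_replicates(ct_pairs, efficiency=2.0):
    """Return (mean, sample SD) of copy numbers from (ct_plasmid, ct_genomic) pairs."""
    copies = [relative_copy_number(p, g, efficiency) for p, g in ct_pairs]
    return statistics.mean(copies), statistics.stdev(copies)


# Hypothetical Ct values from two independent experiments.
single_estimate = relative_copy_number(18.0, 23.0)
mean_copies, sd_copies = summarize_replicates([(20.0, 25.0), (21.0, 25.0)])
```

If the two primer pairs differ in amplification efficiency, a standard-curve-based quantification would be more appropriate than this simplified form.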

Microscopic observation
Cells were cultivated in SC-Ura medium until the mid-log phase and were then transferred to SC-galactose-Ura medium before being applied to a PDMS microfluidic chamber (YC-1, Warner Instruments). Cellular images were acquired every 5 min using a Leica DM6000 B microscope. GFP fluorescence was detected using a GFP filter cube (excitation filter 470/40 and emission filter 525/50).

Western blot analysis
Western blotting was performed as described previously [24]. Briefly, proteins extracted from 0.25 OD600 units of cells (with the indicated fold dilutions) cultivated in the indicated medium were separated by SDS-PAGE and transferred onto a PVDF membrane. The TAP-tagged protein was then detected using peroxidase anti-peroxidase soluble complex (P1901l, Sigma-Aldrich). The chemiluminescent image was acquired and the intensity of each band was measured using the LAS-4000 image analyzer (GE Healthcare).

Multicopy suppressor screening
A multicopy plasmid library where most of the genes in S. cerevisiae were cloned into pRS423ks (our laboratory stock) was introduced into yeast strains that harbored pTOW-Rear2. Next, the colonies were grown on SC-Ura-His plates and then replica-plated onto SC-Ura-Leu-His plates. The plasmids were recovered from the colonies grown on SC-Ura-Leu-His and the DNA sequences of inserts in the plasmids were determined. The suppressor activities of the isolated candidates were re-evaluated by measuring the growth of the cells that harbored both pTOW-Rear2 and the suppressor plasmids in SC-Ura-His medium. Cellular growth was measured by monitoring OD595 every 30 min using a microplate reader (Infinite F200, TECAN). The maximum growth rate was calculated as described previously [2,3].
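A maximum growth rate can be derived from OD595 readings taken every 30 min in several ways. The sketch below takes the largest least-squares slope of ln(OD) against time over a sliding window. This choice, the window size, the function name, and the synthetic readings are assumptions for illustration; the calculation actually used is the one described in [2,3].

```python
import math


def max_growth_rate(od_values, interval_h=0.5, window=4):
    """Largest least-squares slope of ln(OD) versus time (per hour).

    The slope is evaluated over every run of `window` consecutive readings
    taken `interval_h` hours apart. All OD values must be positive.
    """
    if window < 2 or len(od_values) < window:
        raise ValueError("need at least `window` readings and window >= 2")
    log_od = [math.log(v) for v in od_values]
    times = [i * interval_h for i in range(window)]
    t_mean = sum(times) / window
    denom = sum((t - t_mean) ** 2 for t in times)
    best = -math.inf
    for start in range(len(log_od) - window + 1):
        ys = log_od[start:start + window]
        y_mean = sum(ys) / window
        slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys)) / denom
        best = max(best, slope)
    return best


# Synthetic readings: lag phase, exponential growth at 0.4 per hour, then plateau.
readings = [0.05] * 3 + [0.05 * math.exp(0.4 * 0.5 * k) for k in range(1, 7)]
readings += [readings[-1]] * 3
rate = max_growth_rate(readings)
```

A wider window smooths measurement noise at the cost of underestimating the rate when the exponential phase is short, so the window would need to be chosen with the actual plate-reader data in hand.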
S1 Movie. Time-lapse movie of cells after the induction of sORF2. (MOV)