Extensive Association of Functionally and Cytotopically Related mRNAs with Puf Family RNA-Binding Proteins in Yeast

Genes encoding RNA-binding proteins are diverse and abundant in eukaryotic genomes. Although some have been shown to have roles in post-transcriptional regulation of the expression of specific genes, few of these proteins have been studied systematically. We have used an affinity tag to isolate each of the five members of the Puf family of RNA-binding proteins in Saccharomyces cerevisiae and DNA microarrays to comprehensively identify the associated mRNAs. Distinct groups of 40–220 different mRNAs with striking common themes in the functions and subcellular localization of the proteins they encode are associated with each of the five Puf proteins: Puf3p binds nearly exclusively to cytoplasmic mRNAs that encode mitochondrial proteins; Puf1p and Puf2p interact preferentially with mRNAs encoding membrane-associated proteins; Puf4p preferentially binds mRNAs encoding nucleolar ribosomal RNA-processing factors; and Puf5p is associated with mRNAs encoding chromatin modifiers and components of the spindle pole body. We identified distinct sequence motifs in the 3′-untranslated regions of the mRNAs bound by Puf3p, Puf4p, and Puf5p. Three-hybrid assays confirmed the role of these motifs in specific RNA–protein interactions in vivo. The results suggest that combinatorial tagging of transcripts by specific RNA-binding proteins may be a general mechanism for coordinated control of the localization, translation, and decay of mRNAs and thus an integral part of the global gene expression program.


Introduction
The dynamic structure and physiology of a cell depend on coordinated synthesis, assembly, and localization of its macromolecular components (Orphanides and Reinberg 2002). The timing and level of expression of the genes that encode these components are controlled by transcription factors that regulate initiation of transcription in a gene-specific manner by binding to specific DNA sequences proximal to the genes they regulate. The combinatorial binding and activity of specific transcription factors confer a distinctive program of regulation on each individual gene while enabling coherent global responses of large sets of genes in physiological and developmental programs. Much less is known about either the system architecture or molecular mechanisms that underlie regulation of the post-transcriptional steps in the gene expression program.
There are approximately 15,000 mRNA molecules in each Saccharomyces cerevisiae cell during exponential growth in rich medium (Hereford and Rosbash 1977) and at least a 10-fold larger number in a typical mammalian cell (Hastie and Bishop 1976). The extent to which the location, activity, and fates of these diverse populations of mRNAs are coordinated and the post-transcriptional mechanisms that might mediate their coordinated regulation remain largely unknown. RNA-binding proteins (RBPs) have been implicated in diverse aspects of post-transcriptional gene regulation, including RNA processing, export, localization, degradation, and translational control (Dreyfuss et al. 2002; Maniatis and Reed 2002; Mazumder et al. 2003). Although there appear to be hundreds of RBPs encoded in eukaryotic genomes (Costanzo et al. 2001; Issel-Tarver et al. 2002), for only a few of these proteins have the RNA targets been systematically identified (Takizawa et al. 2000; Tenenbaum et al. 2000; Brown et al. 2001; Hieronymus and Silver 2003; Li et al. 2003; Shepard et al. 2003; Waggoner and Liebhaber 2003). For example, a recent study in S. cerevisiae found that two nuclear RNA export factors were each associated with large and distinct mRNA populations, and common functional themes were found among the 1,000 or so proteins encoded by each population (Hieronymus and Silver 2003). These observations support a role for RBPs in the coordinated regulation of mRNA subpopulations (Keene and Tenenbaum 2002; Keene 2003).
Systematic identification of the mRNA targets of RBPs can be a powerful approach to understanding the cellular roles of RBPs and the mechanisms by which they might regulate the post-transcriptional lives of mRNAs. We have focused first on the Pumilio-Fem-3-binding factor (FBF) (Puf) proteins from S. cerevisiae, which belong to a structurally related family of cytoplasmic RBPs that are implicated in developmental processes in various eukaryotes (Wickens et al. 2002). Puf proteins are defined by the presence of several (typically eight) consecutive repeats of the Pumilio homology domain (Pum-HD), which confers RNA binding activity (Zamore et al. 1997; Wang et al. 2002a). The Puf proteins characterized to date have been reported to bind to 3′-untranslated region (UTR) sequences encompassing a so-called UGUR tetranucleotide motif and thereby to repress gene expression by affecting mRNA translation or stability. Despite the widespread occurrence of Puf family members, only a few mRNA targets have been identified for these RBPs (Wickens et al. 2002). For example, in Drosophila, the PUMILIO protein binds maternal hunchback mRNA and, in concert with NANOS protein, represses translation of the mRNA at the posterior pole during early embryogenesis. The Caenorhabditis elegans Puf homologs, called Fem-3-binding factors (FBFs), regulate the switch from spermatogenesis to oogenesis by repressing fem-3 translation, and they are implicated in the propagation of germline stem cells through binding and inhibition of gld-1 mRNA expression (Zhang et al. 1997; Crittenden et al. 2002).
Less is known about the human homologs: PUMILIO-2 protein interacts with DAZ (deleted in azoospermia) protein and is expressed in embryonic stem cells and germ cells, whereas PUMILIO-1 is almost ubiquitously expressed (Moore et al. 2003).
In S. cerevisiae, five proteins, termed Puf1p to Puf5p, bear six to eight Puf repeats (Figure 1). Little is known about the physiological function of these proteins. Mutations in either PUF4 or PUF5 result in diminished longevity (Kennedy et al. 1997). PUF1 was isolated as a multicopy suppressor of certain microtubule mutants (Machin et al. 1995), and a PUF2 null mutant displayed increased resistance to cycloheximide and paromomycin (Waskiewicz-Staniorowska et al. 1998). However, S. cerevisiae mutants lacking all five PUF genes are viable (Olivas and Parker 2000). A genome-wide analysis of mRNA expression patterns in yeast mutants lacking all five PUF genes found differential expression of 7%-8% of all mRNAs under steady-state conditions, but no common theme was found among the affected genes (Olivas and Parker 2000).
Only two specific mRNA targets have been identified for yeast Puf proteins: Puf3p binds to the COX17 mRNA 3′-UTR in vitro and may regulate its turnover (Olivas and Parker 2000), and Puf5p negatively regulates expression of reporter genes substituting for the HO endonuclease (Tadauchi et al. 2001).
Using DNA microarrays to identify the specific mRNAs that interact with the five S. cerevisiae Puf proteins, we have found that each Puf protein bound to a large set of distinct and functionally related mRNAs. We identified novel and conserved sequence elements in the mRNAs bound by Puf3p, Puf4p, and Puf5p. The results suggest a system for large-scale coordinated control of cytoplasmic mRNAs and provide insights into the physiological logic of the gene expression program.

Systematic Identification of mRNAs Associated with Specific RBPs
To identify RNAs associated with Puf proteins, tandem-affinity purification (TAP)-tagged proteins were purified from whole-cell extracts of S. cerevisiae (Figure 2). The TAP tag (Rigaut et al. 1999), a sequence encoding two IgG-binding units of protein A, a specific protease recognition site, and a calmodulin-binding domain, was fused in-frame at the C-terminus of the respective open reading frame (ORF) in its original chromosomal location (Ghaemmaghami et al. 2003). This design was intended to preserve normal regulation of the expression of the fusion protein. Cells of the TAP-tagged strains showed growth rates and cell morphologies similar to wild-type cells. Cells were grown to mid-log phase in rich medium, extracts were prepared, and ribonucleoprotein complexes were recovered by affinity selection on IgG beads and subsequent cleavage with tobacco etch virus (TEV) protease (see Materials and Methods). To control for nonspecifically enriched mRNAs, the same procedure was performed with wild-type cells lacking the TAP tag. TEV protease cleavage was superior to direct elution of proteins from beads, as it gave lower contamination from nonspecifically interacting RNAs in the resulting purified fractions (data not shown). RNA was isolated from the purified protein samples and from extracts. We obtained 0.8-2 µg of RNA from the Puf affinity-isolated samples gathered from 1-L cultures, but no detectable RNA (<0.1 µg) was recovered when the same procedure was applied to untagged control cells. The yield of RNA from the Puf affinity isolation procedure was sufficient to perform further labeling steps directly, without amplification of RNA by PCR, as had been required in previous studies (Takizawa et al. 2000; Hieronymus and Silver 2003). Two samples from each cell population, total RNA and RNA isolated by the Puf affinity procedure, were used to prepare cDNA probes labeled with different fluorescent dyes, which were mixed and hybridized to S. cerevisiae DNA microarrays containing all known and putative ORFs, introns, and the mitochondrial genome (see Materials and Methods). The ratio of the fluorescent hybridization signals from the two differentially labeled RNA samples, at the array element representing each specific gene, provided an assay for enrichment of the corresponding mRNA by the Puf affinity procedure.
Puf3p is the only one of the five S. cerevisiae Puf proteins for which direct in vitro interaction with an mRNA (COX17) has previously been described, thereby providing an internal positive control (Olivas and Parker 2000). COX17 mRNA was substantially and consistently enriched in four independent Puf3p affinity isolations (ratio = 10 ± 1.4; Figure 3A), but not in mock isolations (ratio = 0.8 ± 1.2). In general, after filtering for spots with high background or irregular shapes, enrichment values for the entire set of arrayed sequences were reproducible (median of standard deviations in all arrayed spots = 0.35 on a log2 scale) (see Materials and Methods). To define targets specific to each Puf protein, we first selected all sequences for which enrichment factors in the corresponding affinity isolation procedures were at least two standard deviations above the mean for all arrayed sequences (Figure S1; for samples isolated by the Puf3p affinity procedure, this corresponded to an enrichment factor of greater than or equal to 2.5). Second, we eliminated from this selected group any sequences that were also consistently enriched in the mock procedure (see Materials and Methods). Although no cutoff can perfectly distinguish the actual physiological targets from false positives, the high reproducibility of the results (see Figure 3B), the occurrence of distinct mRNA populations associated with the different Puf proteins, and the characterization of these targets described in the subsequent sections, including the identification of distinct sequence motifs and in vivo confirmation of the role of these motifs in specific RNA-protein interactions, strongly support the validity of the majority of the targets. Finally, the list of target mRNAs did not change substantially by application of other statistical methods for selection (see Lieb et al. 2001).
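The two-step selection described above can be illustrated with a minimal Python sketch. It is not the analysis code used in this study: the input format (dictionaries of log2 enrichment ratios), the function name `select_targets`, and the choice to apply the same two-standard-deviation cutoff to the mock isolation are all assumptions made for illustration; the actual criteria for "consistently enriched in the mock procedure" are given in Materials and Methods.

```python
from statistics import mean, stdev


def select_targets(tagged, mock, n_sd=2.0):
    """Return genes enriched in the tagged isolation but not in the mock.

    tagged, mock: dicts mapping gene name -> log2 enrichment ratio
    (hypothetical input format, assumed for this sketch).
    A gene passes if its ratio is at least n_sd standard deviations
    above the mean of all arrayed sequences in the tagged isolation,
    and it does not pass the same cutoff in the mock isolation
    (the mock cutoff is an assumption, not taken from the text).
    """
    def cutoff(values):
        vals = list(values)
        return mean(vals) + n_sd * stdev(vals)

    tagged_cut = cutoff(tagged.values())
    mock_cut = cutoff(mock.values())

    enriched = {g for g, r in tagged.items() if r >= tagged_cut}
    mock_enriched = {g for g, r in mock.items() if r >= mock_cut}
    return sorted(enriched - mock_enriched)
```

Subtracting the mock-enriched set after thresholding, rather than subtracting ratios, keeps the two steps separate in the same order as they are described in the text.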
A large number of arrayed sequences, 818, identified transcripts associated with at least one Puf protein (see Figure 3B; Table S1), with 735 encoding distinct ORFs. This represents approximately 12% of the known and predicted protein-coding sequences in the S. cerevisiae genome. Of these, 90 transcripts interact with more than one Puf protein. The largest overlap was observed between the groups of transcripts associated with Puf1p and Puf2p, which also have the greatest overall similarity in amino acid sequence among the Puf proteins (45% identical); 36 of the 40 Puf1p targets were also associated with Puf2p. Twenty-eight mRNAs were bound by both Puf4p and Puf5p, and 16 were bound by both Puf2p and Puf5p. Seven transcripts were enriched with three different Puf proteins (DHH1 and YOL109w mRNAs with Puf1p, Puf2p, and Puf5p; NOP1 mRNA with Puf1p, Puf4p, and Puf5p; SUR7 and SFL1 mRNAs with Puf2p, Puf4p, and Puf5p; and IFM1 mRNA with Puf3p, Puf4p, and Puf5p). The remaining 645 target mRNAs were each associated with only one of the Puf proteins. Thus, each Puf protein associates with a distinct and highly specific subset of mRNAs (see Tables S3-S7).
We estimated the number of Puf proteins per cell by a filter affinity blot analysis using protein A as a standard for calibration (Table S2). We found that Puf1p, Puf2p, Puf3p, and Puf5p were similar in abundance, with 350-400 molecules per cell. Puf4p was approximately twice as abundant (approximately 900 molecules per cell). The relatively low abundance of the Puf proteins is therefore comparable to that of transcription factors, protein kinases, and cell cycle proteins (Futcher et al. 1999). Moreover, our measurements imply that the intracellular concentrations of the Puf proteins range between 20 and 50 nM, approximately one order of magnitude higher than the dissociation constants for binding of their metazoan homologs to the cognate target RNAs. The number of Puf proteins per cell approximates the estimated numbers of cognate Puf target mRNA molecules present in the cell (Holstege et al. 1998; Wang et al. 2002b) (Table S2), consistent with a model in which each Puf protein molecule is associated with one mRNA molecule in the cell.
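The step from molecules per cell to a molar concentration can be checked with a short calculation. The sketch below is illustrative only: the cell volume is not stated in this section, so the 30 fL value is an assumption chosen for the example, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_to_nanomolar(molecules_per_cell, cell_volume_fl):
    """Convert a per-cell copy number to a concentration in nM.

    cell_volume_fl is the volume in femtoliters (1 fL = 1e-15 L).
    """
    volume_l = cell_volume_fl * 1e-15
    molar = molecules_per_cell / (AVOGADRO * volume_l)
    return molar * 1e9


# Assumed accessible cell volume; NOT taken from the text.
ASSUMED_VOLUME_FL = 30.0

for copies in (350, 400, 900):
    conc = molecules_to_nanomolar(copies, ASSUMED_VOLUME_FL)
    print(f"{copies} molecules/cell -> {conc:.1f} nM")
```

Under this assumed volume, the copy numbers reported above come out at roughly 19, 22, and 50 nM, in line with the 20-50 nM range quoted in the text; a different volume assumption would scale these values proportionally.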

Puf3p Specifically Binds mRNAs Encoding Mitochondrial Proteins
As a first step toward identifying functional themes among the mRNAs associated with each Puf protein, we retrieved the Gene Ontology (GO) annotations for process, function, and compartment from the Saccharomyces Genome Database (SGD) (Issel-Tarver et al. 2002). (The target mRNAs for each Puf protein are listed in Tables S3-S7.) We then searched for significant shared GO terms in the lists of Puf mRNA targets (Table S8).
Puf3p associated almost exclusively with transcripts of nuclear genes that encode mitochondrial proteins (p < 10^-88; see Table S5). In particular, of the 154 Puf3p-associated transcripts for which GO annotation of subcellular localization was available, 135 (87%) were assigned to mitochondria (Figure 4A). Of the Puf3p-associated mitochondrial gene products, 80 (59%) are involved in protein biosynthesis, including structural components of the ribosome (55 genes), tRNA ligases (12 genes), and translational regulators (nine genes). Twenty-two of the Puf3p-bound transcripts are involved in mitochondrial organization and biogenesis, 17 in aerobic respiration, and 12 in mitochondrial translocation. Based on this striking cytotopic (relating to location in the cell) concordance, we suggest that the remaining 66 Puf3p mRNA substrates (30%) for which no GO annotations were available are likely to encode mitochondrial proteins. (While this paper was under review, a genome-wide analysis of protein localization in S. cerevisiae [Huh et al. 2003] reported a mitochondrial localization for 27 additional Puf3p targets, raising the total to 162 of the 220 putative Puf3p mRNA targets encoding mitochondrial proteins.)
Transcripts encoding nucleolar proteins were highly enriched among the Puf4p-bound mRNAs: 36 of the 133 (27%) annotated genes in this group encode nucleolar proteins, as compared to 3% of all the annotated genes in the S. cerevisiae genome (p < 10^-12). Of these 36, 29 are directly involved in ribosomal RNA (rRNA) synthesis, processing, and ribosome maturation (p < 10^-15), major functions of the nucleolus (Fatica and Tollervey 2002; Gerbi et al. 2003) (see Tables S5 and S8).
Twenty-eight transcripts were enriched in both the Puf4p and Puf5p affinity isolations, including six transcripts encoding components of the nucleosome (p < 10^-11), among them the four core histone proteins (histones 2A and 2B, histone 3, and histone 4; note that histones 2A and 2B are 98% identical and therefore cross-hybridize).
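Enrichment p-values of the kind quoted in this section are commonly obtained from a hypergeometric tail probability. The sketch below shows that calculation under the assumption that a one-sided hypergeometric test is appropriate; the statistical procedure actually used for the GO analysis is not specified in this section, and the function name and argument names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: number of annotated genes in the genome
    annotated:  number of those genes carrying the GO term
    sample:     number of genes in the target list
    hits:       number of target genes carrying the GO term
    """
    max_hits = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of observing `hits` or more term-bearing genes
    # when drawing `sample` genes without replacement.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total
```

As a sanity check on a toy case, drawing all 3 term-bearing genes in a sample of 3 from a population of 10 gives a probability of 1/120. Exact integer arithmetic via `math.comb` avoids the rounding problems that arise when very small tail probabilities are computed from floating-point factorials.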

Diverse Functional Links among Transcripts Associated with Each Puf Protein
In addition to the cytotopic relationships within each group of Puf-associated mRNAs, we were struck by the frequency with which transcripts encoding different components of protein complexes or systems of interacting proteins were bound by the Puf proteins. For example, most of the nuclear transcripts encoding components of the mitochondrial ribosome (55 out of the 77 known genes; Gan et al. 2002) were Puf3p-associated. This observation prompted us to search for other protein complexes and functional systems that shared similarly Puf-associated mRNAs.
Other examples of coordinate "tagging" of transcripts encoding subunits of multiprotein complexes include Puf4p association of mRNAs encoding three of the four protein components of the H/ACA core particle (Cbf5p, Gar1p, and Nhp2p), which synthesizes pseudouridine in rRNAs (Henras et al. 1998) (Figure S2; no data were obtained for the fourth component, Nop10p). Puf5p bound mRNAs encoding histone acetylases (Ada2p, Spt8p, and Hfi1p), which are components of the Spt-Ada-Gcn5-acetyltransferase (SAGA) complex, and transcripts encoding at least four of the six members of the RSC (remodels the structure of chromatin) family of DNA-stimulated ATPases with bromodomains (Bdf1p, Bdf2p, Rsc2p, and Rsc4p; no array data were obtained for the two other members, Rsc1p and Spt7p). As mentioned above, the mRNAs encoding at least three of the four core histones were enriched in both Puf4p and Puf5p affinity isolations.
We also found numerous cases in which the transcripts encoding multiple members of a functional group of proteins were bound by the same Puf protein. For example, the transcripts encoding the Tpo1, Tpo2, and Tpo3 proteins, the three known spermine transporters in the plasma membrane (Albertsen et al. 2003; see note above about cross-hybridization), and the two known genes implicated in the nonclassical protein export pathway (NCE101, NCE102) (Cleves et al. 1996) were bound by Puf1p and Puf2p and by Puf2p, respectively. Puf5p was associated with all of the histone deacetylases (HDACs) that act on histones located around coding sequences: Sin3p (a class I HDAC), Hda1p (a class II HDAC), and both components of the Set3C complex (Hst1p and Snt1p) (Kurdistani and Grunstein 2003). (Two other HDACs, Hos1p and Hos3p, which deacetylate histones around the ribosomal DNA locus, were not enriched in Puf5p affinity isolations.) Finally, we identified cases in which the mRNAs encoding multiple components of a specific regulatory system were bound by the same Puf protein. For example, Puf2p associates with mRNAs encoding diverse proteins regulating Pma1p, which is an ATP-dependent proton transporter located in the plasma membrane, and with PMA1 mRNA itself (Figure S2). All of the mRNAs encoding nucleolar glycine/arginine-rich (GAR) domain-bearing proteins (Sbp1p, Nsr1p, Nop1p, Gar1p) as well as HMT1 mRNA, encoding a dimethylase that modifies the nucleolar GAR proteins (Xu et al. 2003), were associated with Puf4p, while none of the mRNAs encoding the distinct group of nonnucleolar GAR proteins were bound by Puf4p (Figure S2).

Sequence Motifs in the 3′-UTR of mRNA Targets Direct Binding by Puf Proteins
The Puf homologs in Drosophila and C. elegans bind to sequences in the 3′-UTR of mRNAs (Wickens et al. 2002). We therefore examined the sets of mRNAs associated with each of the S. cerevisiae Puf proteins for the presence of common sequence motifs in 5′-UTRs and 3′-UTRs, using multiple expectation maximization for motif elicitation (MEME) as a motif discovery tool (Bailey and Elkan 1994). We identified distinct 10- or 11-nucleotide sequence motifs in the 3′-UTR among the mRNAs interacting with Puf3p, Puf4p, and Puf5p (Figure 5A, Tables S9-S11). We have thus far been unable to identify conserved sequence elements among Puf1p and Puf2p targets; these proteins may recognize structural elements in the RNA rather than simple sequence strings, possibly via their classical RNA-binding domains instead of their six-repeat Pumilio domains.
The conserved motifs we identified in the Puf3p, Puf4p, and Puf5p targets each include a UGUR tetranucleotide sequence, which is a feature of all previously reported RNA targets of Puf family proteins (Wickens et al. 2002). Furthermore, in each case, the consensus sequence contains a conserved dinucleotide (UA), located two, three, or four nucleotides downstream of the UGUR motif, in the consensus sites for Puf3p, Puf4p, and Puf5p. Remarkably, the Puf3p consensus motif matches a sequence (CYUGUAAAUA) previously identified by computational tools in 3′-UTR sequences of nuclear genes coding for mitochondrial proteins (Jacobs Anderson and Parker 2000).

Figure 4 (legend fragment). We classified the targets by combining both GO and YPD annotations (May 2003). "Plasma membrane" (light blue) is a subpopulation of the total membrane-associated proteins (blue). Soluble cytoplasmic or nuclear proteins were classified as "non-membrane." "All" refers to the genome-wide compartmentalization of characterized genes, and respective numbers were retrieved from YPD. "Puf2 Top 40" refers to the 40 highest enriched Puf2p targets and equals the total number of Puf1p targets. DOI: 10.1371/journal.pbio.0020079.g004
We examined the distribution of the consensus sequence motifs in the entire S. cerevisiae genome (Table 1). Of the genes whose mRNAs were predicted by computational analysis to contain one of these three target sequences in their 3′-UTRs, 42% were identified experimentally as targets in the corresponding affinity isolation procedure (Table 1). The consensus motifs were occasionally found in the coding sequence of an experimentally identified target gene, but were much rarer in the predicted 5′-UTR sequences (Table 1). Moreover, only a few mRNAs had two copies of the motifs: five mRNAs among the Puf3p targets, six among the Puf4p targets, and one among the Puf5p targets (see Tables S5-S7). As our computational method did not detect the cognate consensus sequence elements in all the experimentally identified targets, alternative sequences or structural elements in RNAs might also allow specific interactions with Puf proteins, some mRNAs may be associated indirectly as part of larger complexes, and some of the putative mRNA targets identified by our affinity procedure are likely to be false positives.
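A scan for sites of this general form can be sketched with a regular expression. This is a simplified illustration, not the MEME-derived consensus matrices used in the analysis: the pattern below captures only the two features named in the text (a UGUR core and a downstream UA dinucleotide), and it assumes that "two, three, or four nucleotides downstream" means a spacer of two to four arbitrary nucleotides between UGUR and UA. The function name is hypothetical.

```python
import re

# UGUR core (R = A or G), a 2-4 nt spacer (assumed interpretation),
# then the conserved UA dinucleotide. The lookahead lets overlapping
# sites be reported, one per start position.
PUF_LIKE_SITE = re.compile(r"(?=(UGU[AG][ACGU]{2,4}UA))")


def find_puf_like_sites(utr):
    """Return (start, matched_sequence) pairs for a 3'-UTR sequence.

    Accepts DNA or RNA letters; T is converted to U before scanning.
    Start positions are 0-based.
    """
    rna = utr.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in PUF_LIKE_SITE.finditer(rna)]
```

For instance, one instance of the previously reported CYUGUAAAUA element (with Y = U) yields a single site starting at position 2, whereas the same sequence with UGU changed to AGA yields none. A pattern this loose will match many more sequences than the full 10- or 11-nucleotide consensus motifs, so it should be read as a first-pass filter only.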
To test the in vivo function of the putative recognition elements identified by the computational analysis, we assayed RNA-protein interactions in vivo using the yeast three-hybrid system (Bernstein et al. 2002) (see Figure 5B). Puf3p, Puf4p, and Puf5p bound specifically to a sequence matching the cognate consensus sequence, as assayed by activation of the lacZ and HIS3 reporter genes (see Figure 5C and 5D). For Puf3p and Puf4p, the Pum-HD alone was sufficient to confer specific binding (see Figure 5C and 5D), but no interaction could be seen with the Puf5p Pum-HD alone (data not shown). These interactions were specific: mutations in the UGU of the Puf3p consensus sequence disrupted binding, and each Puf protein interacted with its cognate consensus sequence in preference to the closely related consensus sequences recognized by the other Puf proteins. We detected a weak interaction between Puf3p and the Puf4p target sequence, an interaction that was not seen with the Puf3p Pum-HD alone. These results suggest that binding of the Puf proteins to these specific cis-acting elements directs their functions to specific sets of mRNAs.

Subcellular Distribution of Puf Proteins
We investigated the localization of the TAP-tagged Puf proteins by immunofluorescence with antibodies against the TAP tag (see Materials and Methods). All five Puf proteins were predominantly localized to multiple discrete foci in the cytoplasm (Figure 6). The predominantly cytoplasmic localization is consistent with previous reports for S. cerevisiae Puf3p and Puf5p (Tadauchi et al. 2001) and for the homologous proteins in higher eukaryotes (Lehmann and Nüsslein-Volhard 1991; Zhang et al. 1997). The distribution of the foci of Puf proteins was not obviously related to distinct cellular organelles or structures, with the exception of Puf1p and Puf2p, which localized in foci enriched near the periphery of the cell. Because of the diffuse and pleiomorphic distribution of mitochondria in the cell, we cannot exclude the possibility that Puf3p, which specifically bound transcripts of proteins destined for the mitochondria, is associated with mitochondria.

Figure 5 (legend fragment). Letters with less than 10% appearance were omitted. Fraction of genes bearing a motif in the 3′-UTR sequence is indicated to the right. Y-helicase proteins are nearly identical in sequence and were excluded from this analysis. (B) Scheme of three-hybrid assay for monitoring RNA-protein interactions in vivo (Bernstein et al. 2002). (C) β-Galactosidase activity for three-hybrid assay. Proteins assayed are indicated on top, RNAs to the left. Abbreviations: pum, Pum-HD; cons., consensus motif; UGU/AGA, UGU in consensus sequence mutated to AGA. (D) Activation of HIS3 reporter gene and resistance to 3-aminotriazole (3-AT), a competitive inhibitor of the HIS3 gene product, in a three-hybrid assay (Bernstein et al. 2002). DOI: 10.1371/journal.pbio.0020079.g005

Altered Levels of Puf3p-Associated mRNAs in a puf3Δ Mutant
A previous study compared steady-state mRNA levels of cells bearing deletions of all five PUF genes and wild-type cells grown in rich media (Olivas and Parker 2000). Only 12 of the 148 (8%) mRNAs whose abundance changed by more than 2-fold were selectively enriched in our affinity isolations with Puf proteins. The lack of a simple relationship between the mRNA binding specificity we observed and the reported effects of these multiple mutations on global gene expression prompted us to design a more specific experiment to search for a possible connection between specific mRNA levels and binding to Puf proteins. We focused on Puf3p, as its strong association with mRNAs encoding mitochondrial proteins suggested that we should look for a regulatory function for this protein in mitochondrial physiology. Indeed, we found that puf3Δ cells grew more slowly than isogenic puf3+ cells on minimal media plates with glycerol as the carbon source (Figure S3). We therefore compared mRNA levels in the puf3Δ and puf3+ cells grown under these conditions by DNA microarray hybridization. Although the magnitude of the change was small, the relative expression levels of the 220 Puf3p-associated mRNAs were selectively increased in puf3Δ cells, compared to all other mRNAs analyzed (p < 10^-34) (Figure 7). Of the 16 mRNAs whose abundance was increased by more than 2-fold in the puf3Δ mutant, 11 (70%) were among the transcripts identified as Puf3p targets by our copurification experiments, and all encode mitochondrial proteins. This result could reflect a direct effect of Puf3p on its target mRNAs, for example, by promoting mRNA decay (Olivas and Parker 2000). However, the levels of transcripts involved in respiration and mitochondrial function, including many that did not appear to be bound directly by Puf3p, were increased in the puf3Δ mutant, suggesting the possibility that the elevated abundance of Puf3p target mRNAs could instead be an indirect response to impaired mitochondrial and respiratory function in puf3Δ cells.

Discussion
In an analysis of just five of the hundreds of RBPs encoded by the S. cerevisiae genome, we found that more than 700 transcripts appeared to be specifically bound by one or more RBPs, with each of the five Puf family proteins "tagging" a distinct set of mRNAs. These sets encode functionally and cytotopically related proteins. For three of the Puf proteins, we identified distinct short sequences in the associated specific set of mRNAs, typically in the 3′-UTR, which were sufficient for specific binding to the cognate Puf protein in vivo. Many sets of mRNAs encoding proteins localized to the same subcellular compartment, protein complex, or functional system were bound by the same Puf protein. Puf3p, which specifically associated with cytoplasmic mRNAs encoding mitochondrial proteins, generally affected the steady-state levels of its mRNA targets as reflected by their increased abundance in puf3 mutant cells.
The selective "tagging" by sequence-specific RBPs of mRNAs that share common physiological roles suggests a general and widespread mechanism for coordinated control of their expression. Previous reports have identified coordinated regulation of small sets of functionally related mRNAs by specific RBPs. For example, mammalian stem-loop binding protein (SLBP) associates with all five classes of histone mRNAs and guides proper 3′-end formation (Dominski and Marzluff 1999). Iron regulatory proteins (IRPs) bind to and regulate translation of five different mRNAs encoding proteins involved in iron metabolism (Eisenstein and Ross 2003), and a cytoplasmic poly(A) polymerase regulates multiple mRNAs in early development (Mendez and Richter 2001). Based on these and other examples (Tenenbaum et al. 2000), Keene and Tenenbaum (2002) have suggested that messenger RBPs could define "post-transcriptional operons." Our results provide strong support for this general idea of coordination of gene expression via RBPs and suggest that the post-transcriptional control afforded by combinatorial binding of RBPs to mRNAs could allow greater regulatory flexibility than a simple operon (see also Keene and Tenenbaum 2002). Further, we suggest that RBPs may play important roles in subcellular localization and efficient assembly of protein complexes. The RBPs encoded in eukaryotic genomes rival specific transcription factors in their numbers and diversity, raising the intriguing possibility that specific regulation of the localization, translation, and survival of mRNAs might be comparable in its richness and complexity to regulation of transcription itself. Each of the five Puf proteins interacts with a distinct large set of mRNAs, comprising more than 700 different mRNAs in total. Five other RBPs in S. cerevisiae have been subjected to a similar genome-wide survey of their mRNA targets. She2p, which plays a critical role in selective targeting of specific mRNAs to the bud tip (Shepard et al. 2003), Khd1p, which has also been implicated in localizing gene expression to the nascent bud (A.P. Gerber, unpublished data), and Scp160p, an RBP implicated in genome stability (Li et al. 2003), were each found to bind from 20 to hundreds of distinct mRNAs, and two proteins implicated in RNA export from the nucleus, Yra1p and Mex67p, were each associated with more than 1,000 mRNAs (Hieronymus and Silver 2003). Thus, just ten of the 567 S. cerevisiae proteins known or predicted from the genome sequence to have RNA binding activity (Costanzo et al. 2001) have been found to bind, in a functionally specific pattern, a total of approximately 2,500 different transcripts (approximately 40% of the transcriptome). The extent and specificity of the RNA-protein interactions represented by the proteins studied to date, extrapolated to the hundreds of putative RBPs that remain to be investigated, suggest the existence of an extensive network of RNA-protein interactions that coordinate the post-transcriptional fate of large sets of cytotopically and functionally related RNAs through each stage of their "lifecycle." They further suggest a potential regulatory repertoire comparable in its diversity and richness to that of the DNA-binding transcription factors (Figure 8). Indeed, the combinatorial binding of mRNAs by multiple RBPs could, in principle, define a specific post-transcriptional fate for each individual mRNA (for an example, see Sonoda and Wharton 2001).
Many sets of mRNAs bound by the same Puf protein encode proteins that act in the same subcellular location, form stoichiometric complexes, or are implicated in the same cellular pathway. This organization is most clearly exemplified by Puf3p, which selectively bound mRNAs encoding mitochondrial proteins, including at least 70% of all mitochondrial ribosomal proteins (see Figure 4). Combinations of RBPs could specify smaller sets of RNAs encoding more precisely defined functional groups of proteins. For example, the mRNAs encoding the core histone proteins were among the small set of mRNAs that were associated with both Puf4p and Puf5p. These results therefore hint that networks of functional and physical interactions among proteins could be reflected in a corresponding network of mRNA-protein interactions that coordinate post-transcriptional control of their expression and fate.
Figure 7. [...] whereas Puf4p targets were not (p > 0.05). Thirty-nine genes involved in aerobic respiration (according to GO annotation and SGD), but not bound by Puf3p, were similarly enriched (p < 5 × 10⁻⁵) in the puf3 mutant as random sets of 39 Puf3p targets (p < 10⁻⁶). Likewise, 220 randomly selected mRNAs coding for mitochondrial proteins that were not associated with Puf3p in the experiments herein were weakly enriched in the mutant (p < 10⁻⁸). DOI: 10.1371/journal.pbio.0020079.g007
For three of the Puf proteins, we found that RNA-protein interactions were directed by compact sequence elements, usually located in the 3′-UTR of the mRNA (see Figure 5). Interactions with 3′-UTR sequences have been described for many cytoplasmic RBPs involved in post-transcriptional regulation (Mazumder et al. 2003). Our analysis has revealed that such recognition elements are probably much more widespread than previously recognized. Sequence and structural elements in mRNAs that are related to the function or cellular localization of the encoded proteins may be a general feature of eukaryotic genes, paralleling the role of the DNA sequences that direct specific transcription factors to promoters and enhancers (Cliften et al. 2003).
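As an illustration of how compact 3′-UTR elements of this kind can be located computationally, the following minimal sketch scans a set of 3′-UTR sequences for a short motif and reports the fraction of genes carrying it. The pattern is a hypothetical placeholder built around a UGU core; it is not the consensus reported in Figure 5, and the gene names and sequences are invented for the example.

```python
import re

# Hypothetical placeholder pattern with a UGU core (an assumption for
# illustration only; NOT the consensus motif shown in Figure 5).
PLACEHOLDER_MOTIF = re.compile(r"UGUA[ACU]AUA")


def find_motif_sites(utr_by_gene, motif=PLACEHOLDER_MOTIF):
    """Return {gene: [0-based match start positions]} for UTRs with a match."""
    hits = {}
    for gene, utr in utr_by_gene.items():
        # Accept either DNA or RNA alphabets, in any letter case.
        rna = utr.upper().replace("T", "U")
        positions = [m.start() for m in motif.finditer(rna)]
        if positions:
            hits[gene] = positions
    return hits


def fraction_with_motif(utr_by_gene, motif=PLACEHOLDER_MOTIF):
    """Fraction of genes whose 3'-UTR contains at least one motif match."""
    if not utr_by_gene:
        return 0.0
    return len(find_motif_sites(utr_by_gene, motif)) / len(utr_by_gene)


# Invented sequences, used only to exercise the functions.
example_utrs = {
    "geneA": "AAACUUGUAUAUAGGC",  # placeholder motif present
    "geneB": "GGGCCCAAAUUU",      # no UGU core
    "geneC": "ttgtaaatacc",       # DNA alphabet, lower case
}

# Mutating the UGU core to AGA, as in the three-hybrid controls,
# removes the match for this placeholder pattern.
mutated_utrs = {"geneA": "AAACUAGAAUAUAGGC"}

if __name__ == "__main__":
    print(find_motif_sites(example_utrs))
    print(fraction_with_motif(example_utrs))
    print(find_motif_sites(mutated_utrs))
```

A regular expression is used here only for brevity; a position weight matrix would be closer to the probabilistic motif representation described in the Figure 5 legend.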
The multifocal cytoplasmic distribution of Puf proteins raises the possibility that the mRNAs associated with each Puf protein are colocalized (see Figure 6). In mammalian cells, specific mRNA molecules and specific messenger RBPs have also been found to be localized to specific "granular" subcytoplasmic loci, although the generality of this phenomenon has not been established (Andersen and Kedersha 2002; Eystathioy et al. 2002; Farina et al. 2003). One function of the Puf proteins and related proteins that bind specific families of mRNAs could be to localize functionally related mRNAs to specific cytoplasmic loci. Physical clustering of functionally related groups of mRNAs could aid the assembly of complexes and the coordinated control of translation or RNA turnover. In support of this idea, it has recently been suggested that mRNA decay in the cytoplasm of S. cerevisiae occurs in distinct loci (Sheth and Parker 2003) and, further, that mRNAs encoding different subunits of stoichiometric complexes do indeed have concordant decay rates (Wang et al. 2002b). We propose that the location in the cell at which any mRNA is translated or degraded is not left to chance. Instead, every mRNA that leaves the nucleus may be delivered, in a process directed by specific protein-RNA interactions, to one of a limited number of specific foci in the cytoplasm, designated as destinations for a specific functionally related family of mRNAs. These foci could serve to colocalize and coregulate synthesis of proteins that need to assemble or act together, thereby facilitating efficient and rapid assembly and localization of the proteins. The number of distinct families of functionally specialized foci may be quite large. The locations of these foci need not correspond to recognizable cellular features, but may simply be ad hoc sites for localized, coordinated translation of proteins that are to be assembled into a complex or a functional unit. Specific predictions of this hypothesis, such as colocalized translation of the subunits of stoichiometric complexes, should be amenable to direct experimental tests.
Combinatorial binding of mRNAs by specific regulatory proteins, linking their post-transcriptional regulation to specific signal transduction pathways, could allow rapid and efficient reprogramming of gene expression during development or in response to changing physiological conditions. Indeed, regulation of specific genes by external signals via RBPs has been described in higher eukaryotes (Lasko 2003). For example, the signal transduction and activation of RNA (STAR) proteins contain RNA-binding motifs combined with protein-protein interaction domains and phosphorylation sites, which could allow integration of stimuli conducted by signal transduction cascades (Lasko 2003). Similarly, the Puf proteins contain numerous putative phosphorylation motifs, as well as domains with characteristics often implicated in protein-protein interactions, such as glutamine/arginine-rich regions (Michelitsch and Weissman 2000) (see Figure 1).
Coordination of cellular processes has long been thought to be mediated primarily at the transcriptional and post-translational levels. Our results join a growing body of studies (Tenenbaum et al. 2000; Eystathioy et al. 2002; Wang et al. 2002b; Hieronymus and Silver 2003; Shepard et al. 2003; see also Keene and Tenenbaum 2002) that suggest that the localization, translation, and stability of mRNAs are subject to extensive and important regulation and coordination by interaction with a diverse set of RBPs. Systematic mapping of these interactions and deciphering their roles, molecular mechanisms, and coordination will undoubtedly yield important new insights into biological regulation and the gene expression program.
Three-hybrid assays. Three-hybrid assays were performed as described elsewhere (Bernstein et al. 2002).

Figure 1. Protein Domain Structure of Yeast Puf Proteins
Pum-HD repeats (Zamore et al. 1997) are shown as red ovals and classical RNA-binding domains (RBDs) are depicted as blue boxes. Regions of low complexity, such as proline-, serine-, threonine-, and/or methionine-rich domains, are shown in gray boxes; asparagine stretches are striped. The numbers correspond to the length of proteins in amino acids. DOI: 10.1371/journal.pbio.0020079.g001

Figure 2. Strategy for Analyzing Genome-Wide RNA-Protein Interactions
Protein A-tagged Puf proteins were captured with IgG-Sepharose and released from the beads by cleavage with TEV protease. RNAs associated with the released proteins were isolated, and cDNA copies were fluorescently labeled and hybridized to yeast DNA microarrays. The Cy5/Cy3 fluorescence ratio for each locus reflects its enrichment by affinity for the cognate protein. DOI: 10.1371/journal.pbio.0020079.g002

Figure 3. Defining Puf Target RNAs
(A) Distribution of average Cy5/Cy3 fluorescence ratios from four independent microarray hybridizations analyzing Puf3p targets. The arrow depicts enrichment of COX17 mRNA, which is known to bind to Puf3p (Olivas and Parker 2000). The red dashed line indicates the threshold applied for defining 220 target RNAs (a magnification is shown of the enriched region). (B) Cluster of RNA targets for Puf proteins. Rows represent genes (unique cDNA elements) and columns represent individual experimental samples. Each Puf protein and an untagged strain (mock control) were assayed in quadruplicate. The color code indicates enrichments (green-red color scale). The number of mRNAs interacting with each Puf protein is indicated in parentheses. mRNAs clustering with the mock controls were removed as false positives (see Materials and Methods). DOI: 10.1371/journal.pbio.0020079.g003
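The target definition summarized in this legend (averaging Cy5/Cy3 ratios over replicate hybridizations, applying a threshold, and discarding mRNAs that behave like the mock control) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the threshold value, gene names, and ratios are invented, and the mock filter here is a plain threshold comparison standing in for the cluster-based filtering described in Materials and Methods.

```python
from statistics import mean


def select_targets(ratios_by_gene, mock_by_gene, threshold):
    """Rank genes whose mean Cy5/Cy3 ratio passes the threshold.

    ratios_by_gene: {gene: [ratio per replicate]} for the tagged strain.
    mock_by_gene:   {gene: [ratio per replicate]} for the untagged strain.
    Genes that are also enriched in the mock control are dropped, which is
    a simplification (assumption) of the clustering step in the paper.
    """
    targets = []
    for gene, ratios in ratios_by_gene.items():
        avg = mean(ratios)
        mock_avg = mean(mock_by_gene.get(gene, [0.0]))
        if avg >= threshold and mock_avg < threshold:
            targets.append((gene, avg))
    # Most strongly enriched first.
    return sorted(targets, key=lambda item: item[1], reverse=True)


# Invented quadruplicate ratios, for illustration only.
example_ratios = {
    "geneA": [6.0, 5.0, 7.0, 6.0],  # enriched with the tagged protein
    "geneB": [1.1, 0.9, 1.0, 1.0],  # not enriched
    "geneC": [4.0, 4.0, 4.0, 4.0],  # enriched, but also in the mock
}
example_mock = {
    "geneA": [1.0, 1.0, 1.0, 1.0],
    "geneB": [1.0, 1.0, 1.0, 1.0],
    "geneC": [5.0, 5.0, 5.0, 5.0],
}

if __name__ == "__main__":
    # The threshold of 3.0 is a hypothetical value.
    print(select_targets(example_ratios, example_mock, threshold=3.0))
```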

Figure 4. Classification of mRNAs Interacting with Puf Proteins
(A) Column charts showing compartmentalization of characterized gene products encoded by the Puf targets. The same compartments are shown for the entire genome in the columns designated "All" (YPD, May 2003). The number of genes represented in the charts is indicated on the top of columns. An asterisk indicates classes with p values of less than 0.001. (B) Fraction of membrane-associated gene products among the Puf targets. We classified the targets by combining both GO and YPD annotations (May 2003). "Plasma membrane" (light blue) is a subpopulation of the total membrane-associated proteins (blue). Soluble cytoplasmic or nuclear proteins were classified as "non-membrane." "All" refers to the genome-wide compartmentalization of characterized genes, and respective numbers were retrieved from YPD. "Puf2 Top 40" refers to the 40 highest enriched Puf2p targets and equals the total number of Puf1p targets. DOI: 10.1371/journal.pbio.0020079.g004
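The legend above reports p values for over-represented compartments but does not state which test produced them. One common choice for this kind of question is the upper tail of the hypergeometric distribution, sketched below as an assumption rather than as the method used in this study. The function and parameter names are hypothetical, and the example counts are invented.

```python
from math import comb


def hypergeom_enrichment_p(genome_size, annotated, n_targets, annotated_targets):
    """P(X >= annotated_targets) under the hypergeometric distribution.

    genome_size:       number of characterized genes considered (N)
    annotated:         genes in the compartment of interest (K)
    n_targets:         number of target mRNAs drawn (n)
    annotated_targets: targets that fall in the compartment (k)
    """
    upper = min(annotated, n_targets)
    favourable = sum(
        comb(annotated, i) * comb(genome_size - annotated, n_targets - i)
        for i in range(annotated_targets, upper + 1)
    )
    return favourable / comb(genome_size, n_targets)


if __name__ == "__main__":
    # Invented counts: 10 genes, 5 annotated, 5 targets, all 5 annotated.
    print(hypergeom_enrichment_p(10, 5, 5, 5))
```

Exact integer arithmetic with `math.comb` avoids the rounding problems that can arise when very small tail probabilities are computed from floating-point factorials.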

Figure 5. Sequence Motifs Interacting with Puf Proteins
(A) Consensus motifs detected within 3′-UTR sequences of Puf3p, Puf4p, and Puf5p target mRNAs. Height of the letters specifies the probability of appearing at the position in the motif. Letters with less than 10% appearance were omitted. Fraction of genes bearing a motif in the 3′-UTR sequence is indicated to the right. Y-helicase proteins are nearly identical in sequence and were excluded from this analysis. (B) Scheme of three-hybrid assay for monitoring RNA-protein interactions in vivo (Bernstein et al. 2002). (C) β-Galactosidase activity for three-hybrid assay. Proteins assayed are indicated on top, RNAs to the left. Abbreviations: pum, pum-HD; cons., consensus motif; UGU/AGA, UGU in consensus sequence mutated to AGA. (D) Activation of HIS3 reporter gene and resistance to 3-aminotriazole (3-AT), a competitive inhibitor of the HIS3 gene product, in a three-hybrid assay (Bernstein et al. 2002). DOI: 10.1371/journal.pbio.0020079.g005

Figure 8. Specific Proteins Bind Functional Groups of Genes for Regulation
At the transcriptional level (top), transcription factors (TFs) regulate initiation of transcription (green arrow) in the nucleus by binding to sequence elements (yellow box) proximal to their target coding regions (boxes). At the post-transcriptional level (middle), RBPs regulate decay, translation, or localization of mRNAs in a coordinated fashion by interaction with sequence/structural elements in the RNA that are often found in 3′-UTR regions (red box). Functional relations at the protein level (bottom) can be reflected at both the transcriptional and post-transcriptional levels: sets of genes that encode functionally related proteins, such as subunits of stoichiometric complexes (blue) or components of the same regulatory or metabolic pathway (gray and cross-hatched boxes), may be regulated by common transcription factors and their mRNAs post-transcriptionally coregulated by RBPs (dashed interactions). DOI: 10.1371/journal.pbio.0020079.g008

Table 1. Number of Consensus Motifs Found in the Genome and in Puf Targets
The probability that the motifs are enriched in Puf targets by chance.