Yeast two-hybrid (Y2H) is a well-established genetics-based system that uses yeast to selectively display binary protein-protein interactions (PPIs). To meet the current need to unravel complex PPI networks, several adaptations have been made to establish medium- to high-throughput Y2H screening platforms, with several having successfully incorporated the use of the next-generation sequencing (NGS) technology to increase the scale and sensitivity of the method. However, these have been to date mainly restricted to the use of fully annotated custom-made open reading frame (ORF) libraries and subject to complex downstream data processing. Here, a streamlined Y2H library screening strategy, based on integration of Y2H with NGS, called Y2H-seq, was developed, which allows efficient and reliable screening of Y2H cDNA libraries. To generate proof of concept, the method was applied to screen for interaction partners of two key components of the jasmonate signaling machinery in the model plant Arabidopsis thaliana, resulting in the identification of several previously reported as well as hitherto unknown interactors. Our Y2H-seq method offers a user-friendly, specific and sensitive screening method that allows identification of PPIs without prior knowledge of the organism’s ORFs, thereby extending the method to organisms of which the genome has not entirely been annotated yet. The quantitative NGS readout allows to increase genome coverage, thereby overcoming some of the bottlenecks of current Y2H technologies, which will further strengthen the value of the Y2H technology as a discovery platform.
Citation: Erffelinck M-L, Ribeiro B, Perassolo M, Pauwels L, Pollier J, Storme V, et al. (2018) A user-friendly platform for yeast two-hybrid library screening using next generation sequencing. PLoS ONE 13(12): e0201270. https://doi.org/10.1371/journal.pone.0201270
Editor: Alan M. Moses, University of Toronto, CANADA
Received: July 9, 2018; Accepted: November 26, 2018; Published: December 21, 2018
Copyright: © 2018 Erffelinck et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This research was supported by the Research Foundation Flanders by a research project grant to A.G. (G004515N) and for postdoctoral fellowships to J.P. and L.P., the Program Ciências Sem Fronteiras for a predoctoral fellowship to B.R. (Grant 201135/2014-0), the BEC.AR program for overseas training of Argentine professionals in the fields of science, technology and productive innovation for a scholarship to M.P. and the Special Research Fund from Ghent University (project O1J14813). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Disentangling protein-protein interaction (PPI) networks is crucial for our understanding of cellular organization and function. To achieve this, a wide range of technologies to identify PPIs has been developed over the last decade [1, 2]. One of the most advanced and commonly used methods to identify PPIs in vivo under near-physiological conditions is affinity purification coupled to mass spectrometry (AP-MS) [3–5]. Equivalent comprehensive assays to specifically identify binary PPIs include protein domain microarrays and in vivo protein fragment complementation assays (PCAs) [6–10]. The principle of PCA is based on the fusion of two hypothetically interacting proteins (bait and prey) to two fragments of a reporter protein. Interaction between the bait and prey proteins results in the reassembly of the reporter protein, followed by its activation. The signal readout can be bioluminescence, fluorescence or cell survival. In the popular yeast two-hybrid (Y2H) method, the bait protein is fused to the DNA binding domain (DBD) and the prey (or prey library in the case of a comprehensive Y2H screening) is fused to the activation domain (AD) of a transcription factor (TF) . Upon association of the hypothetical interactors, the TF is functionally reconstituted and drives the expression of a reporter gene that can be scored by selective growth. Typically, conventional medium-throughput Y2H library screenings are subject to laborious one-by-one clonal identification of interaction partners, but today, proteome-wide mapping of PPIs demands a high-throughput approach. This led for instance to the development of a matrix-based Y2H method that bypassed the inefficient identification by DNA sequencing . Collections of bait and prey strains were automatically combined and arrayed on fixed matrix positions and PPIs were scored as visual readouts. A major drawback of this strategy is the need for pre-assembled libraries based on defined gene models and expensive robotics that are not accessible to every researcher.
Clonal identification of Y2H screening with DNA sequencing has a tremendous negative effect on the efficiency, cost and labor of the method. Furthermore, given the labor-penalty involved with increasing transformation titers, the clonal identification of Y2H interactions is usually not compatible with quantitative assessment of PPI abundances. Therefore, replacing the conventional Y2H screening strategy with a pool-based selection and global identification by NGS, can have three major implications: (i) cost reduction by high-capacity sequencing, (ii) higher sensitivity and (iii) quantification of the abundance of bait-specific interactions. The lab of Marc Vidal pioneered the implementation of the NGS technology for massive parallel Y2H screening in the Stitch-Seq method, mainly to map the human interactome. Herein, single amplicons, concatenating sequences of potentially interacting proteins, serve as template for NGS . Nonetheless, this method remains laborious because it requires clonal isolation and several PCR rounds for PPI identification for each selected colony. The lab of Ulrich Stelzl developed the Y2H-seq method, thereby illustrating the advantage of NGS for Y2H towards scalability by mapping the protein methylation interactome . In this strategy, the use of barcode indexing enables simultaneous sequencing of interacting preys of multiple separate baits in a single Illumina run. This strategy is based on mixing bait and prey pools prior mating, followed by selective growth, and deep-sequencing, but still requires a post-screen binary testing of interacting baits with each of the identified preys. The use of barcodes was further exploited in the Barcode Fusion Genetics‐Yeast Two‐Hybrid (BFG‐Y2H) method. This matrix-Y2H strategy uses Cre-recombinase to create intracellular chimeric barcodes that are derived from protein pairs, thereby enabling immediate identification and quantification of each interaction pair through NGS . Prior to screening and NGS, isolation and sequencing of each barcoded bait and prey clone are essential to associate barcodes to ORFs, which may pose a cost restriction for massive screening purposes. Lewis et al., (2012) developed the Quantitative Interactor Screen Sequencing (QIS-Seq) approach, which provides a quantitative measurement of enrichment for each interactor relative to its frequency in the library without the use of barcode fusion proteins . The latter was addressed in CrY2H-seq, which introduced a Cre-recombinase interaction reporter that endorses fusion of the coding sequences of two interacting proteins, followed by NGS to identify these interactions en masse . The latter method was employed to uncover the transcription factor interactome of A. thaliana.
All of the above-mentioned Y2H-NGS strategies focus on increased capacity, efficiency and sensitivity, although they may face some lack in specificity, are laborious or do not fully exploit the quantification potential of NGS coupled to Y2H. Furthermore, construction of full-length ORF libraries is necessary, thereby restricting these methods to organisms of which the genomes are well annotated or to ‘defined’ gene models, which for instance cannot take alternative splicing, alternative start codon use or transcript processing into account.
Here, we discuss a user-friendly and standardized Y2H-NGS workflow (‘Y2H-seq’), complementary to the matrix-Y2H approaches, which allows rapid identification of interaction partners of a bait of interest in the organism of choice without the need for expensive robotics. The Y2H-seq screening method generates a quantitative readout that, through the use of control screens, allows to eliminate false-positive PPIs to boost the specificity of the method and thereby avoiding unnecessary downstream experimental binary interaction verification. Furthermore, the method is not dependent on predefined and prefabricated ORF libraries but on cDNA libraries, and is therefore principally applicable to every organism regardless of the annotation status of its genome. The functionality of our methodology is validated here by implementing it on two well-studied members of the jasmonate (JA) signaling cascade in the model plant Arabidopsis thaliana, i.e. TOPLESS (TPL) and Novel Interactor of JAZ (NINJA), respectively encoded by the loci AT1G15750 and AT4G28910 [18–26].
Material and methods
All cloning was carried out by Gateway recombination (Thermo Fisher Scientific, Waltham, MA, USA). The full-length coding sequence of IAA17, AT4G36480, AT1G34340, AT3G06850, AT4G05553, AT3G50000, AT3G54390, AT2G40260, AT3G05670, AT2G33550 and AT3G19860 were PCR-amplified (for primers, see S1 Table) using cDNA of Arabidopsis seedlings and recombined in the donor vector pDONR221. All the entry clones used in checkpoint 1 had previously been generated [18, 27].
Binary Y2H analysis
Y2H analysis was performed as described  using the GAL4 system , in which bait and prey were fused to the GAL4-AD or GAL4-BD via cloning into pDEST22 or pDEST32, respectively. The Saccharomyces cerevisiae PJ69-4α yeast strain  was co-transformed with bait and prey constructs using the polyethylene glycol (PEG)/lithium acetate method. Transformants were selected on SD medium lacking Leu and Trp (Clontech, France). Three individual colonies were grown overnight in liquid cultures at 30°C and 10- or 100-fold dilutions were dropped on control (SD-Leu-Trp) and selective media (SD-Leu-Trp-His).
Yeast transformation was performed as described by Cuéllar-Pérez et al., (2013) . The S. cerevisiae PJ69-4α yeast strain was transformed in two transformation rounds, respectively with 0.5 μg of bait plasmid DNA and 50 μg of cDNA prey library plasmid DNA using the PEG/lithium acetate method. At least 106 transformants were plated on control (SD-Leu- Trp) and selective media lacking Leu, Trp and His supplemented with 5 mM 3-AT (Sigma-Aldrich, Saint Louis, MO, USA).
Y2H cDNA library used to perform the Y2H screening
The ProQuest two-hybrid cDNA library was generated by cDNA synthesis from RNA extracted from A. thaliana suspension cells AT7, cloned into pEXP-AD502 vector (ProQuest), equivalent to pDEST22 vector (Thermo Fisher Scientific) and electroporated in the DH10B-Ton A (T1 and T5 phage resistance) cells (Thermo Fisher Scientific). The average insert size was 1.1 kb and the number of primary clones was 5.3 x 106 cfu with a 100% insert coverage.
A minimum of ten random colonies of the Y2H screening plates were streaked out on solid SD-Leu-Trp-His selective medium with 5mM 3-AT (Sigma-Aldrich, Saint Louis, MO, USA) and incubated for 48 h at 30°C. Each streaked out colony was inoculated in liquid SD-Leu-Trp-His selective medium and incubated overnight at 30°C at 230 rpm. Subsequent yeast plasmid isolation was carried out using the Zymoprep Yeast Plasmid Miniprep I Kit (Zymo Research, Irvine, CA, USA) according to the manufacturer’s instructions. The cDNA inserts of the prey plasmids (pDEST22-insert) were PCR-amplified using backbone-specific primers (S1 Table) and Sanger-sequenced.
Colonies of the Y2H screening plates were dissolved and pooled in 10–15 mL of ultrapure water and plasmids were collected using the Zymoprep Yeast Plasmid Miniprep II kit (Zymo Research, Irvine, CA, USA). Prey constructs were amplified via PCR using Q5 High-Fidelity DNA Polymerase (New England Biolabs, Ipswich, MA, USA) and generic pDEST22 primers that bind to the GAL4AD and the region flanking the attR1 site (S1 Table). The following program was used: initial denaturation (98°C, 30 s), 35 amplification cycles (denaturation 98°C, 10 s; annealing 55°C, 30 s; elongation 72°C, 2.5 min), final extension (72°C, 5 min). The PCR mixture was purified using the CleanPCR kit (CleanNA, Alphen aan den Rijn, The Netherlands) and 40 ng of the purified PCR product was used for semi-quantitative qPCRs, which were carried out with a Lightcycler 480 (Roche Diagnostics, Brussels, Belgium) and the Lightcycler 480 SYBR Green I Master kit (Roche). Specific primers (S1 Table) and GoTaq DNA polymerase (Promega, Fitchburg, WI, USA) were used for amplification of 40 ng of purified PCR product in 3 technical replicates with the following program: initial denaturation (95°C, 5 min), 40 amplification cycles (denaturation 95°C, 30 s; annealing 60°C, 30 s; elongation 72°C, 60 s), final extension (72°C, 5 min). Three technical replicates per sample were performed. As a reference, a short sequence originating from the AD of pDEST22 was used. For the relative quantification with the reference gene, qBase was used .
NGS data processing
The samples were sequenced by Illumina HiSeq 2000 125-bp paired-end reads. Data mapping and filtering were carried out through an in-house generated pipeline. To avoid sequencing artifacts such as read errors, primers, adapter and vector sequence contamination and PCR bias, a quality check was performed on the raw sequencing data. The quality control and trimming were performed with Trimmomatic . Subsequently, the processed sequencing reads were mapped against the Arabidopsis reference genome, downloaded from TAIR (The Arabidopsis Information Resource, http://arabidopsis.org), by TopHat , which uses the Bowtie program as an alignment engine. In addition, TopHat requires SAM (Sequence Alignment/Map) tools to be installed. The cufflinks program was used to count the expression of each gene and report it as raw reads and FPKM. To determine possible interactors, following steps were taken. Genes with less than six read counts were not considered. Zero counts in the negative control sample were replaced by 1 to avoid division by 0. These genes were flagged to keep track of these imputations. FPKM values were calculated for each gene in both the sample and the negative control. Subsequently, the SNR was calculated for each gene as the ratio of the sample FPKM value to the negative control FPKM value. Genes with an SNRNINJA/EMPTY or SNRTPL-N/EMPTY higher than an arbitrary threshold calculated based on the 99.5th percentiles of SNRNINJA/EMPTY and SNRTPL-N/EMPTY, were considered to be potential interaction partners of the bait gene.
Selection of baits
JAs are phytohormones that regulate the plant’s defense and modulate several developmental processes. The production of JAs via the oxylipin biosynthetic pathway leads to the accumulation of bioactive (+)-7-iso-jasmonoyl-L-isoleucine (JA-Ile). JA-Ile functions as a ligand between the F-box protein coronatine insensitive 1 (COI1) and the JA-ZIM (JAZ) repressor proteins, thereby promoting ubiquitination and subsequent proteasomal degradation of the JAZ proteins [33, 34]. Together with the TIFY8, peapod (PPD) and ZIM proteins, the JAZ proteins belong to the TIFY super-family [33, 35–38]. A key regulator in JA signaling in A. thaliana is the basic helix-loop-helix (bHLH) TF MYC2, encoded by the locus AT1G32640 [39, 40]. In the absence of JA-Ile, MYC2 can physically interact with the JAZ proteins via the Jas motif, which in turn recruit the transcriptional repressor TPL and TPL-related proteins (TRPs) through the adaptor protein NINJA . NINJA acts as a transcriptional repressor that harbors an intrinsic TPL-binding ETHYLENE RESPONSE FACTOR (ERF)-associated amphiphilic repression (EAR) motif mediating its activity (Fig 1) . NINJA can also interact with non-JAZ TIFY proteins, demonstrating its role in processes other than JA signaling [18, 35, 38, 41, 42]. Likewise, TPL is associated with various cellular processes through its capacity to interact with a compendium of diverse proteins [18–25, 38]. For instance, TPL can bind to PEAPOD proteins through the adapter proteins KIX8 and KIX9 to negatively regulate meristemoidal division in A. thaliana . A role for TPL modulating brassinazole resistant 1 (BZR1)-regulated cell elongation and brassinosteroid-mediated control of shoot boundaries and root meristem development through interaction with the TF bri1-ems-suppressor 1 (BES1) has been described [23, 25]. TPL can also be recruited by CC-type glutaredoxins to target TGA-dependent promoters to control development- and stress-associated processes.
(A) In the absence of JAs, bHLH-type MYC TFs interact with the Jas domain of JAZ proteins that in turn interact with NINJA via their ZIM domain. The EAR motif of NINJA is essential for recruitment of the TPL co-repressors through the TPL domain (TPD). (B) In the presence of JA-Ile, JAZ proteins interact with the ubiquitin E3 ligase SCFCOI1 complex, leading to the proteasomal degradation of JAZs and consequent release of the NINJA–TPL complex from the MYC TFs, which leads to the transcriptional activation of JA-responsive genes by de-repressed MYC TFs.
Because various direct interactors have been described for both NINJA and TPL proteins and because these are currently still heavily investigated for potential novel roles and links with different signaling pathways and cellular processes, NINJA and TPL were chosen as ideal bait proteins to develop, establish and validate our Y2H-seq methodology. Notably, whereas we used the full-length ORF of NINJA as a bait, for TPL only the amino-terminal region (AA 1–188; TPL-N) was used as a bait because this domain contains the lissencephaly homologous (LisH) dimerization and C-terminal to LisH (CTLH) motifs, which are together required and sufficient for interaction with transcriptional repressors through their EAR motif.
The Y2H-seq flow-chart
An illustration of the general workflow of our Y2H-Seq strategy is given in Fig 2. As indicated above NINJA and TPL-N were used as baits and a Y2H cDNA library originating from A. thaliana AT7 suspension cells was used as prey.
The first checkpoint involves analysis of bait functionality through a Y2H assay with known interactors (preys). Alternatively, immunoblot analysis can be carried out in checkpoint 1 to validate protein expression of the fusion proteins in case no interactors are known of a particular bait prior to the screen. Subsequently, the actual Y2H screening with the functional bait is carried out by supertransformation of the bait yeast strain with the prey cDNA library. In checkpoint 2, the plasmids are extracted and Sanger sequenced for some of the colonies obtained on the selective plates. Next, pooling of all colonies on the plates is carried out, all plasmids from the pool are isolated in a single extraction and the inserts of the plasmids are amplified by PCR. In checkpoint 3, qPCR is performed to verify for enrichment of expected preys with a particular bait Y2H-seq PCR mix. For that, semi-quantitative qPCR analysis is carried out relative to the PCR product of a screen with an empty bait vector or the cDNA library itself. Finally, NGS and data analysis is performed to obtain a final list of putative interactors of the interested bait proteins.
After transformation of the Y2H reporter strain PJ69-4α with the bait plasmids, a first checkpoint is introduced, in which the bait strains were individually co-transformed with positive and negative control prey expression clones to verify functional expression of the baits, to exclude possible auto-activation and to corroborate binding with previously reported interaction partners (Fig 2). Next, the bait strains were used for Y2H-seq screening with the A. thaliana Y2H cDNA prey library. Simultaneously, a control screening was performed with the empty expression vector, which will hereafter be referred to as EMPTY.
Subsequent to five days of selective growth of the transformed yeast cells, the prey cDNA inserts of about ten individual yeast colonies per screen were Sanger-sequenced (Fig 2). This second checkpoint allowed us to confirm the retrieval of reported interactors as preys.
Subsequently, all yeast colonies that survived selective growth were pooled per screen and the cDNA inserts of the prey plasmid pools were amplified by PCR. A third checkpoint consisted of a qPCR analysis with specific primers for genes corresponding to known bait interactors, which allows to assess the representation of known interactors in both screens in a quantitative manner (Fig 2). Prey abundance was quantified relative to that in the A. thaliana Y2H cDNA library.
Upon complying the expectations of all three checkpoints, the amplicons of the pooled prey cDNA inserts were sequenced by NGS (Illumina HiSeq 2000 sequencing, 125-bp paired-end reads). The NGS-output was analyzed by an adapted RNA-Seq data processing pipeline, providing a quantitative selection of known and potentially new interactors of NINJA and TPL-N, using the EMPTY screen as control to eliminate false-positive interactions and to correct for the abundance of each prey represented by the Y2H cDNA library.
In a first checkpoint, we explore auto-activation and functionality of the bait strains. The bait strains were individually co-transformed with positive and negative control preys (Fig 3 and S2 Table) to determine the level of auto-activation of the bait strain and to check whether the bait protein is functionally expressed and consequently can bind previously reported interaction partners [18, 26, 38]. Alternatively, immunoblot analysis can be carried out to validate protein expression of the fusion proteins in case no interactors are known of a particular bait prior to the screen.
Y2H analysis of NINJA and TPL-N baits, fused to the DBD, and preys, fused to the AD, grown for 2 days on selective medium Synthetic Defined (SD)-Leu-Trp-His (-3). Transformed PJ69-4α yeast strains were also grown for 2 days on SD-Leu-Trp (-2) medium to confirm growth capacity. Direct interaction was confirmed between (A) NINJA and PPD1, JAZ1, JAZ2 and JAZ4, and (B) TPL-N and auxin/indole-3-acetic acid 17 (IAA17) and NINJA.
As expected, the binary interaction between the NINJA bait and the preys PPD1, JAZ1, JAZ2 and JAZ4 was confirmed (Fig 3A). Likewise, the TPL-N bait strain showed interaction with the preys auxin/indole-3-acetic acid 17 (IAA17) and NINJA (Fig 3B). Furthermore, neither of the bait strains exhibited auto-activation, which indicated that NINJA as well as TPL-N were functionally expressed in the bait strains.
In a second checkpoint, we evaluate the functionality of the Y2H-seq screening with bait strains by Sanger sequencing. For the actual Y2H screening, the bait strains were transformed with the A. thaliana Y2H cDNA prey library, followed by transformation efficiency assessment and five days of selective growth (Fig 4 and S2 Table).
(A-C) The EMPTY (A), NINJA (B), and TPL-N (C) Y2H-seq screenings were performed on selective SD-Leu-Trp-His + 5mM 3-AT plates. Per bait, a single transformation reaction is carried out, after which each transformation mix is plated on 3 plates for selection. A representative plate for each bait is shown.
A minimum transformation efficiency of 1 x 106 colony-forming units should be attained for a full Y2H cDNA library screening coverage . This benchmark was achieved for all Y2H screenings we performed (Table 1 and S1 Fig).
To ensure a full screening coverage of the A. thaliana Y2H cDNA library, screening of at least 1 x 106 yeast colonies is advised . Titers are based on the numbers of colony forming units on control plates as illustrated for the TPL-N screen in S1 Fig.
A minimum of ten individual colonies per screening were isolated, plasmids purified and the cDNA inserts of the prey plasmids Sanger-sequenced. In this second checkpoint, several known interactors could already be identified (Table 2). The nine sequences originating from the NINJA screening corresponded to two unique interaction partners that were previously described as NINJA interactors . Likewise, the 12 prey sequences that corresponded to potential interactors of TPL-N were derived from six different, all known interactors .
In a third checkpoint, semi-quantitative qPCR is carried out, as a complementary approach to evaluate the quality of a Y2H-seq screening. In this third checkpoint, the quality of the Y2H-seq screening was further assessed. All selectively grown yeast colonies were pooled per screening (Fig 2) and cDNA inserts of the prey plasmid pools were PCR-amplified with vector-specific primers (S1 Table). To examine whether potential interaction partners of the baits were overrepresented relative to the cDNA library control, a qPCR was performed using prey-specific qPCR primers (S1 Table). In the NINJA screen, compared to the control library, the genes encoding JAZ1, JAZ2, JAZ12, TIFY8 and PPD1 were overrepresented (Fig 5), in agreement with previous literature reports [18, 35]. Hence, this shows the value of this qPCR assay set-up as a final checkpoint before the actual Y2H-seq analysis, at least for baits with a limited set of known interactors.
JAZ1, JAZ2, JAZ12, TIFY8 and PPD1 were overrepresented in the PCR products of the NINJA screening compared to the PCR products of the A. thaliana cDNA library (Library). Statistical significance was determined by a Student’s t-test (***P<0.001).
In contrast to NINJA, TPL can interact with potentially hundreds of proteins . IAA30, the only protein identified in the checkpoint 2 that we also tested by qPCR, was enriched in the TPL-N pool. Conversely however, other EAR-motif containing proteins (from other families than the IAAs) that are known to interact with TPL, but were not readily identified in checkpoint 2, were not enriched (Fig 6, Table 2). Y2H cDNA library screenings are prone to false negatives, i.e. missing interactions, due among others to aberrant folding, clones with truncated genes or absence of the gene in the cDNA library. In the case of TPL-N, for example, the NINJA clone that is represented by the A. thaliana Y2H cDNA library was found to be truncated and missing the EAR domain necessary for binding with TPL-N.
Only IAA30 was overrepresented in the PCR products of the TPL-N screening compared to the A. thaliana cDNA library (Library) cDNA insert amplicons. Statistical significance was determined by a Student’s t-test (***P<0.001).
Beyond the checkpoints: NGS of the amplified prey cDNA inserts
The prey pool amplicons of the EMPTY, NINJA, and TPL-N screenings were used as input for NGS by Illumina HiSeq 2000 (125-bp paired-end reads). Here, we used a pipeline relying on TopHat for read mapping and Cufflinks for gene expression quantification. The method presented here aims to compare the NINJA and TPL-N Y2H-seq screens with the EMPTY control screen to enrich for specific interaction partners while maximally avoiding the retrieval of false-positive interactions.
First, a quality check was performed on the raw reads. Thereby, adapters, low-quality sequences and partial vector sequences were trimmed (EMPTY: 2,418,105 kept reads and 22,651 discarded; NINJA: 3,555,612 kept reads and 10,727 discarded; TPL-N: 2,908,652 kept reads and 33,094 discarded). Concomitantly, paired-end and orphan single-end reads were split. The processed reads were then mapped to the reference genome (TAIR10) using TopHat. To avoid overestimation of short genes, only one mate-pair per read was used for mapping. The resulting alignments were used as input for Cufflinks, which generates the raw expression quantification data for each of the analyzed raw sequencing files. For the subsequent analysis of the raw expression data, a Y2H-seq pipeline was drafted in R-studio.
Mapped genes in the TPL-N and NINJA Y2H screenings with raw read counts less than six were eliminated. Genes in the EMPTY screening that had no raw read counts were given an arbitrary value of 1 and flagged as imputed. After calculating the Fragments Per Kilobase of Exon Per Million Fragments Mapped (FPKM) values, the signal to noise ratio (SNR) was defined for NINJA and TPL-N compared to EMPTY. Intuitively, one would expect little NGS data to be derived from the EMPTY screening, given that no yeast cells survived selective growth (Fig 4). However, this was not the case and can be explained by the pooling method employed here: ‘scraping’ all yeast cells from the selection plates includes also dead or arrested yeast cells that may still contain intact prey plasmids. Hence, genes with a high representation in the cDNA library, and thus genes with a high expression level in Arabidopsis suspension cells, are identified in the EMPTY NGS data set.
Next, to allow setting relevant arbitrary thresholds, the 99.5th percentiles of SNRNINJA/EMPTY and SNRTPL-N/EMPTY were calculated, leading to thresholds of 7.2 for NINJA and 6.0 for TPL-N screenings, respectively (Tables 3 and 4). With this first threshold, overall, from the 71 potential interactors of NINJA, seven were known to be interactors [18, 35], whereas for TPL-N, 12 out of the 51 potential interactors had been previously reported .
Genes with SNRNINJA/EMPTY>7.2 were retained, listed and ranked from high to low SNR. Flagged genes are italicized. Previously reported interactors of NINJA are indicated in bold. Potential interactors that were tested for binary interaction in further validation assays are underlined.
Genes with SNRTPL-N/EMPTY>6 were retained, listed and ranked from high to low SNR. Flagged genes are italicized. Previously reported interactors of TPL are indicated in bold. Potential interactors that were tested for binary interaction in further validation assays are underlined. A ‘Y’ in bold font indicates the presence of an EAR domain in the wrong frame or in an untranslated region of the gene.
When super-implying a second threshold, in this case of >100 on the FPKMNINJA and FPKMTPL-N values, nearly all retained interactors were either reported already or very plausible. Indeed, in the case of NINJA, only TIFY-domain containing proteins were retained (Fig 7, Table 3). In the case of TPL-N, all but one of the retained proteins using this second threshold contained an EAR-motif , the conventional TPL recruitment domain (Fig 8, Table 4), and also includes proteins not yet individually reported as TPL-interactors, but belonging to multigene families such as the AGAMOUS-LIKE (AGL) and INDOLE-3-ACETIC ACID INDUCIBLE (IAA) proteins, many members of which have already been reported as TPL interactors [19, 26]. Together, this demonstrates the robustness and potential of the designed Y2H-seq platform.
The depth of the NGS coverage for each gene, visualized by the coverage track, is aligned to the gene model. Coding sequences are represented by thick black boxes, 5’ and 3’ untranslated regions by thin black boxes and introns by thin black lines, respectively. The grey boxes in the gene model correspond to the EAR motif.
The depth of the NGS coverage for each gene, visualized by the coverage track, is aligned to the gene model. Coding sequences are represented by thick black boxes, 5’ and 3’ untranslated regions by thin black boxes and introns by thin black lines, respectively. The grey boxes in the gene model correspond to the EAR motif.
To assess whether the retrieved preys that did not pass our stringent cut-offs, nonetheless represent true potential interactors of NINJA and N-TPL, additional Y2H experiments were carried out. For NINJA, the first four potential interaction partners with SNRNINJA/EMPTY>7.2 and FPKMNINJA<100 were tested in a binary Y2H assay (Table 3 and Fig 9). However, none of them showed interaction with NINJA, indicating that the installed threshold of SNRNINJA/EMPTY>7.2 and FPKMNINJA >100 served as a good selection criterion, at least for NINJA.
Y2H analysis of NINJA, fused to the DBD, and potential interaction partners, fused to the AD of the GAL4 TF (in the pDEST22 vector), grown on selective medium SD-Leu-Trp-His (-3). Transformed PJ69-4α yeast strains were also grown on SD-Leu-Trp (-2) medium confirm growth capacity. No direct interactions could be observed for potential preys identified in the Y2H-seq with scoring values below the threshold of SNRNINJA/EMPTY>7.2 and FPKMNINJA values>100.
In the retained list of potential interactors using threshold SNRTPL-N/EMPTY>6 with FPKMN-TPL>100 values, the one candidate ATCKA2 (AT3G50000) that did not contain an EAR-domain was tested for direct interaction with N-TPL in a Y2H assay, besides five candidates with FPKMN-TPL<100 (Table 4 and Fig 10). For the latter set, we specifically avoided to pick candidates from the AGL and IAA families, which are most likely true, but less abundant interactors, and chose both candidates with and without an EAR domain. ATCKA2 interaction with N-TPL could not be confirmed with binary Y2H, suggesting it was a false positive caused by the Y2H-seq pipeline. In contrast however, interaction between TPL-N and the five other candidates were all confirmed, demonstrating that they do not represent artefacts of the Y2H-seq methodology and may be true interactors. Hence, in contrast to NINJA, this implicates that the arbitrary threshold of SNRTPL-N/EMPTY>6 with FPKMN-TPL>100 was too stringent for N-TPL. Perhaps this may be due to the pleiotropic function of TPL, which has an exceptionally high number of protein interactors, often from multigene families. For proteins such as NINJA, with a more defined role and a well-defined set of interactors, a stricter threshold may be justified. For proteins such as TPL, one may need to be more relaxed in determining candidate interactors. As exemplified here, this leads to the identification of potential novel interactors from gene families previously unreported to be capable of interacting with TPL, including EAR-domain containing proteins such as the RING/U-box protein AT3G05670, or proteins that do not contain an EAR domain such as the putative TF AT3G54390, the homeodomain TF AT2G40260 and the bHLH TF AT3G19860 (Table 4 and Fig 10).
Y2H analysis of N-TPL, fused to the DBD, and potential interaction partners, fused to the AD of the GAL4 TF (in the pDEST22 vector), grown on selective medium SD-Leu-Trp-His (-3). Co-transformed PJ69-4α yeast strains were also grown on SD-Leu-Trp (-2) medium to confirm growth capacity. No direct interaction was confirmed between ATCKA2 encoded by AT3G5000 and N-TPL, in contrast to the interactions with all other potential interactors selected from the list with a threshold of SNRNINJA/EMPTY>7.2 and FPKMNINJA<100 values. * indicates a truncated version of the protein, as it was present in the Y2H cDNA library.
Here, we present a newly designed Y2H-seq strategy to identify PPIs, which enables exploiting the full qualitative and quantitative potential of Y2H library screenings in an unprecedented way. Our method circumvents multiple shortcomings of a conventional Y2H library screening. As such, for instance consumable and DNA sequencing costs are significantly cut by using a pool-based NGS-strategy instead of the conventional isolation, manipulation and sequencing of individual yeast clones that survive the screening selection. Moreover, a higher PPI coverage can be achieved in our Y2H-seq strategy by increasing library titers. A factor that will determine the impact of future Y2H-seq screenings more than ever, will be the choice and the quality of the Y2H cDNA library. For instance, full-length protein libraries may mask PPIs by steric hindrance, hence the use of more complex Y2H cDNA libraries encoding protein fragments as well as full-length proteins may now be considered, and screened in one effort, which could lead to a more comprehensive coverage of the PPI space. The utility of fragment-based Y2H approaches has previously been demonstrated [45, 46]. By playing with sample preparations to generate cDNA libraries, one could increase the genome coverage with no extra effort in the Y2H screening. For instance, different organs from a single plant, different developmental stages of a single organ, or explants subjected to different environmental cues or chemicals can now be pooled in a single cDNA library. This will allow expanding the number of genes screened in a single event, as well as different versions of the same gene, e.g. following expression after alternative splicing or translation start events. As such, the Y2H-seq strategy can provide a more effective way to map the PPI potential of a bait protein, allowing further exploration of biological pathways and their regulation. Furthermore, the use of cDNA libraries makes it possible to identify novel interaction partners of organisms of which the genome has not been fully annotated yet, unlike the use of ORF libraries based on known and completely fixed gene models.
The Y2H-seq strategy implements a quantitative readout system, with a straightforward and adaptable scoring procedure. The comparison of quantitative NGS readouts from a Y2H-seq screen with a particular bait protein to those from a control screen with the ‘empty’ control vectors is essential to discriminate potential true interactors from background from the library, for instance from abundantly expressed genes. Likewise, exhaustive screening of a particular cDNA library with different baits may allow comparison of the readouts of the different screens to further discriminate specific interactors screenings from ‘sticky’ proteins and thereby further increase the efficiency of the method. Indeed, as is also the case with other PPI discovery methods, such as tandem affinity purification [47, 48], a specific ‘blacklist’ of returning Y2H-seq interactors for each cDNA library can be composed by marking common interactors of seemingly unrelated bait proteins. This may allow fine-tuning the thresholds to be set up in the filtering of the Y2H-seq NGS data, and thereby enable determining robust priority lists and reducing laborious and needless downstream validation assays to a minimum.
Finally, this strategy can also easily be extended to Y1H screenings, for which the same cDNA library could be screened, but for which considerably higher false-positive rates are typically obtained as compared to Y2H screenings [49, 50]. As such, we anticipate that the cost and labor reduction along with the increased detection and quantification potential of our Y2H-seq strategy can give an important upgrade to this long-existing, but far from fully exploited screening tool.
S1 Fig. Control plates used to determine the number of colony-forming units to calculate the efficiency of transformation and library titer for the TPL-N screening.
S1 Table. Primers used in this study.
S2 Table. Yeast strains generated in this study.
We thank Annick Bleys for helping to prepare the manuscript and Frederik Coppens for helpful advice on NGS. This research was supported by the Research Foundation Flanders by a research project grant to A.G. (G004515N) and postdoctoral fellowships to J.P. and L.P., the Program Ciências Sem Fronteiras for a predoctoral fellowship to B.R. (Grant 201135/2014-0), the BEC.AR program for overseas training of Argentine professionals in the fields of science, technology and productive innovation for a scholarship to M.P. and the Special Research Fund from Ghent University (project O1J14813).
- 1. Stynen B., Tournu H., Tavernier J., and Van Dijck P., Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews, 2012. 76(2): p. 331–382. pmid:22688816
- 2. Xing S., Wallmeroth N., Berendzen K.W., and Grefen C., Techniques for the analysis of protein-protein interactions in vivo. Plant physiology, 2016. 171(2): p. 727–758. pmid:27208310
- 3. Dunham W.H., Mullin M., and Gingras A.C., Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics, 2012. 12(10): p. 1576–90. pmid:22611051
- 4. Bensimon A., Heck A.J.R., and Aebersold R., Mass spectrometry–based proteomics and network biology. Annual review of biochemistry, 2012. 81: p. 379–405. pmid:22439968
- 5. Dedecker M., Van Leene J., and De Jaeger G., Unravelling plant molecular machineries through affinity purification coupled to mass spectrometry. Current Opinion in Plant Biology, 2015. 24: p. 1–9. pmid:25603557
- 6. Remy I. and Michnick S.W., Mapping biochemical networks with protein fragment complementation assays. Methods in Molecular Biology, 2015. 1278: p. 467–481. pmid:25859970
- 7. Kaushansky A., Allen J.E., Gordus A., Stiffler M.A., Karp E.S., Chang B.H., et al., Quantifying protein–protein interactions in high throughput using protein domain microarrays. Nature protocols, 2010. 5(4): p. 773–790. pmid:20360771
- 8. Johnsson N. and Varshavsky A., Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America, 1994. 91(22): p. 10340–10344. pmid:7937952
- 9. Kittanakom S., Chuk M., Wong V., Snyder J., Edmonds D., Lydakis A., et al., Analysis of membrane protein complexes using the split-ubiquitin membrane yeast two-hybrid system. Methods in Molecular Biology, 2009. 548: p. 247–271. pmid:19521829
- 10. Müller J. and Johnsson N., Split-ubiquitin and the split-protein sensors: chessman for the endgame. Chembiochem, 2008. 9(13): p. 2029–2038. pmid:18677736
- 11. Fields S. and Song O.-k., A novel genetic system to detect protein–protein interactions. Nature, 1989. 340(6230): p. 245–246. pmid:2547163
- 12. Häuser R., Stellberger T., Rajagopala S.V., and Uetz P., Matrix-based yeast two-hybrid screen strategies and comparison of systems. Methods in Molecular Biology, 2012. 812: p. 1–20. pmid:22218851
- 13. Yu H., Tardivo L., Tam S., Weiner E., Gebreab F., Fan C., et al., Next-generation sequencing to generate interactome datasets. Nature methods, 2011. 8(6): p. 478–480. pmid:21516116
- 14. Weimann M., Grossmann A., Woodsmith J., Özkan Z., Birth P., Meierhofer D., et al., A Y2H-seq approach defines the human protein methyltransferase interactome. Nature methods, 2013. 10(4): p. 339–342. pmid:23455924
- 15. Yachie N., Petsalaki E., Mellor J.C., Weile J., Jacob Y., Verby M., et al., Pooled‐matrix protein interaction screens using Barcode Fusion Genetics. Molecular systems biology, 2016. 12(4): p. 863. pmid:27107012
- 16. Lewis J.D., Wan J., Ford R., Gong Y., Fung P., Nahal H., et al., Quantitative Interactor Screening with next-generation Sequencing (QIS-Seq) identifies Arabidopsis thaliana MLO2 as a target of the Pseudomonas syringae type III effector HopZ2. BMC Genomics, 2012. 13: p. 8. pmid:22230763
- 17. Trigg S.A., Garza R.M., MacWilliams A., Nery J.R., Bartlett A., Castanon R., et al., CrY2H-seq: a massively multiplexed assay for deep-coverage interactome mapping. Nature methods, 2017. 14(8): p. 819–825. pmid:28650476
- 18. Pauwels L., Barbero G.F., Geerinck J., Tilleman S., Grunewald W., Cuéllar-Pérez A., et al., NINJA connects the co-repressor TOPLESS to jasmonate signalling. Nature, 2010. 464(7289): p. 788–791. pmid:20360743
- 19. Causier B., Ashworth M., Guo W., and Davies B., The TOPLESS interactome: a framework for gene repression in Arabidopsis. Plant Physiology, 2012. 158(1): p. 423–438. pmid:22065421
- 20. Lynch T.J., Erickson B.J., Miller D.R., and Finkelstein R.R., ABI5-binding proteins (AFPs) alter transcription of ABA-induced genes via a variety of interactions with chromatin modifiers. Plant molecular biology, 2017. 93(4–5): p. 403–418. pmid:27942958
- 21. Uhrig J.F., Huang L.J., Barghahn S., Willmer M., Thurow C., and C., CC-type glutaredoxins recruit the transcriptional co-repressor TOPLESS to TGA-dependent target promoters in Arabidopsis thaliana. Biochimica et Biophysica Acta—Gene Regulatory Mechanisms, 2017. 1860(2): p. 218–226. pmid:27838237
- 22. Goralogia G.S., Liu T.K., Zhao L., Panipinto P.M., Groover E.D., Bains Y.S., et al., CYCLING DOF FACTOR 1 represses transcription through the TOPLESS co‐repressor to control photoperiodic flowering in Arabidopsis. Plant Journal, 2017. 92(2): p. 244–262. pmid:28752516
- 23. Espinosa-Ruiz A., Martínez C., de Lucas M., Fàbregas N., Bosch N., Caño-Delgado A.I., et al., TOPLESS mediates brassinosteroid control of shoot boundaries and root meristem development in Arabidopsis thaliana. Development, 2017. 144(9): p. 1619–1628. pmid:28320734
- 24. Graeff M., Straub D., Eguen T., Dolde U., Rodrigues V., Brandt R., et al., MicroProtein-mediated recruitment of CONSTANS into a TOPLESS trimeric complex represses flowering in Arabidopsis. PLoS genetics, 2016. 12(3): p. e1005959. pmid:27015278
- 25. Oh E., Zhu J.-Y., Ryu H., Hwang I., and Wang Z.-Y., TOPLESS mediates brassinosteroid-induced transcriptional repression through interaction with BZR1. Nature communications, 2014. 5: p. 4140. pmid:24938363
- 26. Szemenyei H., Hannon M., and Long J.A., TOPLESS mediates auxin-dependent transcriptional repression during Arabidopsis embryogenesis. Science, 2008. 319(5868): p. 1384–1386. pmid:18258861
- 27. Nagels Durand A., Moses T., De Clercq R., Goossens A., and Pauwels L., A MultiSite GatewayTM vector set for the functional analysis of genes in the model Saccharomyces cerevisiae. BMC Molecular Biology, 2012. 13(1): p. 30.
- 28. Cuéllar Pérez A., Pauwels L., De Clercq R., and Goossens A., Yeast two-hybrid analysis of jasmonate signaling proteins. Methods in Molecular Biology, 2013. 1011: p. 173–185. pmid:23615996
- 29. James P., Halladay J., and Craig E.A., Genomic libraries and a host strain designed for highly efficient two-hybrid selection in yeast. Genetics, 1996. 144(4): p. 1425–1436. pmid:8978031
- 30. Hellemans J., Mortier G., De Paepe A., Speleman F., and Vandesompele J., qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data. Genome Biology, 2007. 8(2): p. R19. pmid:17291332
- 31. Bolger A.M., Lohse M., and Usadel B., Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 2014. 30(15): p. 2114–20. pmid:24695404
Gnimpieba, E.Z., RNA-Seq workflow with TopHat and Cufflinks using Bioextract Server. Available at: http://www.myexperiment.org/workflows/3895.html?version=1., 2013.
- 33. Chini A., Gimenez-Ibanez S., Goossens A., and Solano R., Redundancy and specificity in jasmonate signalling. Current opinion in plant biology, 2016. 33: p. 147–156. pmid:27490895
- 34. Goossens J., Fernández-Calvo P., Schweizer F., and Goossens A., Jasmonates: signal transduction components and their roles in environmental stress responses. Plant molecular biology, 2016. 91(6): p. 673–689. pmid:27086135
- 35. Cuéllar Pérez A., Nagels Durand A., Vanden Bossche R., De Clercq R., Persiau G., Van Wees S.C., et al., The non-JAZ TIFY protein TIFY8 from Arabidopsis thaliana is a transcriptional repressor. PLoS ONE, 2014. 9(1): p. e84891. pmid:24416306
- 36. Chico J.M., Chini A., Fonseca S., and Solano R., JAZ repressors set the rhythm in jasmonate signaling. Current Opinion in Plant Biology, 2008. 11(5): p. 486–494. pmid:18653378
- 37. Vanholme B., Grunewald W., Bateman A., Kohchi T., and Gheysen G., The tify family previously known as ZIM. Trends in Plant Science, 2007. 12(6): p. 239–244. pmid:17499004
- 38. Gonzalez N., Pauwels L., Baekelandt A., De Milde L., Van Leene J., Besbrugge N., et al., A repressor protein complex regulates leaf growth in Arabidopsis. Plant Cell, 2015. 27(8): p. 2273–2287. pmid:26232487
- 39. Kazan K. and Manners J.M., MYC2: the master in action. Molecular Plant, 2013. 6(3): p. 686–703. pmid:23142764
- 40. Goossens J., Swinnen G., Vanden Bossche R., Pauwels L., and Goossens A., Change of a conserved amino acid in the MYC2 and MYC3 transcription factors leads to release of JAZ repression and increased activity. New Phytologist, 2015. 206(4): p. 1229–1237. pmid:25817565
- 41. Gasperini D., Chételat A., Acosta I.F., Goossens J., Pauwels L., Goossens A., et al., Multilayered organization of jasmonate signalling in the regulation of root growth. PLoS Genetics, 2015. 11(6): p. e1005300. pmid:26070206
- 42. Acosta I.F., Gasperini D., Chételat A., Stolz S., Santuari L., and Farmer E.E., Role of NINJA in root jasmonate signaling. Proceedings of the National Academy of Sciences of the United States of America, 2013. 110(38): p. 15473–15478. pmid:24003128
- 43. Gietz R.D., Yeast transformation by the LiAc/SS carrier DNA/PEG method. Methods in Molecular Biology, 2014. 1163: p. 33–44. pmid:24841298
- 44. Kazan K., Negative regulation of defence and stress genes by EAR-motif-containing repressors. Trends in Plant Science, 2006. 11(3): p. 109–112. pmid:16473545
- 45. Boxem M., Maliga Z., Klitgord N., Li N., Lemmens I., Mana M., et al., A protein domain-based interactome network for C. elegans early embryogenesis. Cell, 2008. 134(3): p. 534–545. pmid:18692475
- 46. Waaijers S., Koorman T., Kerver J., and Boxem M., Identification of human protein interaction domains using an ORFeome-based yeast two-hybrid fragment library. Journal of Proteome Research, 2013. 12(7): p. 3181–3192. pmid:23718855
- 47. Van Leene J., Eeckhout D., Cannoot B., De Winne N., Persiau G., Van De Slijke E., et al., An improved toolbox to unravel the plant cellular machinery by tandem affinity purification of Arabidopsis protein complexes. Nature Protocols, 2015. 10(1): p. 169–187. pmid:25521792
- 48. Goossens J., De Geyter N., Walton A., Eeckhout D., Mertens J., Pollier J., et al., Isolation of protein complexes from the model legume Medicago truncatula by tandem affinity purification in hairy root cultures. Plant Journal, 2016. 88(3): p. 476–489. pmid:27377668
- 49. Li J.J. and Herskowitz I., Isolation of ORC6, a component of the yeast origin recognition complex by a one-hybrid system. Science, 1993. 262(5141): p. 1870–1874. pmid:8266075
- 50. Wang M.M. and Reed R.R., Molecular cloning of the olfactory neuronal transcription factor Olf-1 by genetic selection in yeast. Nature, 1993. 364(6433): p. 121–6. pmid:8321284