Type B and type A influenza polymerases have evolved distinct binding interfaces to recruit the RNA polymerase II CTD

During annual influenza epidemics, influenza B viruses (IBVs) co-circulate with influenza A viruses (IAVs), can become predominant and cause severe morbidity and mortality. Phylogenetic analyses suggest that IAVs (primarily avian viruses) and IBVs (primarily human viruses) have diverged over long time scales. Identifying their common and distinctive features is an effective approach to increase knowledge about the molecular details of influenza infection. The virus-encoded RNA-dependent RNA polymerases (FluPolB and FluPolA) are PB1-PB2-PA heterotrimers that perform transcription and replication of the viral genome in the nucleus of infected cells. Initiation of viral mRNA synthesis requires a direct association of FluPol with the host RNA polymerase II (RNAP II), in particular the repetitive C-terminal domain (CTD) of the major RNAP II subunit, to enable “cap-snatching” whereby 5’-capped oligomers derived from nascent RNAP II transcripts are pirated to prime viral transcription. Here, we present the first high-resolution co-crystal structure of FluPolB bound to a CTD mimicking peptide at a binding site crossing from PA to PB2. By performing structure-based mutagenesis of FluPolB and FluPolA followed by a systematic investigation of FluPol-CTD binding, FluPol activity and viral phenotype, we demonstrate that IBVs and IAVs have evolved distinct binding interfaces to recruit the RNAP II CTD, despite the CTD sequence being highly conserved across host species. We find that the PB2 627 subdomain, a major determinant of FluPol-host cell interactions and IAV host-range, is involved in CTD-binding for IBVs but not for IAVs, and we show that FluPolB and FluPolA bind to the host RNAP II independently of the CTD. 
Altogether, our results suggest that the CTD-binding modes of IAV and IBV may represent avian- and human-optimized binding modes, respectively, and that their divergent evolution was shaped by the broader interaction network between the FluPol and the host transcriptional machinery.


Introduction
Influenza viruses are members of the Orthomyxoviridae family and are classified into four genera: influenza A, B, C and D viruses. Influenza A viruses (IAVs) and influenza B viruses (IBVs) are of public health importance, as they co-circulate in humans with a seasonal epidemic pattern and cause significant morbidity and mortality, especially in the aged or immunocompromised population [1]. IBV infections account for an estimated 23% of all influenza cases [2], can become predominant during annual influenza epidemics, and can cause severe disease in children [3]. IBVs have received less attention because, unlike IAVs, which continuously circulate in a wide range of birds and mammalian species [4], they have no known potential to cause pandemics. Based on sequence analysis of the viral hemagglutinin, the evolutionary divergence between IBVs and IAVs was estimated to have occurred about 4000 years ago [5]. The recent identification of IBV-like viruses in non-mammalian vertebrate species suggests that IBVs and IAVs have actually diverged over much longer time scales [6].
IBVs and IAVs share the same genome organization of eight single-stranded negative-sense RNA segments, and major features of the viral replication cycle such as transcription and replication of the viral genome in the nucleus of infected cells. However, their genes have undergone functional divergence, as reflected notably by the lack of intertypic genetic reassortment [7]. Identifying common and distinctive features of IBVs and IAVs is an effective approach to improve our understanding of the molecular mechanisms of influenza infection and our ability to fight influenza disease.
The genomic RNA segments of IAVs and IBVs are organized into viral ribonucleoprotein complexes (vRNPs) [8]. In the vRNP, the 5' and 3' terminal viral RNA sequences are associated with one copy of the RNA-dependent RNA polymerase complex (FluPol) while the RNA is covered by multiple copies of the viral nucleoprotein (NP) [9][10][11]. FluPol is a heterotrimer composed of PB1 (polymerase basic protein 1), PB2 (polymerase basic protein 2), and PA (polymerase acidic protein) [12], which replicates and transcribes the viral RNA in the nucleus of infected host cells. Replication is a primer-independent two-step process, which relies on de novo initiation by FluPol [13,14]. In contrast, viral transcription is primer-dependent and results in the synthesis of 5' capped and 3' polyadenylated mRNAs, which are translated by the host translation machinery [15,16]. Polyadenylation is achieved by stuttering of FluPol at a 5' proximal oligo(U) stretch present on the genomic RNA [17,18]. In contrast to other RNA virus polymerases, FluPol cannot synthesize 5' cap structures [19]. In a process referred to as cap-snatching [20], FluPol binds the 5' cap of nascent host RNA polymerase II (RNAP II) transcripts by the PB2 cap-binding domain. Then, the PA endonuclease domain [21] cleaves 10-15 nts downstream of the 5' cap, thereby generating primers that are used by FluPol to initiate transcription [18,19,22].
To perform cap-snatching, FluPol needs access to nascent capped RNAP II-derived RNAs, which represents a challenge as host cap structures are rapidly sequestered co-transcriptionally by the cap-binding complex [23]. The cellular RNAP II consists of 12 subunits [24], and the largest subunit (RPB1) is characterised by a unique long unstructured C-terminal domain (CTD) which in mammals consists of 52 repeats of the consensus sequence Tyr-Ser-Pro-Thr-Ser-Pro-Ser (Y₁S₂P₃T₄S₅P₆S₇). Post-translational modifications of the CTD during the transcription process control the spatiotemporal regulation of RNAP II transcription [25,26]. FluPol binds specifically to S5 phosphorylated CTD (CTD pS5) [27,28] and it was proposed that it targets RNAP II for cap-snatching in the paused elongation state, of which CTD pS5 is the hallmark modification [29][30][31].
Structural studies revealed bipartite CTD binding sites on the FluPol of influenza A, B and C viruses (FluPolA, FluPolB and FluPolC) with notable differences from one type to another [32,33]. However, the original crystal structure data for FluPolB were of insufficient resolution and only one of the CTD binding sites could be modelled, which prevented functional studies. In this study, we report the first high-resolution co-crystal structure of FluPolB bound to a CTD pS5 mimicking peptide that allows the modelling of both CTD-binding sites: one exclusively on PA, also observed on FluPolA, and another, crossing from PA to PB2, specific for FluPolB. We used these novel data to perform structure-guided mutagenesis of FluPolB and FluPolA, followed by a systematic investigation of cell-based CTD-binding, cell-based polymerase activity and plaque phenotype of recombinant viruses. Our findings demonstrate that type B and type A influenza polymerases have evolved distinct binding interfaces to recruit the RNAP II CTD, which is intriguing as the RNAP II CTD is highly conserved across influenza host species. We find that the PB2 627 subdomain, a major determinant of FluPol-host cell interactions and IAV host-range, is involved in CTD-binding for IBVs but not for IAVs. Finally, we provide evidence for additional FluPol-RNAP II interactions that do not involve the CTD.

Purification, crystallisation, data collection and structure determination of FluPolB with bound CTD peptide
Influenza B/Memphis/13/2003 polymerase, wild type or with the PA K135A mutation to eliminate endonuclease activity, was expressed and purified as described previously [22].
For crystals enabling high resolution visualisation of CTD binding in site 2B, FluPolB PA mutant K135A at 9 mg ml⁻¹ (35 μM) was mixed with 40 μM of nucleotides 1-13 of the vRNA 5' end (5'-pAGUAGUAACAAGA-3') and 1.8 mM 28-mer CTD peptide (YSPTpSPS)₄ in a buffer containing 50 mM HEPES pH 7.5, 500 mM NaCl, 5% glycerol, 2 mM TCEP. Hanging drops for crystallisation were set up at 20°C. Rod-shaped crystals growing up to 700 μm in length appeared one week after set-up in mother liquor containing 100 mM tri-sodium citrate and 13% PEG 3350 with a drop ratio of 0.5 μl + 2 μl protein to well solution. Crystals were cryo-protected with additional 20% glycerol and 1.8 mM CTD peptide in mother liquor and flash-frozen in liquid nitrogen. Data were collected on ESRF beamline ID29 and integrated with an ellipsoidal mask using AUTOPROC/STARANISO to an anisotropic resolution of 2.42-2.95 Å. The structure was solved using molecular replacement with PHASER [34] using PDB:5FMZ as model [35]. The model was iteratively corrected and refined using COOT [36] and REFMAC5 [37] and quality-controlled using MOLPROBITY [38]. See Table 1 for data collection and refinement statistics.

PLOS PATHOGENS
Evolutionary divergence of influenza polymerase/host RNAPII interface

with a drop ratio of 1 μl + 2 μl protein to well solution. The drops were soaked with 840 μM CTD peptide for 17 days. Crystals were cryo-protected with an additional 30% glycerol and 885 μM peptide in mother liquor and flash-frozen in liquid nitrogen. Data were collected on ESRF beamline ID30A1 (MASSIF) and processed and refined as described above, using PDB:5MSG as model for molecular replacement. See Table 1.

Structure determination of FluPolA (H7N9) core with bound CTD peptide
The core of influenza A/Zhejiang/DTID-ZJU01/2013(H7N9) polymerase comprising PA 201-716, PB1 full-length, PB2 1-127 was expressed and purified from insect cells as described previously [18]. A/H7N9 polymerase core at a concentration of 9 mg/ml was co-crystallised with 60 μM of a 12-mer of the vRNA 5' end in sitting drops at 4°C in conditions of 0.1 M Tris pH 7.0, 13% PEG 8K, 0.2 M MgCl₂, 0.1 M guanidine hydrochloride with drop mixing ratios of 1:2 (protein:well). Crystals grew typically within 4-5 days and diffracted to around 3.5 Å resolution. A four-repeat pS5 CTD mimicking peptide (Tyr-Ser-Pro-Thr-pSer-Pro-Ser)₄ was soaked into existing crystals at a concentration of ~2 mM over a period of 24 h. Data were collected on ESRF beamline ID29 and processed and refined as described above, using the previously described apo-H7N9 core structure ([18,39], PDB:6TU5) as model for molecular replacement. See Table 1 for data collection and refinement statistics. The PDB accession codes for the new protein structures are provided in Table 1: 7Z42, 7Z43 and 7Z4O.
The resulting amplicon was cloned in frame downstream of the G2 sequence into the pCI vector (G2-CTD). A sequence in which each CTD serine 5 residue was replaced by an alanine was ordered as a synthetic gene (GenScript) and subcloned in place of the wild-type CTD sequence into the G2-CTD construct (G2-CTD-S5A). The pCI-G2-NUP62 plasmid was described previously [44]. Mutations were introduced by an adapted QuikChange site-directed mutagenesis (Agilent Technologies) protocol [45]. Primers and plasmid sequences are available upon request.
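As an illustration of the serine 5 to alanine substitution described above, the following Python sketch builds an idealised CTD from consensus heptads and replaces position 5 of each repeat. It does not reproduce the sequence of the synthetic gene: the consensus-only CTD, the function names and the use of 52 repeats (the mammalian repeat number) are assumptions made for the example, and the natural CTD contains non-consensus repeats that the sketch ignores.

```python
# Hypothetical helper names; idealised consensus CTD, not the real RPB1 sequence.
CONSENSUS_HEPTAD = "YSPTSPS"  # Y1 S2 P3 T4 S5 P6 S7


def consensus_ctd(n_repeats: int) -> str:
    """Return an idealised CTD made of n_repeats consensus heptads."""
    return CONSENSUS_HEPTAD * n_repeats


def s5a(ctd: str) -> str:
    """Replace serine 5 of every heptad with alanine (heptads read from residue 1)."""
    residues = list(ctd)
    # Position 5 of each heptad has 0-based index 4, 11, 18, ...
    for i in range(4, len(residues), 7):
        if residues[i] == "S":
            residues[i] = "A"
    return "".join(residues)


wt_ctd = consensus_ctd(52)   # assumption: 52 consensus repeats
s5a_ctd = s5a(wt_ctd)
print(s5a_ctd[:14])          # first two mutated heptads
```

Only position 5 is changed, so serine 2 and serine 7 remain available for phosphorylation in the mutated sequence.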

Protein complementation and minigenome assays
HEK-293T cells were seeded in 96-well white opaque plates (Greiner Bio-One) the day before transfection. For the split-luciferase complementation assays, cells were co-transfected in technical triplicates with 25 ng plasmid encoding the polymerase subunits PB2, PB1 and PA (either PB2-G1 or PA-G1, respectively) and 100 ng of the G2-tagged targets (CTD, RPB1 or RPB2, respectively) using polyethyleneimine (PEI-max, #24765-1 Polysciences Inc). When indicated, the CDK7 inhibitor BS-181-HC (Tocris Bioscience) was added 24 hours post-transfection (hpt) at a final concentration of 20 μM for 1 h. DMSO 0.2% was used as a control. Cells were lysed 20-24 hpt in Renilla lysis buffer (Promega) for 45 min at room temperature under steady shaking (650 rpm) and the Gaussia princeps luciferase enzymatic activity was measured on a Centro XS LB960 microplate luminometer (Berthold Technologies, reading time 10 s after injection of 50 μl Renilla luciferase reagent (Promega)). The Normalized Luminescence Ratios (NLRs) were calculated as follows: the luminescence activity (Relative Light Units or RLU) measured in cells co-transfected with the plasmids encoding the PB2/PA-G1 and G2-CTD/RPB1/RPB2 fusion proteins was divided by the sum of the luminescence activities (RLU) measured in control samples co-transfected with either the G2 and PB2/PA-G1 plasmids or the G1 and G2-CTD/RPB1/RPB2 plasmids. For the minigenome assays, cells were co-transfected in technical triplicates with 25 ng of each pcDNA3.1 PB2, PB1, PA, in conjunction with 50, 10 and 5 ng of the pCI-NP, pPolI-Firefly and pTK-Renilla plasmids, respectively. Luciferase activities were measured 20-24 hpt using the Dual-Glo Luciferase Assay system (Promega) according to the manufacturer's instructions.
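The NLR calculation described above can be written as a short Python sketch. The function name, argument names and RLU values are hypothetical and serve only to make the ratio explicit; how technical triplicates are averaged before this step is not specified here and is left to the caller.

```python
def normalized_luminescence_ratio(rlu_pair: float,
                                  rlu_ctrl_g2: float,
                                  rlu_ctrl_g1: float) -> float:
    """NLR = RLU(pair) / (RLU(control 1) + RLU(control 2)).

    rlu_pair    : cells expressing FluPol-G1 together with the G2-tagged target
    rlu_ctrl_g2 : control cells expressing FluPol-G1 with untagged G2
    rlu_ctrl_g1 : control cells expressing G1 with the G2-tagged target
    """
    background = rlu_ctrl_g2 + rlu_ctrl_g1
    if background <= 0:
        raise ValueError("summed control luminescence must be positive")
    return rlu_pair / background


# Hypothetical RLU readings, for illustration only.
example_nlr = normalized_luminescence_ratio(12000.0, 300.0, 500.0)
print(example_nlr)
```

Dividing by the sum of both controls means that an NLR well above 1 indicates a signal exceeding what either fusion protein produces with the unfused complementary fragment.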

Production and characterisation of recombinant viruses
The recombinant viruses were produced by transfection of a co-culture of HEK-293T and MDCK cells as described previously [40,41]. The reverse genetics supernatants were titrated on MDCK cells in a standard plaque assay as described before [47]. Plaque diameters were measured upon staining with crystal violet using Fiji [48].

In vitro endonuclease and transcription activity assays
RNA for the activity assays was produced in vitro with T7 polymerase. The recombinant polymerases used, corresponding to A/little yellow-shouldered bat/Guatemala/060/2010 and B/Memphis/13/2003, were purified as previously described [42]. 23 nt RNA (5'-GAAUCUAUACAUAAAGACCAGGC-3') was capped with vaccinia capping enzyme and 2'-O-methyltransferase ) and 250 μM NTP mix (ThermoFisher). 50 μM CTD peptides were added at concentrations corresponding to at least a 10-fold excess over the KD of the lowest measured affinity for a two-repeat peptide. Two- and four-repeat phosphoserine 5 (pS5) CTD peptides were purchased from Covalab and the six-repeat pS5 CTD peptide was synthesised at the Chemical Biology Core Facility at EMBL Heidelberg.

Reactions were incubated at 30°C for 30 min and quenched with RNA loading dye (formamide, 8 M urea, 0.1% SDS, 0.01% bromophenol blue, 0.01% xylene cyanol), supplemented with 50 mM EDTA and boiled at 95°C. The reaction products were separated on a 20% denaturing acrylamide gel (containing 8 M urea) in Tris-Borate-EDTA (TBE) buffer, exposed on a Storage Phosphor screen and recorded with a Typhoon reader. DECADE marker was used as ladder.
As the predicted RefSeq sequences available for the Gallus gallus (XP_040551262) and Anas platyrhynchos (XM_038172734) RPB1 subunits were only partial, we designed a targeted protein sequence assembly strategy based on RNA-seq and/or WGS SRA public data available for these two species. To obtain the Gallus gallus RPB1 complete sequence (1969 aa), we first aligned Illumina RNA-seq short reads (ERR2664216) on the human RefSeq curated protein sequence (NP_000928) using the DIAMOND algorithm [50], and then used the aligned reads for subsequent Trinity transcript assembly ("-longreads XP_040551262" option to use the partial sequence as a guide) followed by Transdecoder for the ORF prediction [51]. The Anas platyrhynchos RPB1 complete sequence (1970 aa) was obtained by aligning Illumina RNA-seq short reads (SRR10176883) and PACBIO long reads (SRR8718129, SRR8718130) on the JACEUL010000271.1 genomic scaffold using HISAT2 [52] and minimap2 [53], respectively, followed by Stringtie2 [54] with the "-mix" option to allow hybrid de novo gene assembly. The RPB1 coding sequences from Gallus gallus and Anas platyrhynchos are held on the Zenodo repository: https://doi.org/10.5281/zenodo.6467097. The CTD sequences were aligned with SnapGene 6.0 and visualised by Espript 3.0 [55].

Cocrystal structures reveal distinct CTD binding sites in FluPolB and FluPolA
Previous structural studies using a four repeat CTD pS5 peptide mimic (YSPTpSPS)₄ [32] revealed two distinct CTD binding sites on FluPolB, denoted site 1B and site 2B. Site 1B, exclusively on the PA subunit and in which the pS5 phosphate is bound by PA basic residues K631 and R634, is essentially the same as site 1A for bat influenza A polymerase [32] and is hereafter named site 1AB. Site 2B, which extends across the PA-PB2 interface, is unique to FluPolB and distinct from site 2A for FluPolA, which is again exclusive to the PA subunit [32]. However, the original crystal structure data for FluPolB were of insufficient resolution to construct a model for the CTD peptide in site 2B, or even to define its directionality. To overcome this limitation, we co-crystallised the four repeat pS5 peptide with influenza B/Memphis/13/2003 polymerase in a different P2₁ crystal form, previously used to obtain a structure with the 5' end of the vRNA [35], and measured anisotropic diffraction data to a resolution of 2.42-2.95 Å (Table 1). The resultant map, which contains two heterotrimers in the asymmetric unit, showed clear electron density in site 2B for both trimers (S1A Fig), into which an unambiguous model for the CTD peptide could be built (Figs 1A and S2A). No significant differences were observed between the two heterotrimers, either in overall structure (except for small displacements in the poorly defined endonuclease and cap-binding domains) or in the mode of CTD binding to site 2B, although the density for one of the binding sites was slightly better due to B-factor differences. Only very weak density for the CTD peptide is observed in site 1AB, perhaps because of competition with a phosphate bound at the position of the phosphoserine. To reconfirm that sites 1B and 2B could be occupied simultaneously, we re-crystallised full promoter-bound FluPolB with the CTD peptide in the original P3₂21 crystal form, but this time with a capped primer and at lower pH. Under these conditions, the extremity of the vRNA 3' end is in the RNA synthesis active site [56]. Anisotropic diffraction data to a resolution of 3.12-3.56 Å were measured and the resultant map showed clear electron density for the CTD peptide bound in both sites 1AB and 2B (Fig 1B and Table 1), as reported previously for this crystal form [32] but with slightly improved resolution.
Unexpectedly, the CTD peptides bound in site 1AB and site 2B are orientated such that they cannot be linked by the shortest path, as this would be between both N-termini, which are ~17 Å apart, whereas the straight-line distance between the C-terminus of site 1AB and the N-terminus of site 2B is ~36 (44) Å. These distances suggest that a minimum of 6, probably 7, heptad repeats would be required to occupy both sites contiguously (Fig 1B, dotted red line). This contrasts with the situation in FluPolA, where the peptide directionality in sites 1AB and 2A allows them to be linked by the shortest path, implying that four heptad repeats are sufficient to occupy both sites (Fig 1C) [32]. Three repeats (designated repeats a, b and c) of the CTD peptide (i.e. Y1aS2aP3aT4apS5aP6aS7a-Y1bS2bP3bT4bpS5bP6bS7b-Y1cS2cP3cT4cpS5cP6c) are visible in site 2B in both structures, including two well-defined phosphoserines (pS5a and pS5b). The N-terminal part of the CTD peptide (Y1a-S2b) forms a compact structure comprising two successive proline turns stabilised by four intra-peptide hydrogen bonds, with P3a stacked on Y1b and P6a stacked on PA/Y597 (Fig 1D). PB2 R134 partially stacks against the other side of the Y1b sidechain, whose hydroxyl group hydrogen bonds to the main-chain of PB2 I135. The phosphate of pS5a forms a strong salt-bridge with PA R608 as well as hydrogen bonding with S7a. FluPolB-specific PA R608 is in a four-residue insertion (606-GDRV-609) compared to FluPolA, with hydrogen bond interactions from PA D607 and N611 positioning the side-chain of PA Y597 under the CTD peptide. This configuration of residues seems specifically suited to accommodate the compactly folded CTD peptide. Interestingly, recently identified FluPolB-like polymerases from fish and amphibians [57] also possess the four-residue insertion in PA.
However, only in the Wuhan spiny-eel influenza virus polymerase, which is remarkably similar to human FluPolB, are all the functional residues Y597, D607, R608, N611 conserved [6,57]. The rest of the CTD peptide (P3b-T4c) has an extended conformation and lies across the PB2 627-domain (Fig 1D). To create the CTD binding surface requires concerted side chain reorientations of PB2 W553, M572 and W575 (Figs 1D and S2A), allowing P6b to pack on W553 and Y1c on M572 and L561, with its hydroxyl group hydrogen bonding to D571. PB2 K556 forms a salt bridge with pS5b.
Most functional studies on CTD are performed with human or avian influenza A polymerase, whereas CTD binding has only been structurally characterised for bat A/little yellow-shouldered bat/Guatemala/060/2010(H17N10) polymerase [32] and C/Johannesburg/1/1966 [33]. Although sequence alignments and mutational studies strongly suggest that the mode of CTD binding is conserved for all IAV-like polymerases [32], we attempted to confirm this by determining the structure of a CTD mimicking peptide bound to influenza A/Zhejiang/DTID-ZJU01/2013(H7N9) polymerase. Previously, we have reported crystals of the A/H7N9 core (PA 201-716, PB1 full-length, PB2 1-127) in the apo-state, which forms symmetrical dimers as described elsewhere [18,58,59]. Here, we soaked the four-repeat pS5 CTD peptide mimic into co-crystals of H7N9 core with the vRNA 5' hook. The crystals diffracted to a maximum resolution of 3.41 Å (Table 1) and again contain symmetrical dimers of the polymerase core (S2B Fig). We observed clear electron density, not only for the 5' hook, but also for the CTD peptide bound in site 2A (S1B and S2C Figs), bound essentially identically to what was previously seen for bat influenza A/H17N10 polymerase (S2D Fig). However, there was no CTD peptide bound in site 1AB. The most likely explanation for this is that in the symmetrical dimeric form of influenza A (core only, or full trimer), both polymerases are in the so-called 'dislocated' conformation [18] with an open active site. In particular, PA regions 425-452 and 586-618 are rotated by ~20°, compared to the active, monomeric promoter bound state (e.g. A/H3N2 polymerase structure, [58], PDB:6RR7). This particularly affects the position of key site 1AB binding site residues Y445, E449 and F612 (S2D

The FluPol-CTD interaction can be monitored using a cell-based luciferase complementation assay
To confirm the structural findings of this study and investigate the distinctive features of CTD binding sites in FluPolB and FluPolA in the cellular context, we set up a CTD-binding assay using the Gaussia princeps luciferase trans-complementing fragments (G1 and G2) [43]. The full-length CTD was fused to G2 and the SV40 nuclear localization signal (G2-CTD). PB2 or PA were fused to G1 at their C-terminus (PB2-G1 and PA-G1, respectively, schematically represented in red in Fig 2A and 2B) as FluPol was shown to retain activity when tagged at these sites [44]. Upon co-expression of G2-CTD and the three polymerase subunits (including PB2-G1 or PA-G1), a luminescence signal resulting from the FluPol-CTD interaction was measured, which was generally higher for FluPolB compared to FluPolA (Fig 2A and 2B). The interaction signal decreased when PB1 was omitted, in agreement with previous reports that the FluPol-CTD interaction depends on FluPol assembly [28], and it was independent of FluPol catalytic activity (Fig 2A and 2B, PB1 D444A-D445A mutant [60]). When key CTD-contacting residues of FluPolA were mutated (PA K289A and R638A [32]), the interaction signal was significantly decreased compared to PA wt (Fig 2C and 2D). To test whether the FluPol-CTD binding assay reflects the dependency on the phosphorylation of the CTD S5 moiety, all S5 residues of the CTD were mutated to alanine (schematically represented in Fig 2E), which prevented S5 phosphorylation as documented by western blot (Fig 2F and 2G, bottom). Although the wt and S5A CTD showed similar steady-state levels of expression, the binding of the S5A CTD to FluPolB and FluPolA was significantly decreased compared to wt CTD. Consistently, pharmacological inhibition of CDK7, which represents the major kinase for CTD S5 phosphorylation [61], specifically reduced FluPolA/B binding to the CTD but not to the FluPolA interaction partner NUP62 [44] (S6A and S6B Fig). Overall, these data demonstrate that the FluPol-CTD interaction can be accurately monitored in cells using our split-luciferase complementation assay conditions.

Fig 1 (legend, partial). [...] 6.0. Key residues for CTD binding are indicated in bold. Identical, similar and non-similar residues are highlighted in purple, light blue and orange, respectively. Grey boxes indicate residues that form a loop. Residues submitted to mutagenesis in this study are indicated with their numbers above (FluPolA) and below (FluPolB) the alignment, respectively. https://doi.org/10.1371/journal.ppat.1010328.g001

Fig 2 (legend, partial). [...] were C-terminally tagged in frame with G1. As controls, the wild-type (wt) PB1 was replaced by the catalytically inactive PB1 D444A D445A mutant (i) or was omitted (-). Luciferase activities were measured in cell lysates at 24 hpt. Normalised luciferase ratios (NLRs) were calculated as described in the Materials and Methods section. The data shown are the mean ± SD of at least three independent experiments performed in technical triplicates. ** p ≤ 0.002; *** p ≤ 0.001 (two-way ANOVA; Dunnett's multiple comparisons test). C-D. The CTD binding of FluPolA mutants PA K289A and R638A was investigated. HEK-293T cells were transfected as described in (A) and (B), respectively. Relative Light Units (RLUs) are expressed as percentages relative to the FluPolA PA wt. The data shown are the mean ± SD of three independent experiments performed in technical triplicates. ** p ≤ 0.002; *** p ≤ 0.001 (one-way ANOVA; Dunnett's multiple comparisons test). The dotted line labelled "Ctrl" indicates the background signal, i.e. the sum of the luminescence activities measured in control samples co-transfected with either the FluPol-G1 and G2 plasmids, or the G1 and G2-CTD plasmids. E. Schematic representation of the CTD constructs used in (F) and (G): the wild-type G2-CTD (wt, top) and the G2-CTD in which all serine 5 residues were replaced with an alanine (S5A, bottom). F-G. The interaction of the wt or the S5A mutated CTD with FluPolB (F) or FluPolA (G) was investigated by transient transfection in HEK-293T cells as described in (A-B). The data shown are the mean ± SD of four independent experiments performed in technical triplicates. * p ≤ 0.033; *** p ≤ 0.001 (two-way ANOVA; Sidak's multiple comparisons test). In parallel, cell lysates were analysed by western blot using antibodies specific for the pS5 or pS2 CTD, G. princeps luciferase (Gluc) and tubulin. The slow- and fast-migrating bands detected with the pS2 CTD antibody, which likely correspond to the hyperphosphorylated and hypophosphorylated forms of the CTD, respectively, are indicated by a star and a triangle, respectively. The smeared signal in the G2-S5A-CTD samples likely corresponds to the detection of phosphorylation intermediates. https://doi.org/10.1371/journal.ppat.1010328.g002
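The Fig 2 legend reports RLUs as percentages of the wt signal, summarised as mean ± SD over independent experiments performed in technical triplicates. One plausible way to compute such a summary is sketched below in Python. The order of operations (averaging technical triplicates first, normalising to wt within each experiment, then taking the sample SD across experiments), the function names and all numbers are assumptions made for illustration and are not taken from the study.

```python
from statistics import mean, stdev


def percent_of_wt(mutant_triplicate, wt_triplicate):
    """Mean mutant RLU as a percentage of the mean wt RLU within one experiment."""
    return 100.0 * mean(mutant_triplicate) / mean(wt_triplicate)


def summarize(experiments):
    """Return (mean, SD) of percent-of-wt values across independent experiments.

    experiments: list of (mutant_triplicate, wt_triplicate) tuples,
    one tuple per independent experiment.
    """
    per_experiment = [percent_of_wt(mut, wt) for mut, wt in experiments]
    return mean(per_experiment), stdev(per_experiment)


# Hypothetical RLU readings from three independent experiments.
hypothetical_experiments = [
    ([50, 50, 50], [100, 100, 100]),
    ([40, 40, 40], [100, 100, 100]),
    ([60, 60, 60], [100, 100, 100]),
]
avg, sd = summarize(hypothetical_experiments)
print(f"{avg:.1f} ± {sd:.1f} % of wt")
```

Normalising within each experiment before averaging removes day-to-day differences in absolute luminescence, so the SD reflects variation between independent experiments only.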

Structure-driven mutagenesis confirms FluPolB and FluPolA have distinct CTD binding modes on PA
To systematically assess in vivo FluPol-CTD binding, we mutated key residues forming the CTD binding sites in the FluPolB and/or FluPolA co-crystal structures and measured the impact of these mutations on CTD-binding using the split-luciferase complementation assay described above (Fig 2). In parallel, we investigated polymerase activity in a minireplicon assay using a viral-like Firefly luciferase reporter RNA, and we rescued recombinant mutant IBVs and IAVs and measured plaque diameters on reverse genetic supernatants as a read-out for viral growth capacity. The A/WSN/1933 residue nature and numbering are used in the text and figures, except when indicated.
The structure and key residues of the CTD binding site 1AB are conserved between FluPolB and FluPolA (Figs 1E and 3A). We mutated the pS5 interacting residues PA K635 and R638 to alanines. The mutations did not affect PA accumulation levels (Fig 3B) but significantly decreased in vivo binding to the full-length CTD for both FluPolB and FluPolA (Figs 3C and S7A), which is in line with biochemical data obtained in vitro with CTD-mimicking peptides [32]. Consistently, the corresponding recombinant mutant IBVs and IAVs were attenuated or could not be rescued (Fig 3D), and the activity of the mutant FluPols was generally reduced (Fig 3E). The CTD-binding and minireplicon data were consistent for the PA-K635A-R638A double mutant. However, the single mutants had a stronger impact on FluPolB than FluPolA in the CTD-binding assay whereas the opposite trend was observed in the minigenome assay (Fig 3C and 3E). The PA-R638A mutation, which decreased FluPolB CTD-binding (Fig 3C) and reduced viral growth (Fig 3D), even increased FluPolB activity in the minigenome assay (Fig 3E). We and others have previously documented discrepancies between FluPol activity as measured in a minigenome assay and viral growth capacity (e.g. [62,63]). The most likely explanation for our data is that each of the PA-K635A and PA-R638A mutations on its own has no or a mild effect on CTD-binding when FluPolB is incorporated into a transiently expressed vRNP.
CTD binding site 2A differs substantially between FluPolB and FluPolA (Figs 1E and 4A). We introduced mutations at residues PA K289, R454 and S420, which are critical for CTD binding to FluPolA and are not conserved in FluPolB, and we deleted the PA 550 loop, which buttresses the CTD in FluPolA and is considerably shortened in FluPolB. These modifications did not affect PA accumulation levels (Fig 4B) and specifically decreased in vivo CTD binding of FluPolA but not FluPolB (Figs 4C and S7B). Consistently, the PA R454A and S420E mutations in the IBV background (IBV numbering: K450A and K416E) did not impair viral growth (Fig 4D) nor did PA K450A affect FluPolB polymerase activity (Fig 4E). The PA S420E and PA 550 loop deletion impaired FluPolB activity (Fig 4E), indicating that they hinder a function of the polymerase besides CTD binding.

The PB2 627 domain is involved in CTD binding for FluPolB but not FluPolA
The key CTD binding residues and the 3D structure of site 2B are largely conserved between FluPolB and FluPolA (Figs 1E and 5A). However, CTD binding at site 2B has never been observed in vitro with FluPolA, and the inserted PA 608 loop (IBV numbering) which buttresses the CTD at the junction between the PA-Cter and PB2 627 domains in FluPolB is absent in FluPolA. The PA R608A mutation significantly decreased FluPolB CTD binding in our cell-based complementation assay (Fig 5B middle panel, no counterpart residue in FluPolA). Consistently, the corresponding recombinant mutant IBV could not be rescued upon reverse genetics (S8 Fig), and the PA R608A mutant FluPolB showed reduced polymerase activity (Fig 5B, right panel). We then mutated to alanines the residues W552 and R555 (W553 and R556 according to IBV numbering), which are located on the PB2 627 domain, make contact with the CTD pS5 in the FluPolB co-crystal and are conserved between IBVs and IAVs (Figs 1E and 5A). The mutations did not affect PB2 accumulation levels (Fig 5C) and either decreased (R555A) or increased (W552A) CTD binding of FluPolB, whereas they had no effect on FluPolA CTD binding (Figs 5D and S7C). We speculate that the W552A mutation (W553A according to FluPolB numbering) could make it easier for the FluPolB PB2 627 domain to adopt a CTD-bound conformation. To rule out any CTD binding activity on the FluPolA PB2 627 domain, we deleted the whole domain as described before [64]. The deletion strongly and specifically decreased CTD binding to FluPolB but not to FluPolA (Fig 5E). Nevertheless, single amino acid substitutions at residues PB2 W552 and R555 impaired viral growth and polymerase activity of FluPolB as well as FluPolA, although with weaker effects on FluPolA (Figs 1F and 5G). Given the multiple functions attributed to the PB2 627 domain [64], the most likely interpretation of our data is that residues on the PB2 627 domain contribute to the CTD recruitment exclusively for IBVs while they have overlapping CTD-unrelated functions for IBVs and IAVs.
We asked whether this major difference between FluPol B and FluPol A CTD binding modes results in different levels of transcriptional activation by CTD mimicking peptides in vitro. A model has been proposed for FluPol C in which the CTD stabilises a transcription-competent conformation by binding at the interface of PB1, P3 (PA equivalent), and the flexible PB2 C-terminus [33]. Our observations suggest that the same model could apply to FluPol B but not to FluPol A. Therefore, we tested the impact of pS5 CTD mimicking peptides of varying lengths (two, four, or six YSPTSPS repeats) on FluPol B and FluPol A in vitro transcriptional activity (Fig 6). The FluPol B in vitro endonuclease activity (Fig 6A, lane 4) and elongation activity (Fig 6A, lane 8) were increased in the presence of the six-repeat pS5 CTD mimicking peptide compared to the mock control (Fig 6A, lanes 1 and 5), and a similar trend was observed with FluPol A (Fig 6B, lanes 4 and 8 compared to lanes 1 and 5, respectively). These data complement previous reports that CTD pS5 binding facilitates FluPol A and FluPol C transcriptional activity [33] and strengthen the hypothesis that the CTD stabilises FluPol in a transcription-competent conformation [20]. However, our finding that the FluPol A PB2 627 domain has no CTD binding activity (Fig 5E) indicates that FluPol A has evolved a divergent mechanism by which the CTD stabilises the FluPol transcriptase. It also raises the question of whether bridging of the PB2 627 and PA-Cter domains per se is needed for transcriptional activation of FluPols.
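To make the peptide design concrete, the following minimal sketch builds CTD mimicking sequences of two, four and six YSPTSPS repeats and lists the serine-5 position of each repeat, which is the residue that carries the phosphate in a pS5 peptide. The function names are illustrative assumptions and are not taken from the study; the synthetic peptides actually used may carry flanking residues or modifications that are not represented here.

```python
# Consensus CTD heptad repeat: Y1-S2-P3-T4-S5-P6-S7
HEPTAD = "YSPTSPS"


def ctd_peptide(n_repeats: int) -> str:
    """Return the amino acid sequence of n consecutive heptad repeats.

    Hypothetical helper for illustration only.
    """
    if n_repeats < 1:
        raise ValueError("n_repeats must be at least 1")
    return HEPTAD * n_repeats


def ps5_positions(n_repeats: int) -> list:
    """Return the 1-based positions of Ser5 in each repeat of the peptide."""
    return [len(HEPTAD) * i + 5 for i in range(n_repeats)]


if __name__ == "__main__":
    # Peptide lengths tested in the in vitro transcription assays (Fig 6)
    for n in (2, 4, 6):
        peptide = ctd_peptide(n)
        print(n, "repeats:", peptide, "| pS5 at positions", ps5_positions(n))
```

The sketch only tracks sequence and phosphorylation positions; it says nothing about how the peptide binds, which is what the co-crystal structures address.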

FluPol B and FluPol A bind to the host RNAP II independently of the CTD
The RNAP II transcriptional machinery is highly conserved across eukaryotes, and the CTD in particular shows almost no sequence differences among vertebrate (mammalian and avian) host species susceptible to IAV or IBV infection (S9 Fig). It is therefore unlikely that differences in the CTD amino acid sequence drove the evolution of divergent IBV and IAV CTD-binding modes. We investigated whether FluPol A/B can interact with the two major RNAP II subunits (RPB1, RPB2) independently of the CTD, using the split-Gaussia luciferase complementation assay. The G2-RPB1 and RPB2-G2 fusion proteins were co-expressed with G1-tagged FluPol (PA-G1) by transient transfection as described above. Both combinations resulted in robust and comparable interaction signals (Fig 7A). Interestingly, in the presence of a truncated RPB1 lacking the CTD (RPB1ΔCTD), a stable interaction signal with FluPol A/B could still be measured, most likely corresponding to a direct interaction between FluPol A/B and the core domain of RPB1. RNAP II can form clusters in the nucleus [65], and bridging of G2-RPB1ΔCTD to G1-tagged FluPol through a wild-type RNAP II complex cannot be formally excluded. However, it is unlikely that an active G. princeps luciferase can be reconstituted under these conditions, given the large size of the complexes involved (RNAP II, > 500 kDa; FluPol, > 240 kDa). Moreover, as the G2-RPB1ΔCTD protein accumulates at substantially higher levels than the endogenous RPB1 protein (Fig 7A), it is likely to compete with endogenous RPB1 for assembly into RNAP II complexes and their subsequent import into the nucleus, and RNAP II nuclear clusters are therefore likely to contain a majority of G2-tagged RNAP II complexes. Mutations in site 1AB which reduced CTD binding to background levels (Fig 3C) […] PB2 K556A in Fig 7C). These findings, taken together with the relatively low affinity of FluPol for pS5 CTD peptides [32], suggest that the CTD is not the only interface between FluPol and the host RNAP II, and that it may not be essential to connect FluPol to RNAP II but rather to coordinate FluPol cap-snatching.

Normalised luciferase ratios (NLRs) were calculated as described in the Materials and Methods section. *** p ≤ 0.001 (two-way ANOVA; Sidak's multiple comparisons test). Cell lysates were analysed in parallel by western blot with antibodies specific for the pS5 CTD, G. princeps luciferase (PB2-G1) and tubulin. F-G. Characterisation of recombinant IAV and IBV viruses (F) and polymerase activity (G) of CTD-binding site 2B mutants. Experiments were performed as described in Fig 3D-

In the FluPol B samples, a secondary transcription product larger in size than those expected from the size of the template is detected, most likely resulting from stable hybridization of template and primary product as previously described [18]. Quantification of the reaction products of four independent experiments is shown below (FluPol B in blue and FluPol A in grey, respectively). The products of the reactions are normalised to the total RNA amount for each reaction and are presented as fractions of the activity of the reaction without peptide. * p ≤ 0.033, *** p ≤ 0.001 (one-way ANOVA; Dunnett's multiple comparisons test). https://doi.org/10.1371/journal.ppat.1010328.g006
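The quantification described in the Fig 6 legend normalises the products of each reaction to its total RNA and expresses the result relative to the reaction without peptide. A minimal sketch of that two-step normalisation is given below. The function name, the condition labels and all numerical values are hypothetical placeholders chosen for illustration; they are not data from the study, and the statistical testing (one-way ANOVA with Dunnett's test) is not reproduced.

```python
def relative_activity(product, total_rna, control="no peptide"):
    """Normalise product signal to total RNA, then express it relative to the control.

    product, total_rna: dicts mapping a condition label to a signal intensity.
    Returns a dict mapping each condition to its activity relative to the control.
    Hypothetical helper for illustration only.
    """
    # Step 1: correct each lane for the amount of RNA loaded
    per_rna = {cond: product[cond] / total_rna[cond] for cond in product}
    # Step 2: express every condition as a fraction of the no-peptide reaction
    reference = per_rna[control]
    return {cond: value / reference for cond, value in per_rna.items()}


# Made-up band intensities in arbitrary units (NOT values from the paper)
example_product = {"no peptide": 100.0, "2 repeats": 120.0,
                   "4 repeats": 150.0, "6 repeats": 210.0}
example_total = {"no peptide": 1000.0, "2 repeats": 1000.0,
                 "4 repeats": 1000.0, "6 repeats": 1050.0}

if __name__ == "__main__":
    for cond, value in relative_activity(example_product, example_total).items():
        print(f"{cond}: {value:.2f}")
```

By construction the control reaction has a relative activity of 1, so values above 1 indicate stimulation by the peptide and values below 1 indicate inhibition.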

Discussion
Here we report co-crystal structures of a human FluPol B and an avian FluPol A (isolated from a human) bound to pS5 CTD mimicking peptides. We uncover the conformation and directionality of the CTD peptide bound to FluPol B at a site that crosses over from the PA-Cter to the PB2 627 domain (site 2B) and that has no counterpart on FluPol A or FluPol C. Two CTD binding sites have been characterised on FluPol A (sites 1A and 2A) ([32] and this study) and on FluPol C (sites 1C and 2C, distinct from sites 1A and 2A) [33]. On the FluPol B co-crystal structure, site 1B is similar to site 1A, whereas site 2B is distinct from sites 2A and 2C.
By performing structure-based mutagenesis of FluPol B and FluPol A followed by a systematic investigation of FluPol-CTD binding, FluPol transcription/replication activity and viral phenotype, we confirm that CTD binding involves the same key residues at site 1AB for FluPol B and FluPol A, but distinct and specific residues at site 2A for FluPol A and site 2B for FluPol B, respectively. In particular, we demonstrate that the PA 606-609 loop, which buttresses the CTD at the junction between PA and PB2 in the FluPol B co-crystal structure and is not conserved in FluPol A or FluPol C, is an essential component of site 2B.
Our data and others' [32,33] demonstrate that IAVs, IBVs and ICVs have evolved divergent CTD binding modes, and raise questions about the driving force behind this divergent evolution. Large-scale meta-transcriptomic approaches have identified IBV-like and IDV-like viruses in fish and amphibians, suggesting that the influenza viruses of all four genera might be distributed among a much wider range of vertebrate species than recognised so far [6,57]. Phylogenetic analyses, although limited by strong sampling biases across species, indicate that both virus-host co-divergence over long timescales and cross-species transmissions have shaped the evolution of influenza viruses. With one of the two CTD binding sites being conserved between IAVs and IBVs but absent in ICVs, the divergence of the bipartite CTD binding mode apparently matches the evolutionary distance between the three types of influenza viruses [66]. Interestingly, however, we demonstrate that, in contrast to what is observed for IBVs and ICVs ([32,33] and this study), the PB2 627 domain is not involved in CTD binding for IAVs. Therefore, from a mechanistic point of view, the CTD-dependent transcriptional activation of FluPol might be more similar between IBVs and ICVs than between IBVs and IAVs, as a consequence of a distinctive evolutionary pressure exerted on IAVs. The FluPol A CTD binding mode presumably reflects an avian-optimised mode that co-evolved with protein interfaces between avian host factors and the PB2 627 domain, which are known to restrict avian IAV replication in humans (the principal hosts of IBVs and ICVs).
Another example of a functional interaction with RNAP II being achieved through distinct CTD binding is provided by the cellular mRNA capping enzyme (CE). The CEs from Schizosaccharomyces pombe, Candida albicans and Mus musculus were shown to bind S5 CTD repeats directly, with very distinct binding interfaces and distinct conformations of the bound CTD [67]. These distantly related species show major differences in CTD length and sequence [26], which could at least partially account for the divergence in CE-CTD binding modes. In contrast, the CTD is highly conserved among host species susceptible to IAV, IBV and ICV infections (S9 Fig). There is considerable evidence, however, that the FluPol-CTD interaction is only part of a more complex interaction pattern between the viral and cellular transcription machineries, raising the possibility that interactions between FluPol and less conserved components of the cellular transcriptional machinery could have indirectly shaped the evolution of distinct CTD binding modes. We observed that a truncated RPB1 subunit, which lacks the CTD, retains partial binding to FluPol (Fig 7). Mass-spectrometry screenings have identified other RNAP II subunits and multiple transcriptional pausing and elongation factors as potential FluPol interaction partners [31]. Host factors involved in transcription such as DDX17 were found to bind FluPol and to determine IAV host-specificity [68]. By analogy, CEs not only bind to the pS5 CTD but also to the transcription pausing DRB Sensitivity-Inducing Factor (DSIF) [67] and make additional direct interactions with the nascent transcript exit site on the body of RNAP II [69]. Likewise, it was shown recently that the integrator complex binds RNAP II in its promoter-proximal paused state through direct interactions with the CTD of RPB1 but also with RPB2, RPB3, and RPB11, and through indirect interaction with the negative elongation factor NELF and DSIF [70]. Intriguingly, FluPol was also found to interact with the DSIF subunit SPT5 [71]. To what extent host-specific features of SPT5 or other cellular factors may have constrained the evolution of CTD-binding sites on FluPol remains to be explored.

We show that the in vitro transcriptional activity of FluPol B is facilitated by the addition of CTD pS5 mimicking peptides, as reported previously for FluPol A and FluPol C [33]. The mechanism previously proposed for FluPol C [20,33] in which the CTD stabilises FluPol in a transcription-competent conformation by bridging P3 (the PA equivalent for ICVs) and PB2, could possibly apply to FluPol B with PA-PB2 bridging occurring at site 2B. Our data show that it does not apply to FluPol A , unless another yet unidentified domain of PB2, distinct from the PB2 627 domain, is involved.
As underlined by the different sensitivity of IAVs and IBVs to cap-binding inhibitors, which is related to differences in the cap-binding mode of their PB2 subunits [72], a detailed understanding of structural and functional differences between FluPol A and FluPol B is of significant importance for the development of broad-spectrum antivirals. These differences need to be taken into account when targeting the FluPol-CTD binding interface for antiviral intervention.

6.0 and visualised by Espript 3.0 [55]. The CTD repeat numbers are indicated below the sequence alignment. Identical and similar residues are indicated in red or yellow, respectively. (TIF)