Unraveling the mechanism of recognition of the 3’ splice site of the adenovirus major late promoter intron by the alternative splicing factor PUF60

Pre-mRNA splicing is critical for achieving required amounts of a transcript at a given time and for regulating production of encoded protein. A given pre-mRNA may be spliced in many ways, or not at all, giving rise to multiple gene products. Numerous splicing factors are recruited to pre-mRNA splice sites to ensure proper splicing. One such factor, the 60 kDa poly(U)-binding splicing factor (PUF60), is recruited to sites that are not always spliced, but rather function as alternative splice sites. In this study, we characterized the interaction of PUF60 with a splice site from the adenovirus major late promoter (the AdML 3’ splice site, AdML3’). We found that the PUF60–AdML3’ dissociation constants are in the micromolar range, with the binding affinity predominantly provided by PUF60’s two central RNA recognition motifs (RRMs). A 1.95 Å crystal structure of the two PUF60 RRMs in complex with AdML3’ revealed a dimeric organization placing two stretches of nucleic acid tracts in opposing directionalities, which can cause looping of nucleic acid and explain how PUF60 affects pre-mRNA geometry to effect splicing. Solution characterization of this complex by light-scattering and UV/Vis spectroscopy suggested a potential 2:1 (PUF60₂:AdML3’) stoichiometry, consistent with the crystal structure. This work defines the sequence specificity of the alternative splicing factor PUF60 at the pre-mRNA 3’ splice site. Our observations suggest that control of pre-mRNA directionality is important in the early stage of spliceosome assembly, and advance our understanding of the molecular mechanism by which alternative and constitutive splicing factors differentiate among 3’ splice sites.

The table below summarises the geometric issues observed across the polymeric chains and their fit to the electron density. The red, orange, yellow and green segments on the lower bar indicate the fraction of residues that contain outliers for >=3, 2, 1 and 0 types of geometric quality criteria. A grey segment represents the fraction of residues that are not modelled. The numeric value for each fraction is indicated below the corresponding segment, with a dot representing fractions <=5%. The upper red bar (where present) indicates the fraction of residues that have poor fit to the electron density. The numeric value is given above the bar.

Mol Chain Length Quality of chain

2 Entry composition
There are 4 unique types of molecules in this entry. The entry contains 3191 atoms, of which 0 are hydrogens and 0 are deuteriums.
In the tables below, the ZeroOcc column contains the number of atoms modelled with zero occupancy, the AltConf column contains the number of residues with at least one atom in alternate conformation and the Trace column contains the number of residues modelled with at most 2 atoms.
• Molecule 1 is a protein called Poly(U)-binding-splicing factor PUF60.

3 Residue-property plots
These plots are drawn for all protein, RNA and DNA chains in the entry. The first graphic for a chain summarises the proportions of the various outlier classes displayed in the second graphic. The second graphic shows the sequence view annotated by issues in geometry and electron density. Residues are color-coded according to the number of geometric quality criteria for which they contain at least one outlier: green = 0, yellow = 1, orange = 2 and red = 3 or more. A red dot above a residue indicates a poor fit to the electron density (RSRZ > 2). Stretches of 2 or more consecutive residues without any outlier are shown as a green connector. Residues present in the sample, but not in the model, are shown in grey.
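The color-coding rule described for the residue-property plots can be written as a small lookup. The following is only an illustrative sketch of that rule, not code from the validation pipeline; the function names `outlier_color` and `has_poor_density_fit` are hypothetical.

```python
def outlier_color(n_outlier_criteria: int) -> str:
    """Map the number of geometric quality criteria with at least one
    outlier to the plot color stated in the text."""
    if n_outlier_criteria < 0:
        raise ValueError("number of outlier criteria cannot be negative")
    if n_outlier_criteria == 0:
        return "green"
    if n_outlier_criteria == 1:
        return "yellow"
    if n_outlier_criteria == 2:
        return "orange"
    return "red"  # 3 or more criteria with outliers


def has_poor_density_fit(rsrz: float) -> bool:
    """A residue is marked with a red dot when RSRZ > 2."""
    return rsrz > 2.0
```

Residues that are present in the sample but absent from the model carry no outlier count at all, so they would be handled separately (grey) rather than passed to `outlier_color`.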

Mol Chain Residues
• Molecule 1: Poly(U)-binding-splicing factor PUF60, Chain A

Xtriage's analysis on translational NCS is as follows: The largest off-origin peak in the Patterson function is 5.82% of the height of the origin peak. No significant pseudotranslation is detected.
5 Model quality

Standard geometry
Bond lengths and bond angles in the following residue types are not validated in this section: CL.

The Z score for a bond length (or angle) is the number of standard deviations the observed value is removed from the expected value. A bond length (or angle) with |Z| > 5 is considered an outlier worth inspection. RMSZ is the root-mean-square of all Z scores of the bond lengths (or angles).

There are no bond angle outliers.

There are no chirality outliers.
There are no planarity outliers.
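The Z score, RMSZ and |Z| > 5 outlier criterion defined in this section can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the expected values and standard deviations would come from a reference geometry library that is not reproduced here.

```python
import math


def z_score(observed: float, expected: float, sigma: float) -> float:
    """Number of standard deviations the observed value lies from the expected value."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (observed - expected) / sigma


def rmsz(z_scores: list[float]) -> float:
    """Root-mean-square of a non-empty list of Z scores."""
    if not z_scores:
        raise ValueError("at least one Z score is required")
    return math.sqrt(sum(z * z for z in z_scores) / len(z_scores))


def outlier_indices(z_scores: list[float], threshold: float = 5.0) -> list[int]:
    """Indices of bonds (or angles) with |Z| above the threshold."""
    return [i for i, z in enumerate(z_scores) if abs(z) > threshold]
```

With this definition, a chain whose bond lengths all sit exactly one standard deviation from their expected values has an RMSZ of 1, and a value exactly at |Z| = 5 is not counted as an outlier.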

Too-close contacts
The all-atom clashscore is defined as the number of clashes found per 1000 atoms (including hydrogen atoms). The all-atom clashscore for this structure (PDB entry 5KW1) is 14.

All (84) close contacts within the same asymmetric unit are listed below, sorted by their clash magnitude. There are no symmetry-related clashes.
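The clashscore definition above (clashes per 1000 atoms, hydrogens included) amounts to a one-line normalisation. The sketch below uses a hypothetical function name, and the atom count in the usage line is an assumed round number chosen for illustration; the text does not state the total number of atoms after hydrogens are added.

```python
def clashscore(n_clashes: int, n_atoms_with_h: int) -> float:
    """Number of clashes per 1000 atoms, counting hydrogen atoms."""
    if n_atoms_with_h <= 0:
        raise ValueError("atom count must be positive")
    if n_clashes < 0:
        raise ValueError("clash count cannot be negative")
    return 1000.0 * n_clashes / n_atoms_with_h


# Hypothetical atom count (assumption, not taken from the report):
# 84 clashes over 6000 atoms corresponds to a clashscore of 14.
example = clashscore(84, 6000)
```

Because the score is normalised by atom count, structures of different sizes can be compared directly on this scale.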

Protein backbone
In the following table, the Percentiles column shows the percent Ramachandran outliers of the chain as a percentile score with respect to all X-ray entries followed by that with respect to entries of similar resolution.
The Analysed column shows the number of residues for which the backbone conformation was analysed, and the total number of residues.

In the following table, the Percentiles column shows the percent sidechain outliers of the chain as a percentile score with respect to all X-ray entries followed by that with respect to entries of similar resolution. The Analysed column shows the number of residues for which the sidechain conformation was analysed, and the total number of residues.

5.4 Non-standard residues in protein, DNA, RNA chains
There are no non-standard protein/DNA/RNA residues in this entry.

Carbohydrates
There are no carbohydrates in this entry.

Ligand geometry
Of the 1 ligand modelled in this entry, 1 is monoatomic, leaving 0 for Mogul analysis.
There are no bond length outliers.
There are no bond angle outliers.
There are no chirality outliers.
There are no torsion outliers.
There are no ring outliers.
No monomer is involved in short contacts.

Other polymers
There are no such residues in this entry.

Polymer linkage issues
There are no chain breaks in this entry.

6 Fit of model and data

6.1 Protein, DNA and RNA chains
In the following table, the column labelled '#RSRZ> 2' contains the number (and percentage) of RSRZ outliers, followed by percent RSRZ outliers for the chain as percentile scores relative to all X-ray entries and entries of similar resolution. The OWAB column contains the minimum, median, 95th percentile and maximum values of the occupancy-weighted average B-factor per residue. The column labelled 'Q< 0.9' lists the number (and percentage) of residues with an average occupancy less than 0.9.

6.2 Non-standard residues in protein, DNA, RNA chains
There are no non-standard protein/DNA/RNA residues in this entry.

Carbohydrates
There are no carbohydrates in this entry.

Ligands
In the following table, the Atoms column lists the number of modelled atoms in the group and the number defined in the chemical component dictionary. The B-factors column lists the minimum, median, 95th percentile and maximum values of B factors of atoms in the group. The column labelled 'Q< 0.9' lists the number of atoms with occupancy less than 0.9.