Abstract
Alu-elements comprise a large part of the human genome and some insertions have been shown to cause diseases. Here, we illuminate the protective role of an Alu-element in the 3’UTR of the human Factor 9 gene and its ability to ameliorate a poly(A) site mutation in a hemophilia B patient, preventing him from developing a severe disease. Using a minigene, we examined the disease-causing mutation and the modifying effect of the transposon in cellulo. Further, we simulated evolutionary scenarios regarding alternative polyadenylation before and after Alu insertion. A sequence analysis revealed that Old World monkeys display highly conserved polyadenylation sites in this Alu-element, whereas New World monkeys lack this motif, indicating selective pressure. We conclude that this transposon inserted shortly before the separation of Old and New World monkeys and thus also serves as a molecular landmark in primate evolution.
Citation: Kopp J, Rovai A, Ott M, Wedemeyer H, Tiede A, Böhmer HJ, et al. (2024) A transposable element prevents severe hemophilia B and provides insights into the evolution of new- and old world primates. PLoS ONE 19(10): e0312303. https://doi.org/10.1371/journal.pone.0312303
Editor: Xiaoyong Sun, Shandong Agricultural University, CHINA
Received: May 16, 2024; Accepted: October 2, 2024; Published: October 18, 2024
Copyright: © 2024 Kopp et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Hemophilia B is a clotting disorder caused by mutations in the human coagulation factor IX gene (hF9), which codes for a serine protease in the intrinsic pathway of the coagulation cascade. With a prevalence of approximately 1:30,000 in males, it represents an orphan disease. The factor IX variant database (www.factorix.org) listed 1692 variants in coding and non-coding regions of the F9 gene as of August 2023. The F9 gene is located on the X-chromosome (Xq27.1-q27.2).
The maturation of eukaryotic mRNA involves various steps such as 5’-capping, splicing and polyadenylation, which facilitate nuclear export and protection from exonucleases [1,2]. mRNAs without a poly(A) tail are degraded rapidly by exonucleases [3]. The process of polyadenylation is initiated by recognition of the hexameric poly(A) signal (PAS) AATAAA by the cleavage and polyadenylation specificity factor CPSF [4]. In addition to the binding of CPSF to the PAS, a G/U-rich sequence downstream, named downstream sequence element (DSE), is recognized by the Cleavage stimulating Factor (CstF) [5]. Subsequently, the endonuclease CPSF73, together with other key players such as Cleavage Factors I and II (CF I, CF II), cuts the RNA molecule approximately 10–30 nucleotides downstream of the PAS at the Cleavage Site (CS), which is normally composed of a UA or CA dinucleotide [6]. There, the poly(A) polymerase (PAP) binds and adds around 250 adenine nucleotide residues [4,7]. The poly(A) tail is then recognized and bound by poly(A) binding proteins (PABP), which act as regulators of translation [8]. Besides the consensus motif AATAAA, alternative PAS such as ATTAAA or AATGAA exist but exhibit reduced polyadenylation and cleavage activity, as systematically demonstrated by Sheets and colleagues [9]. Therefore, mutations in the PAS can significantly reduce polyadenylation efficiency and impede gene expression.
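The signal layout described above can be illustrated with a short Python sketch that locates PAS hexamers in a DNA sequence and lists UA/CA dinucleotides (written as TA/CA in DNA notation) in the 10–30 nt window downstream. The function names are hypothetical, and measuring the window from the end of the hexamer is an assumption made for illustration; the sketch does not model DSE recognition or any protein binding.

```python
import re

# Hexamers named in the text: the consensus signal and two weaker variants.
PAS_HEXAMERS = ("AATAAA", "ATTAAA", "AATGAA")
# Cleavage-site dinucleotides in DNA notation (UA is written TA).
CLEAVAGE_DINUCLEOTIDES = ("TA", "CA")


def find_pas(seq):
    """Return sorted (position, hexamer) pairs, overlapping hits included."""
    seq = seq.upper()
    hits = []
    for hexamer in PAS_HEXAMERS:
        # A lookahead lets re.finditer report overlapping matches.
        for match in re.finditer("(?=" + hexamer + ")", seq):
            hits.append((match.start(), hexamer))
    return sorted(hits)


def candidate_cleavage_sites(seq, pas_pos, min_dist=10, max_dist=30):
    """List start positions of TA/CA dinucleotides min_dist..max_dist nt
    downstream of the hexamer end (assumed reference point)."""
    seq = seq.upper()
    hexamer_end = pas_pos + 6
    sites = []
    for dist in range(min_dist, max_dist + 1):
        i = hexamer_end + dist
        if seq[i:i + 2] in CLEAVAGE_DINUCLEOTIDES:
            sites.append(i)
    return sites


# Toy sequence (hypothetical, not taken from the F9 locus).
toy = "GGG" + "AATAAA" + "G" * 15 + "CA" + "G" * 10
print(find_pas(toy))
print(candidate_cleavage_sites(toy, 3))
```

Both functions use 0-based positions, so a hit at position 3 means the hexamer starts at the fourth nucleotide.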
More than two thirds of human genes show alternative polyadenylation (APA), which offers the possibility of generating different transcript isoforms. The usage of alternative PAS modulates the length and thus the properties of a 3’UTR as microRNA binding sites and secondary structures are located within these sequences [10,11]. The modulation of 3’UTR length has been shown to be of importance during physiological processes such as embryonic development where prolonged 3’UTRs were observed [12] and in T-cell activation, where shortened 3’UTRs were associated with enhanced growth advantages [13]. But also pathological processes involve APA during oncogene activation [14–16].
Alu elements are primate-specific transposable elements with a length of approximately 300 bp. They belong to the group of short interspersed nuclear elements (SINEs). With more than one million copies, they make up more than 10% of the human genome [17]. Approximately 65 million years ago (mya), Alu elements developed from a duplication of the 7SL-RNA gene, which is a component of the signal recognition particle [18–20]. Alu elements can be subdivided into various subfamilies, of which the Alu J subfamily represents the earliest one, followed by the Alu S subfamily, which includes Sx, Sq, Sp and Sc types. Alu Y elements represent the youngest subfamily and are largely abundant in Old World monkeys and humans [21].
The insertion of Alu elements has been shown to cause various genetic defects such as hemophilia A, X-SCID, breast cancer or lipoprotein lipase deficiency [22]. The major pathological mechanism, however, remains de novo insertion leading to either disruption of open reading frames, splice sites or regulatory sequences [22]. Very recently, Alu elements have been shown to be involved in promoter-enhancer wiring via RNA sequence interactions [23]. Notably, the younger Alu-elements of the Y-subfamily appear to be predominantly associated with genetic diseases. In general, Alu elements have a terminal A-stretch, which can accumulate mutations or shorten over time, reflecting their transposition activity [24]. Therefore, a simple A>T transversion can directly lead to the generation of a PAS (AATAAA) within the terminal A-stretch. So far, only disease-causing effects through premature transcriptional termination were reported [25]. Here, we describe a hemophilia B patient, who apparently profits from the presence of an Alu Sx element harboring three functional PAS, as they appear to substitute the function of the mutated authentic F9 PAS1.
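To make the statement about A>T transversions concrete, the following Python sketch enumerates every single A>T substitution in a pure A-stretch and reports which positions create an AATAAA hexamer. The function name and the stretch length are hypothetical; the actual length of the Alu Sx A-stretch is not stated here and is not assumed.

```python
PAS = "AATAAA"


def transversions_creating_pas(stretch_length):
    """Return 0-based positions in a pure A-stretch where a single
    A>T substitution yields the hexamer AATAAA."""
    stretch = "A" * stretch_length
    positions = []
    for i in range(stretch_length):
        mutated = stretch[:i] + "T" + stretch[i + 1:]
        if PAS in mutated:
            positions.append(i)
    return positions


# Illustrative length only.
print(transversions_creating_pas(10))
```

The T must be preceded by at least two and followed by at least three adenines, so only substitutions close to either end of the stretch fail to create a signal.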
Materials and methods
Cell culture
HEK 293T cells are a human embryonic kidney cell line that expresses the simian virus 40 (SV40) large T antigen. Cells were cultured in T-75 flasks at 37°C and 5% CO2 in a sterile incubator. Cells were passaged every 2–3 days when they reached a confluency of >90%. Cells were washed with PBS and treated with 1 mL of trypsin solution. After 2–3 minutes of incubation, the cells were carefully detached from the flask surface and split 1:10. Subsequently, fresh DMEM supplemented with 1% penicillin/streptomycin, 10% FCS and 1% sodium pyruvate was added and the flask was gently swirled to distribute the cells evenly. Transfection was performed with Lipofectamine 3000 (Invitrogen) according to the manufacturer’s protocol.
Half-life experiments were conducted in HeLa TA cells, which harbor a stably integrated copy of the Tet-transactivator. Transcription was shut off by the addition of 20 ng doxycycline.
For FIX activity analysis, 4×10^5 HEK 293T cells were seeded per well of a six-well plate. At a confluency of about 50%, cells were transfected with various FIX minigenes using Viafect (Promega). Twelve hours post transfection, the medium was replaced with FCS-free DMEM supplemented with 4 μg/mL vitamin K, 2.5% BSA and 1% L-glutamine, and cells were incubated for 96 hours.
Cloning work
The generation of pSK_SV40_F9_wt is described in Krooss et al. 2020. To introduce an alternative poly(A) signal inactivating mutation, the source plasmid pSK_SV40_F9_wt was modified via PCR mutagenesis in the following way: Using the forward primer P1, which includes an XbaI site 5’ (underlined) and the according modification of the poly(A) signal (bold), and the reverse primer P4 a PCR was performed on the SV40_F9_wt template. The PCR-product was then digested using XbaI/MunI and ligated into the linearized target plasmid, thereby giving rise to pSK_SV40_F9_PAS1mut2. For the generation of the PAS1 mutation as described in Li et al. 2000, the SV40-driven minigene published in Krooss et al. 2020 was modified via PCR mutagenesis in the following way similar to pSK_SV40_F9_PASmut1: Using the forward primer P2, which includes an XbaI site 5’ (underlined) and the according modification of the poly(A) signal (bold), and the reverse primer P4 a PCR was performed on the SV40_F9_wt template. The PCR-product was then digested using XbaI/MunI and ligated into the linearized target plasmid, thereby giving rise to pSK_SV40_F9_PAS1mut. For the generation of another alteration of the PAS1, the SV40-driven minigene published in Krooss et al. 2020 was modified via PCR mutagenesis in the following way similar to pSK_SV40_F9_PASmut1 and pSK_SV40_F9_PAS2mut: Using the forward primer P3, which includes an XbaI site 5’ (underlined) and the according modification of the poly(A) signal (bold), and the reverse primer P4 a PCR was performed on the SV40_F9_wt template. The PCR-product was then digested using XbaI/MunI and ligated into the linearized target plasmid, thereby giving rise to pSK_SV40_F9_PAS1mut. The deletion of the Alu Sx element from the source plasmid pSK_SV40_F9_wt was performed via overlap PCR using primers P5+P6 and P7+P8 and a fusion of the products in a final PCR using P6+P8 with subsequent cloning via EcoRI/XbaI. 
The canine 3’UTR was extracted from Madin-Darby canine kidney (MDCK) cells using primers P13 and P14. Subsequently the PCR product and the target vector (pSK_SV40_F9_wt) were digested using BamHI/MunI and ligation was performed.
To generate the shut off vectors, F9 plasmids pSK_SV40_F9_wt and pSK_SV40_F9_PAS1mut were amplified using P16+P17 and transferred into a pTetbi promoter driven vector (kindly provided by V. Cordes, MPI Göttingen) via AgeI, BamHI and NcoI digest.
RNA work
RNA methods were performed as described previously [26]. For detection of F9 RNA, a specific probe was generated from the FIX plasmid by HindIII/BamHI digestion. The GAPDH-specific probe was prepared as described. cDNA synthesis for quantitative PCR (qPCR) was conducted using the QuantiTect Reverse Transcription Kit (Qiagen) according to the manufacturer’s protocol. Prior to reverse transcription, RNA was treated with TURBO DNase (Invitrogen) and purified via RNeasy columns (Qiagen). qPCR was performed with the QuantiTect SYBR Green PCR Kit (Qiagen) using primer pairs P9+P10 (F9) and P11+P12 (GAPDH). For 3’RACE experiments, poly(A)+ RNA of transfected cells was isolated and 50 ng were reverse transcribed using the Invitrogen 3’-RACE protocol and the provided reagents. The PCR products were separated via agarose gel electrophoresis. After gel extraction, the DNA was transferred into pCR2.1 (Invitrogen) for sequencing. Northern blots were performed using total RNA obtained from HEK 293T cells transfected with the indicated constructs. The membrane was hybridized with a 32P-labelled probe corresponding to the F9 cDNA. The position of F9 mRNA is indicated on the right. Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) served as a loading control. Bands were densitometrically quantified using a phosphorimager.
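The qPCR data are reported as normalized to GAPDH, but the quantification model is not specified. As one common possibility, the sketch below applies the 2^-ΔΔCt approach in Python. The function name and the Ct values are hypothetical, and the calculation assumes an amplification efficiency of 100% for both primer pairs.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_ctrl, ct_reference_ctrl):
    """Relative expression of a target gene by the 2^-ddCt approach.

    Assumes 100% amplification efficiency (a doubling per cycle);
    this is an illustrative model, not the documented procedure.
    """
    delta_sample = ct_target - ct_reference             # e.g. F9 minus GAPDH, test construct
    delta_control = ct_target_ctrl - ct_reference_ctrl  # same difference for the wild type
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target appears one cycle later in the test sample.
print(relative_expression(25.0, 18.0, 24.0, 18.0))
```

A result of 1.0 means equal GAPDH-normalized expression in the test and control samples; values below 1.0 indicate reduced expression.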
Factor IX activity measurement.
FIX activity was determined as described in Krooss et al. 2020 using a one-stage coagulation assay. Samples were diluted 1:5 in imidazol buffer and mixed with an equal amount of FIX deficient plasma and Actin FS APTT reagent (all from Siemens Healthcare). After incubation at 37°C for 2 min, 0.025 M calcium-chloride was added, and the coagulation time recorded in an Amelung KC10 coagulometer. Calibration curves were generated with human standard plasma diluted in FIX deficient plasma.
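How a calibration curve converts a clotting time into an activity value can be sketched as follows. The Python example fits a straight line to log10(clotting time) versus log10(activity) and inverts it. The log-log model, the function names and all numbers are assumptions for illustration; the curve model actually used with the coagulometer is not specified above.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def make_calibration(activities_percent, clotting_times_s):
    """Return a function mapping a clotting time (s) to FIX activity (%),
    based on a log-log linear calibration (assumed model)."""
    slope, intercept = fit_line(
        [math.log10(a) for a in activities_percent],
        [math.log10(t) for t in clotting_times_s],
    )

    def activity_from_time(time_s):
        return 10 ** ((math.log10(time_s) - intercept) / slope)

    return activity_from_time


# Hypothetical standards: dilutions of standard plasma and their clotting times.
standards = [100.0, 50.0, 25.0, 12.5]
times = [30.0 * (a / 100.0) ** -0.2 for a in standards]
to_activity = make_calibration(standards, times)
print(round(to_activity(times[1]), 3))
```

Any sample dilution, such as the 1:5 dilution in imidazole buffer, would have to be treated the same way for standards and samples or corrected afterwards; the sketch leaves this step out.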
Results
A Patient with a severe poly(A) site mutation displays regular F9 mRNA levels
The human F9 3’UTR incorporates one actual PAS and one cryptic PAS, which are interspaced by 210 nucleotides. Earlier, we demonstrated that physiologically only the first poly(A) signal (PAS1) is utilized for hF9 mRNA polyadenylation [26]. Initial polyadenylation analysis of the factor IX mRNA revealed exclusive utilization of PAS1 and no evidence for PAS2 activity as determined by 3’RACE (S1 Fig in S1 File). The inactivity of PAS2 can be explained by its increased distance to DSE2 as a consequence of the Alu Sx element insertion, since the physiological distance between PAS and DSE does not exceed approximately 60 bp [27]. This Alu element is conserved in various species as depicted in Fig 1A.
A) Architecture of the F9 3’UTR and downstream sequence. PAS1 is located 1364 nucleotides downstream of F9 exon 8. It is followed by its authentic DSE and another PAS (PAS2), which has been shown to be inactive (marked in red). 3’ of PAS2 the Alu Sx element is located and separates PAS2 from its putative DSE (DSE2). B) Description of the pathogenic mutation including the nucleotide position (modified from Li et al. 2000 [28]). C) Depiction of polyadenylation and cleavage activity of a standard poly(A) signal and the mutated one in the patient. The activity of these modifications was measured in the context of β-globin mRNA by Sheets et al. (1990). D) Depiction of the F9 minigene holding the patient’s mutation in the 3’UTR. E) qRT-PCR on F9 mRNA upon transfection of the F9 minigenes into HEK 293T cells, normalized to GAPDH (n = 3 biological replicates; error bars represent standard deviation; Student’s t test; ns: not significant). F) FIX clotting activity measured by a one-stage clotting assay in supernatant obtained from F9 minigene-transfected HEK 293T cells (n = 3 biological replicates; bars represent standard deviation; unpaired two-tailed t test; p = 0.0008). G) Measurement of F9 mRNA half-life using an expression shut-off vector. pTetbi F9wt and pTetbi F9_PAS1mut plasmids were transfected into HeLa TA cells. 36 hours post transfection, doxycycline was added to shut off transcription. mRNA was purified at several time points after transcriptional shut-off. mRNA amounts were determined via Northern blot (bars represent standard deviation).
In 2000, Li et al. [28] described a hemophilia B patient with a poly(A) site mutation (32739 A>G) in the F9 gene (Fig 1B). However, the disease severity of this individual was classified as mild, which intriguingly does not reflect the expected mRNA expression level, assuming that an AATAAA>AATAGA conversion leads to a residual polyadenylation activity of approximately 3% as demonstrated by Sheets et al. [9] (Fig 1C). The weak predicted residual activity of the mutated poly(A) site, together with our earlier finding that exclusively PAS1 appears to be functional (S1 Fig in S1 File), encouraged us to modify our previously established F9 minigene accordingly to mimic the genomic situation of this patient (Fig 1D). Despite the predicted polyadenylation efficiency of 3%, F9 mRNA levels determined via RT-qPCR revealed only a slight decrease of mRNA amounts when compared to healthy donor F9 mRNA (Fig 1E). However, significant differences in coagulation activity (secreted Factor IX) were detected upon analyzing the media of the transfected cells (Fig 1F). To test whether a different mRNA half-life would be responsible for the reduced activity on the protein level, we constructed inducible vectors that allow transcriptional shut-off immediately upon doxycycline exposure. Total mRNA was harvested 2, 4, 6, and 8 hours post induction and F9-specific mRNA was determined. The results of the half-life analysis indicate a reduced half-life for the patient’s F9 mRNA when compared to the healthy donor control (Fig 1G), despite some variation at the 2 h time point.
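A half-life can be derived from such a shut-off time course by fitting a first-order decay. The Python sketch below regresses ln(mRNA level) on time and returns ln(2)/k. The first-order model, the function name and the example values are assumptions for illustration; only the time points (2, 4, 6 and 8 hours) come from the experiment described above.

```python
import math


def estimate_half_life(times_h, mrna_levels):
    """Fit ln(level) = ln(N0) - k * t by least squares and return the
    half-life ln(2) / k in hours (assumes first-order decay)."""
    logs = [math.log(level) for level in mrna_levels]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_log = sum(logs) / n
    slope = (
        sum((t - mean_t) * (l - mean_log) for t, l in zip(times_h, logs))
        / sum((t - mean_t) ** 2 for t in times_h)
    )
    decay_rate = -slope  # k is positive when the mRNA level falls over time
    return math.log(2) / decay_rate


# Hypothetical band intensities that halve every 4 hours.
times = [2, 4, 6, 8]
levels = [100.0 * 0.5 ** (t / 4.0) for t in times]
print(round(estimate_half_life(times, levels), 3))
```

With real densitometry data the points scatter around the fitted line, so comparing two constructs would also require an estimate of the uncertainty of each slope.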
An Alu-element protects the patient’s mRNA by facilitating alternative polyadenylation
To further characterize the mRNA derived from the PASmut minigene, we performed Northern blot analysis (Fig 2A). In addition, two F9 minigenes with alternative PAS1 modifications to ensure total polyadenylation inactivity (ACTAGA and AATGAA) were cloned and investigated. Intriguingly, a novel F9 mRNA species (~1 kb longer than the F9 wild type mRNA) was observed in all of the PAS1 variants (Fig 2A). This striking difference in mRNA length, however, does not match with an alternative polyadenylation event at PAS2, which already has been proved to be non-functional (Fig 2B). Using 3’RACE analysis and Sanger sequencing, we detected polyadenylation at the 3’ end of the Alu-element in case of all PAS1 variants (Fig 2B and 2C). The sequence analysis moreover revealed the presence of a triple AATAAA hexamer that we refer to as multiPAS. Taken together, these findings indicate the presence of a functional PAS inside the Alu Sx element (multiPAS) taking advantage of the DSE2 and a switch of polyadenylation upon inactivation of PAS1 and evidenced non-functionality of PAS2 as illustrated in Fig 2D.
A) Northern blot of mRNAs collected from cells transfected with F9 minigenes holding various different PAS mutations. B) 3’ RACE performed on mRNAs obtained from HEK 293T cells upon transfection of F9wt and F9 PAS1 mutated minigenes. The yellow box highlights the band of a new F9 mRNA species polyadenylating at an alternative poly(A) site. C) Sanger sequencing results of the band highlighted in B). Sequencing indicates polyadenylation at a multi-poly(A) site located inside the Alu Sx element. D) Schematic showing a model of alternative polyadenylation regarding the non-functionality of PAS1 (due to mutation) and PAS2 due to a distanced downstream sequence element (DSE2). Polyadenylation therefore occurs at the multiPAS located at the 3’ end of the Alu Sx element.
PAS2 regains its function after Alu-deletion
To test the functionality of PAS2 prior to Alu invasion, we generated a F9 minigene deficient of the Alu element referred to as F9_wt_dAlu (Fig 3A). F9 mRNA analysis of HEK 293T cells transfected with this construct revealed utilization of both PAS1 and PAS2, as confirmed via 3’RACE and Northern blot (Fig 3B and 3C). The results of the 3’-RACE sequencing also indicate the utilization of different cleavage sites downstream of PAS1 (Fig 3D).
A) Schematic depicting the arrangement of genomic elements upon deletion of the Alu Sx sequence. B) Gel electrophoresis of cDNA upon 3’RACE of mRNA of F9_wt_dAlu transfected HEK 293T cells. No template control (NTC) and mock serve as controls. +/- indicates the presence of reverse transcriptase in the reaction. The bands in lane 2 run slightly differently and belong to the MLX gene (as validated by Sanger sequencing), which is aberrantly amplified by the F9-specific primers and also has a second polyadenylation site in its 3’UTR. Lane 4 shows two distinct bands at 300 and 600 nts, which indicate polyadenylation at PAS1 and PAS2 of the F9 mRNA. C) Northern blot of mRNA obtained from HEK 293T cells transfected with F9_wt (lane 1) and F9_dAlu (lane 2). In contrast to the F9wt mRNA, the F9_dAlu mRNA exhibits two bands analogous to the 3’RACE and indicative of dual poly(A) site utilization. D) Sanger sequencing of the bands (300 bp or 600 bp) shown in lane 4. The sequencing results indicate polyadenylation at both PAS1 and PAS2, also with different cleavage sites (cCS = canonical cleavage site, uCS = unknown cleavage site).
At this point we turned to species missing the Alu element and containing two functional PAS. This configuration applies to all canine species. A minigene combining human F9 cDNA and canine F9 3’UTR was generated (S2 Fig in S1 File) and polyadenylation was analyzed via 3’RACE. In this case, polyadenylation was also observed at both canine PAS (cPAS).
These results support the assumption that PAS2 was functional in a common ancestor of primates and dogs long before the invasion of Alu elements.
A PAS1 mutation shortly after Alu Sx invasion would have resulted in a fatal FIX deficiency
The Alu Sx element downstream of the hF9 3’UTR is oriented in sense and has accumulated numerous mutations within its A-stretch, giving rise to multiPAS complexes conserved in humans and the vast majority of Old World monkeys. Intriguingly, no PAS is present at the corresponding position in New World monkeys according to the UCSC genome browser accessed in June 2023 (https://genome.ucsc.edu/ [29]) (Fig 4A and 4B). However, to investigate the effect of the PAS1 mutation shortly after invasion of the Alu Sx element into the primate genome approximately 30 mya (32 mya [30]), we rejuvenated its terminal A-stretch by reverting all A>T transversions that have accumulated over time (Fig 4C). This construct (F9_def. multiPAS) represents a complete null mutation in the patient context as no F9 mRNA could be detected upon transfection (Fig 4F, Lane 4). In a final scenario, we simulated the effect of the PAS1 mutation in the context of an inversely integrated Alu Sx element (Fig 4E), as Alu elements are known to insert in an antisense manner as well. A construct holding an inverted Alu Sx element at the exact same position intriguingly revealed a low mRNA expression. This residual expression of approximately 20% (compared to F9wt) emerges from polyadenylation at PAS2 and the inverted multiPAS (Fig 4E Lane 5), indicative of a cryptic DSE inside the inverted Alu Sx element.
A) The sequence of the terminal Alu Sx element in several different species was compared in addition to the presence of a second poly(A) site (PAS1). Old World primates are highlighted in green, New World primates are highlighted in yellow. None of the New World primates holds a functional PAS in the Alu Sx element. B) Phylogenetic tree of primates over the last 60 million years and proposed Alu Sx invasion time point. The Alu Sx invasion must have occurred before the separation of Old World and New World monkeys. However, a functional multiPAS is present only in Old World monkeys. C) Schematic of a hypothetical Alu Sx element evolution. A pure poly(A) tail acquires T insertions over time (middle panel). By exchanging the T nucleotides with A nucleotides, the initial state of the poly(A) tail is restored (lower panel). The rejuvenated Alu Sx element is incorporated in the F9_def.multiPAS minigene. D) Schematic of a F9 minigene devoid of an Alu Sx element. E) Depiction of a minigene that holds the Alu Sx element in an inverted orientation. This scenario should mimic the possibility of polyadenylation upon inverse Alu Sx insertion. F) Northern blot and G) results of the densitometric analysis upon phospho-imaging. HEK 293T cells were transfected with the respective minigenes and mRNA was subjected to Northern blot analysis (n = 3 biological replicates, error bars represent standard deviation. Student’s t test: *: p < 0.05 (exact p-value: 0.011); ***: p < 0.0001; ns: not significant. Exact p-values: 1).
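The in silico counterpart of rejuvenating the A-stretch is a simple string operation. The Python sketch below replaces every T in the terminal A/T-only run of a sequence with A and counts AATAAA hexamers before and after. The function names and the toy sequence are hypothetical, and treating the longest A/T-only suffix as the A-stretch is a simplifying assumption; the real Alu Sx sequence is not reproduced here.

```python
import re

PAS = "AATAAA"


def rejuvenate_a_stretch(seq):
    """Replace every T in the terminal A/T-only run with A.

    The longest suffix made up solely of A and T is treated as the
    A-stretch (simplifying assumption); the rest is left untouched.
    """
    seq = seq.upper()
    tail = re.search(r"[AT]+$", seq)
    if tail is None:
        return seq
    return seq[:tail.start()] + "A" * len(tail.group())


def count_pas(seq):
    """Count AATAAA hexamers, including overlapping ones."""
    return len(re.findall("(?=" + PAS + ")", seq.upper()))


# Toy element: arbitrary body followed by an A-stretch interrupted by three Ts.
toy = "GGCCGAGGC" + "AAATAAAATAAAATAAAA"
print(count_pas(toy), count_pas(rejuvenate_a_stretch(toy)))
```

In this toy case the three hexamers disappear once the Ts are reverted, mirroring the loss of the multiPAS in the rejuvenated construct.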
Comparative sequence analysis of the Alu Sx element in OWM and NWM reveals substantial differences
The stark differences in the multiPAS sequence between NWM (marmoset and squirrel monkey) and OWM evident in our initial genome browser analysis (Fig 4A) prompted us to substantiate this finding by analyzing more primate sequences. Based on a collection of primate genomes, a multiple sequence alignment (MSA) of the terminal region of the F9 Alu Sx element was performed. Sequencing results of 546 OWM and NWM were aligned. The results of the MSA corroborate the observation made in the UCSC genome browser, which harbors the genomes of only 11 primate species (chimp, gorilla, orangutan, gibbon, rhesus, crab eating macaque, baboon, green monkey, squirrel monkey and bushbaby). Table 1, which only shows representative examples of the over 500 analyzed sequences, reveals a clear-cut difference between both primate groups.
Specifically, the MSA revealed the NWM-specific motif “-ATACATACATA” at the corresponding position of the multiPAS in OWM (Table 1). These results clearly mark the separation of New World from Old World monkeys at this position in their genomes and therefore serve as a molecular clock landmark. Furthermore, the evolutionary path in NWM displays more or less random mutations that are acquired regularly by Alu poly(A) stretches. However, the selection of a multiPAS element may indicate some selective pressure. The nature of this selection is currently under investigation.
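The distinction drawn from the alignment can be expressed as a simple classification rule. The Python sketch below removes gap characters from an aligned row and checks for an AATAAA hexamer or for the NWM motif quoted above. The function name, the labels and the example rows are hypothetical and are not sequences from Table 1; the rule is an illustration, not the procedure used for the MSA.

```python
from collections import Counter

OWM_SIGNAL = "AATAAA"        # hexamer making up the multiPAS
NWM_MOTIF = "ATACATACATA"    # motif reported for New World monkeys


def classify_alignment_row(aligned_seq):
    """Label one aligned row by the motif found after removing gaps."""
    ungapped = aligned_seq.replace("-", "").upper()
    if OWM_SIGNAL in ungapped:
        return "multiPAS-like"
    if NWM_MOTIF in ungapped:
        return "NWM-like"
    return "unclassified"


# Hypothetical aligned rows keyed by placeholder species names.
rows = {
    "species_A": "AAATAAAATAAAATAAA",
    "species_B": "AA-ATACATACATAAA",
    "species_C": "GGGG----GGGG",
}
print(Counter(classify_alignment_row(seq) for seq in rows.values()))
```

Checking for the hexamer first means that a row carrying both motifs would be counted as multiPAS-like, which is a design choice of this sketch.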
Discussion
In our study, we have demonstrated the protective effect of an Alu Sx element on various mutations in the authentic PAS of the F9 gene. Initial analyses indicated that polyadenylation of F9 mRNA occurs exclusively at PAS1. A second PAS (PAS2) located 223 nt downstream of PAS1, however, has been shown to be non-functional, most likely due to the increased distance to its original DSE (DSE2) caused by the integration of the Alu Sx element ~35 mya.
In the light of these findings, the question arose why a patient with a predicted devastating poly(A) site mutation only exhibits mild hemophilia B, as reported by Li and colleagues [28].
To investigate this phenomenon, we incorporated this particular mutation into a minigene holding the human F9 open reading frame, the F9 3’UTR and a downstream sequence including the Alu Sx element. Intriguingly, expression analyses via RT-qPCR indicated similar mRNA levels compared to the F9 wild type control (Fig 1E), despite the poor predicted polyadenylation activity of the mutated poly(A) site. On the level of coagulation activity, however, the PAS1 mutation led to a moderate reduction (Fig 1F), reflecting the mild disease severity described by Li and colleagues. This reduction is only partially due to an accelerated F9 mRNA decay (Fig 1G). We speculate that the extended 3’UTR terminating at the Alu element exhibits reduced translation efficiency. Along these lines, 3’UTR shortening was shown to correlate with higher protein output per mRNA molecule.
Further examination of the F9_PAS1mut mRNA via Northern blot revealed a novel mRNA species ~350 bp longer than F9_wild type mRNA (Fig 2A). 3’RACE analysis and Sanger sequencing evidenced a poly(A) switch into a multiPAS at the 3’ end of the Alu Sx element (Fig 2C).
Alu elements accumulate random mutations after a new insertion. The variant frequency reveals information on the time point of insertion, if compared to the parental sequence of that class of Alu elements. The multiPAS that emerged from the terminal A-stretch of the Alu Sx element appears to take advantage of the DSE2, formerly associated with PAS2, as presented in Fig 3. Thereby, it facilitates alternative polyadenylation and near-physiological mRNA expression levels in the case of the F9_PAS1 mutation in the patient. This exaptation of Alu elements is rare, but has been described before on a genome-wide level [31,32]. This protection mechanism, however, only seems to be present in humans and Old World primates.
Our finding that the terminal A-stretch of the Alu Sx element contains multiple functional poly(A) sites in turn raised the question of their formation. The generation of the PAS hexamer AATAAA from an A-stretch involves a transversion (A>T), which occurs less commonly than a transition (A>G), as an interchange of purine and pyrimidine bases is required [33]. The constant disruption of the A-stretch with Ts in defined intervals, and the resulting poly(A) sites, suggest a previous selective pressure towards polyadenylation at the Alu Sx element during the evolution of Old World monkeys. In contrast, New World monkeys display no bias towards a multiPAS, but rather the default mutational drift after insertion of the Alu Sx.
Such a scenario in Old world monkeys could include a moderate mutational destruction of PAS1, still providing a basic coagulation activity but not a sufficient bleeding control upon blood loss (e.g. birth, injury). The generation of an alternative or additional poly(A) site inside the Alu Sx element by a single point mutation could have provided a significant selective advantage.
Although highly speculative at this point, parasites may also be considered as a potential selective pressure that may have acted on the formation of a multi-poly(A) site in the F9 3’UTR of all Old World monkeys. As malaria tropica caused by Plasmodium falciparum, which is endemic in West Africa to this day, can induce disseminated intravascular coagulation (DIC), a modifying mechanism such as alternative polyadenylation as described above could have had a beneficial effect on the survival of this disease and therefore led to the formation of a multiPAS. Assuming that the protective multiPAS has formed as a consequence of selective pressure, we aimed to investigate the effect of the PAS1 mutation shortly after insertion of the Alu Sx element into the F9 3’UTR. For this purpose, we rejuvenated the A-stretch of the Alu Sx element by reverting all A>T mutations. Transfection of this construct (F9_PAS1mut_def.multiPAS) resulted in a complete loss of F9 mRNA expression (Fig 4F, Lane 4), indicative of the lethal effect of this mutation shortly after Alu Sx invasion, when no multiPAS was present. This again would support the above-mentioned hypothesis of a moderately impaired PAS1 prior to the multiPAS formation. Of course, the mutational bias in the homopolymeric A-stretch of the Alu Sx could also be explained by founder effects on populations passing through a genetic bottleneck. However, given the explanations above, this seems unlikely.
To investigate the effect of the PAS1 mutation prior to Alu Sx invasion, we generated a minigene lacking the Alu Sx element (F9_PASmut_dAlu). Our hypothesis that PAS2 functionality can be reconstituted upon spatial approximation of DSE2 via Alu Sx removal proved to be correct, as a relative F9 mRNA expression level of approximately 70% was determined (Fig 4E, Lane 4). This result, together with the finding that Alu deletion without PAS1 mutation leads to the utilization of both PAS1 and PAS2 (Fig 3), substantiates our model of PAS2 inactivation upon Alu Sx invasion. This poly(A) site constellation, however, precisely reflects the genomic situation of the bushbaby (Fig 4A), which constitutes the last primate without the Alu Sx element in the F9 3’UTR. In addition, both PAS1 and PAS2 appear to be conserved in afrotheria and carniformia species. Still, we were surprised that a perfect surgical removal of the Alu, thereby sending this genomic locus 38 mya back in time, reactivates PAS2 after it had been silent and prone to mutations for such a long time.
In an additional scenario, we mimicked the antisense insertion of the Alu Sx element (Fig 4E). Interestingly, this modification has led to a residual F9 mRNA expression, which we explain with the presence of a putative DSE located inside the Alu Sx element. The antisense invasion of Alu elements close to a gene body and subsequent mutation of the A-stretch (A>T) to generate a PAS, which is even functional due to the presence of a DSE inside the Alu element, could be a potential mechanism of Alu elements to become vital and indispensable components of genes by generating novel transcripts that could be of advantage during a selection process.
Alu Sx elements are present in the genomes of OWM and NWM. Strikingly, the multiPAS in the Alu Sx of the F9 gene is only conserved in OWM, but not in NWM (Fig 4A). NWM separated approximately 40 mya [34]. Given that Alu Sx elements invaded the primate genomes 35–55 mya, when the Old World and New World had already separated geographically, the question arises how this transposon managed to integrate at identical positions in the genomes of OWM and NWM simultaneously, on both sides of the Atlantic Ocean. As this scenario is rather unlikely, one explanation could be the transatlantic migration of primates into the New World shortly after Alu Sx invasion.
Our results indicate that NWM are not protected against a PAS1 mutation, which in turn implies the lack of an according selective pressure as possibly occurred in the OWM (e.g. a mild PAS1 mutation). In addition, the corresponding sequence motif in the Alu Sx element of the NWMs (ATACATACATA) was screened for RNA-binding factors using RBPmap [35]. However, no putative binding motif was detected, suggestive of the presence of a selective pressure on the preservation of the multiPAS in the Alu Sx element of the OWMs.
Based on the results of the MSA, we could show that the 3’ end of the Alu Sx element in the F9 3’UTR, in particular, differs substantially between OWM and NWM. This specific difference may serve as an important molecular clock landmark for the separation event of OWM and NWM.
Given that Alu Sx invasion (30–50 mya) happened in all analyzed primates on the African continent except the bushbaby, we assume that the separation of OWM and NWM must have occurred shortly after Alu invasion but shortly before a putative selective pressure event that led to the generation of the multiPAS in all OWMs.
By the time of OWM and NWM separation, Old World and New World were already ~2000 km apart, which raises the question of how and where exactly NWM entered South America, as no land bridge was present and sea levels were approximately 200 m higher than today [36]. Accepted explanations for the transatlantic migration of primates include “island hopping” and, in particular, “floating islands” [37–41]. Our finding that none of the analyzed NWMs holds a multiPAS sequence in the F9 Alu Sx substantiates the assumption that most probably a single, or at most a few, transatlantic migration events of primates occurred within the same period, indicative of a founder effect.
Conclusion
In our study, we highlight the beneficial effect of an Alu Sx element, which provides residual FIX activity in a hemophilia B patient through alternative polyadenylation at a multiPAS that has evolved in the 3’ A-stretch. This specific feature is highly conserved in Old World but not in New World primates, underlining the important role of this sequence in the transatlantic separation process of those species.
References
- 1. Fuke H, Ohno M. Role of poly (A) tail as an identity element for mRNA nuclear export. Nucleic Acids Res [Internet]. 2008 Feb [cited 2014 Sep 15];36(3):1037–49. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2241894&tool=pmcentrez&rendertype=abstract. pmid:18096623
- 2. Dreyfus M, Re P. The Poly (A) Tail of mRNAs: Bodyguard in Eukaryotes, Scavenger in Bacteria Minireview. 2002;111:611–3. pmid:12464173
- 3. Hocine S, Singer RH, Grünwald D. RNA processing and export. Cold Spring Harb Perspect Biol [Internet]. 2010 Dec;2(12):a000752. Available from: http://www.ncbi.nlm.nih.gov/pubmed/24267292. pmid:20961978
- 4. Bienroth S, Keller W, Wahle E. Assembly of a processive messenger RNA polyadenylation complex. EMBO J [Internet]. 1993 Feb [cited 2020 Feb 25];12(2):585–94. Available from: http://www.ncbi.nlm.nih.gov/pubmed/8440247. pmid:8440247
- 5. Beaudoing E, Freier S, Wyatt JR, Claverie JM, Gautheret D. Patterns of variant polyadenylation signal usage in human genes. Genome Res. 2000;10(7):1001–10. pmid:10899149
- 6. Li XQ, Du D. RNA polyadenylation sites on the genomes of microorganisms, animals, and plants. PLoS One [Internet]. 2013 Nov 18 [cited 2022 Jun 22];8(11):79511. Available from: /pmc/articles/PMC3832601/. pmid:24260238
- 7. Danckwardt S, Hentze MW, Kulozik AE. 3’ end mRNA processing: molecular mechanisms and implications for health and disease. EMBO J [Internet]. 2008 Feb 6 [cited 2014 Jul 26];27(3):482–98. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2241648&tool=pmcentrez&rendertype=abstract. pmid:18256699
- 8. Barrett LW, Fletcher S, Wilton SD. Regulation of eukaryotic gene expression by the untranslated gene regions and other non-coding elements [Internet]. Vol. 69, Cellular and Molecular Life Sciences. Cell Mol Life Sci; 2012 [cited 2022 Jun 9]. p. 3613–34. Available from: https://pubmed.ncbi.nlm.nih.gov/22538991/. pmid:22538991
- 9. Sheets MD, Ogg SC, Wickens MP. Point mutations in AAUAAA and the poly (A) addition site: effects on the accuracy and efficiency of cleavage and polyadenylation in vitro. Nucleic Acids Res [Internet]. 1990 Oct 11 [cited 2018 Aug 5];18(19):5799–805. Available from: http://www.ncbi.nlm.nih.gov/pubmed/2170946. pmid:2170946
- 10. Millevoi S, Vagner S. Molecular mechanisms of eukaryotic pre-mRNA 3’ end processing regulation. Nucleic Acids Res [Internet]. 2009 Dec 30 [cited 2022 Jun 22];38(9):2757–74. Available from: /pmc/articles/PMC2874999/. pmid:20044349
- 11. Tian B, Manley JL. Alternative polyadenylation of mRNA precursors [Internet]. Vol. 18, Nature Reviews Molecular Cell Biology. Nature Publishing Group; 2016 [cited 2022 Jun 22]. p. 18–30. Available from: /pmc/articles/PMC5483950/.
- 12. Ji Z, Lee JY, Pan Z, Jiang B, Tian B. Progressive lengthening of 3′ untranslated regions of mRNAs by alternative polyadenylation during mouse embryonic development. Proc Natl Acad Sci U S A [Internet]. 2009 Apr 28 [cited 2022 Jun 22];106(17):7028–33. Available from: /pmc/articles/PMC2669788/. pmid:19372383
- 13. Sandberg R, Neilson JR, Sarma A, Sharp PA, Burge CB. Proliferating cells express mRNAs with shortened 3′ untranslated regions and fewer microRNA target sites. Science [Internet]. 2008 Jun 20 [cited 2022 Jun 22];320(5883):1643–7. Available from: /pmc/articles/PMC2587246/. pmid:18566288
- 14. Mayr C, Bartel DP. Widespread Shortening of 3′UTRs by Alternative Cleavage and Polyadenylation Activates Oncogenes in Cancer Cells. Cell [Internet]. 2009 Aug 21 [cited 2022 Jun 22];138(4):673–84. Available from: /pmc/articles/PMC2819821/. pmid:19703394
- 15. Yuan F, Hankey W, Wagner EJ, Li W, Wang Q. Alternative polyadenylation of mRNA and its role in cancer. Vol. 8, Genes and Diseases. Chongqing University; 2021. p. 61–72. pmid:33569514
- 16. Kwon B, Fansler MM, Patel ND, Lee J, Ma W, Mayr C. Enhancers regulate 3′ end processing activity to control expression of alternative 3′UTR isoforms. Nat Commun [Internet]. 2022 Dec 1 [cited 2022 Sep 30];13(1):1–14. Available from: https://doi.org/10.1038/s41467-022-30525-y.
- 17. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, et al. Initial sequencing and analysis of the human genome. Nature [Internet]. 2001 Feb 15;409(6822):860–921. Available from: http://www.ncbi.nlm.nih.gov/pubmed/11237011. pmid:11237011
- 18. Häsler J, Strub K. Alu elements as regulators of gene expression. Nucleic Acids Res [Internet]. 2006 Jan [cited 2014 Jul 19];34(19):5491–7. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1636486&tool=pmcentrez&rendertype=abstract. pmid:17020921
- 19. Kriegs JO, Churakov G, Jurka J, Brosius J, Schmitz J. Evolutionary history of 7SL RNA-derived SINEs in Supraprimates [Internet]. Vol. 23, Trends in Genetics. Trends Genet; 2007 [cited 2022 Jun 9]. p. 158–61. Available from: https://pubmed.ncbi.nlm.nih.gov/17307271/. pmid:17307271
- 20. Ullu E, Tschudi C. Alu sequences are processed 7SL RNA genes. Nature. 1984;312:171–4. pmid:6209580
- 21. Richard Shen M, Batzer MA, Deininger PL. Evolution of the master Alu gene(s). J Mol Evol [Internet]. 1991 Oct [cited 2022 Jun 9];33(4):311–20. Available from: https://pubmed.ncbi.nlm.nih.gov/1774786/. pmid:1774786
- 22. Deininger PL, Batzer MA. Alu repeats and human disease. Mol Genet Metab [Internet]. 1999 Jul;67(3):183–93. Available from: http://www.ncbi.nlm.nih.gov/pubmed/10381326.
- 23. Liang L, Cao C, Ji L, Cai Z, Wang D, Ye R, et al. Complementary Alu sequences mediate enhancer–promoter selectivity. Nature [Internet]. 2023 Jul 27 [cited 2023 Sep 27];619(7971):868–75. Available from: https://www.nature.com/articles/s41586-023-06323-x. pmid:37438529
- 24. Deininger P. Alu elements: know the SINEs. Genome Biol [Internet]. 2011 Jan;12(12):236. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3334610&tool=pmcentrez&rendertype=abstract. pmid:22204421
- 25. Janicic N, Pausova Z, Cole DE, Hendy GN. Insertion of an Alu sequence in the Ca(2+)-sensing receptor gene in familial hypocalciuric hypercalcemia and neonatal severe hyperparathyroidism. Am J Hum Genet [Internet]. 1995 Apr [cited 2019 Sep 15];56(4):880–6. Available from: http://www.ncbi.nlm.nih.gov/pubmed/7717399. pmid:7717399
- 26. Krooss S, Werwitzke S, Kopp J, Rovai A, Varnholt D, Wachs AS, et al. Pathological mechanism and antisense oligonucleotide-mediated rescue of a non-coding variant suppressing factor 9 RNA biogenesis leading to hemophilia B. Cooper GM, editor. PLOS Genet [Internet]. 2020 Apr 8 [cited 2020 Oct 26];16(4):e1008690. Available from: https://dx.plos.org/10.1371/journal.pgen.1008690. pmid:32267853
- 27. Zhao J, Hyman L, Moore C. Formation of mRNA 3’ ends in eukaryotes: mechanism, regulation, and interrelationships with other steps in mRNA synthesis. Microbiol Mol Biol Rev [Internet]. 1999 Jun;63(2):405–45. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=98971&tool=pmcentrez&rendertype=abstract. pmid:10357856
- 28. Li X, Drost JB, Roberts S, Kasper C, Sommer SS. Factor IX mutations in South Africans and African Americans are compatible with primarily endogenous influences upon recent germline mutations. Hum Mutat [Internet]. 2000 [cited 2023 Sep 5];16(4):371. Available from: https://pubmed.ncbi.nlm.nih.gov/11013449/. pmid:11013449
- 29. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, et al. The Human Genome Browser at UCSC. Genome Res [Internet]. 2002 Jun 1 [cited 2022 Sep 30];12(6):996–1006. Available from: www.genome.org. pmid:12045153
- 30. Britten RJ. Evidence that most human Alu sequences were inserted in a process that ceased about 30 million years ago. Proc Natl Acad Sci U S A. 1994 Jun 21;91(13):6148–50. pmid:8016128
- 31. Chen C, Ara T, Gautheret D. Using Alu Elements as Polyadenylation Sites: A Case of Retroposon Exaptation. Mol Biol Evol [Internet]. 2009 Feb 1 [cited 2024 Sep 2];26(2):327–34. Available from: https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msn249. pmid:18984903
- 32. Roy-Engel AM, El-Sawy M, Farooq L, Odom GL, Perepelitsa-Belancio V, Bruch H, et al. Human retroelements may introduce intragenic polyadenylation signals. Cytogenet Genome Res [Internet]. 2005 [cited 2024 Sep 2];110(1–4):365–71. Available from: https://pubmed.ncbi.nlm.nih.gov/16093688/. pmid:16093688
- 33. Kong A, Frigge ML, Masson G, Besenbacher S, Sulem P, Magnusson G, et al. Rate of de novo mutations and the importance of father’s age to disease risk. Nature [Internet]. 2012 Aug 23 [cited 2023 Sep 5];488(7412):471–5. Available from: https://pubmed.ncbi.nlm.nih.gov/22914163/. pmid:22914163
- 34. Perelman P, Johnson WE, Roos C, Seuánez HN, Horvath JE, Moreira MAM, et al. A Molecular Phylogeny of Living Primates. Brosius J, editor. PLoS Genet [Internet]. 2011 Mar 17 [cited 2020 Oct 27];7(3):e1001342. Available from: https://dx.plos.org/10.1371/journal.pgen.1001342. pmid:21436896
- 35. Paz I, Kosti I, Ares M, Cline M, Mandel-Gutfreund Y. RBPmap: A web server for mapping binding sites of RNA-binding proteins. Nucleic Acids Res [Internet]. 2014 Jul 1 [cited 2022 Oct 25];42(W1):W361. Available from: /pmc/articles/PMC4086114/. pmid:24829458
- 36. Miller KG. Sea level change, last 250 million years. Encycl Earth Sci Ser. 2009;649–51.
- 37. Houle A. The origin of platyrrhines: An evaluation of the Antarctic scenario and the floating island model. Am J Phys Anthropol [Internet]. 1999 Aug [cited 2020 Nov 3];109(4):541–59. Available from: https://pubmed.ncbi.nlm.nih.gov/10423268/. pmid:10423268
- 38. Marivaux L, Negri FR, Antoine P-O, Stutz NS, Condamine FL, Kerber L, et al. An eosimiid primate of South Asian affinities in the Paleogene of Western Amazonia and the origin of New World monkeys. Proc Natl Acad Sci [Internet]. 2023 Jul 11 [cited 2023 Sep 25];120(28):e2301338120. Available from: pmid:37399374
- 39. Ciochon RL, Chiarelli AB. Paleobiogeographic Perspectives on the Origin of the Platyrrhini. In: Evolutionary Biology of the New World Monkeys and Continental Drift [Internet]. Springer US; 1980 [cited 2023 Sep 25]. p. 459–93. Available from: https://link.springer.com/chapter/10.1007/978-1-4684-3764-5_23.
- 40. Poux C, Chevret P, Huchon D, de Jong WW, Douzery EJP. Arrival and Diversification of Caviomorph Rodents and Platyrrhine Primates in South America. Soltis P, editor. Syst Biol [Internet]. 2006 Apr 1 [cited 2023 Sep 25];55(2):228–44. Available from: https://academic.oup.com/sysbio/article/55/2/228/1621645.
- 41. Oliveira FB de, Molina EC, Marroig G. Paleogeography of the South Atlantic: a Route for Primates and Rodents into the New World? In: South American Primates [Internet]. Springer New York; 2008 [cited 2023 Sep 25]. p. 55–68. Available from: https://link.springer.com/chapter/10.1007/978-0-387-78705-3_3.