Abstract
The discovery of the ProtoRAG transposon in lancelets revealed that V(D)J recombination originated from the Recombination activating gene-like (RAGL) transposon. Analogous to the vertebrate RAG complex, the RAGL transposase nicks host flanking DNA and leads to the formation of hairpin ends. Here, we showed that the Artemis nuclease, which resolves the DNA hairpin ends generated during V(D)J recombination, is also responsible for opening ProtoRAG-mediated DNA hairpin ends. Notably, like the RAGL transposon, Artemis originated from the eukaryotic common ancestor. By tracing the evolving function of Artemis from cephalochordates to vertebrates, we revealed the lineage-specific allele polymorphism of lancelet Artemis and uncovered an increased hairpin-opening activity in vertebrate Artemis. Additionally, the evolutionarily conserved LYCS motif in Artemis β6, which may be associated with disease, is demonstrated to be crucial for its function. Overall, this study highlights the evolving function of Artemis, identifies novel critical residues, and provides new insights into the evolution of RAG-mediated recombination and the clinical therapy of Artemis-deficient disease.
Citation: Huang Z, Cai Z, Tao X, Wang X, Tian X, Chen F, et al. (2025) Cross-species analysis of the nuclease Artemis highlights its evolving function in domesticating RAG-like transposons and residues that are crucial for activity. PLoS Biol 23(4): e3003056. https://doi.org/10.1371/journal.pbio.3003056
Academic Editor: David Nemazee, Scripps Research Institute, UNITED STATES OF AMERICA
Received: August 25, 2024; Accepted: February 6, 2025; Published: April 17, 2025
Copyright: © 2025 Huang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data needed to evaluate the conclusions in this study are present in the paper and the Supporting information.
Funding: This work was supported by the National Natural Science Foundation of China (31970852 and 32170888 to S. Y); Guangdong Science and Technology Department (2024B1515040009, 2024B1111130003 and 2023B1212060028 to S. Y), and National Key Research and Development Project (2019YFC1710104 to A. X). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: Bbe, Branchiostoma belcheri; CJs, coding joints; COAD, colon adenocarcinoma; Dre, Danio rerio; HDJ, host DNA joint; ICL, interstrand crosslink; Igs, immunoglobulins; KIRP, kidney renal papillary cell carcinoma; LIHC, liver hepatocellular carcinoma; Lja, Lampetra japonicum; MBL, metallo-beta-lactamase; NHEJ, non-homologous end joining; RAG, Recombination activating gene; RAGL, RAG-like; RS-SCID, radiosensitive severe combined immune deficiency; RSSs, recombination signal sequences; SJs, signal joints; SNM1, sensitivity to nitrogen mustard 1; TCRs, T-cell receptors; TIRs, terminal inverted repeats; TTJ, TIRs joint; VLRs, variable lymphocyte receptors
Introduction
The highly diverse repertoire of immunoglobulins (Igs) and T-cell receptors (TCRs) is a hallmark of adaptive immunity in jawed vertebrates. Igs and TCRs are assembled by V(D)J recombination, which is initiated by the synapsis and cleavage of a complementary (12/23) pair of recombination signal sequences (RSSs) by the RAG1 and RAG2 proteins (Recombination activating gene [RAG]). The RAG complex creates a double-strand break between the RSS and the coding gene segment, generating a precise, blunt signal end and a coding end with a covalently sealed hairpin. Formation of both signal joints (SJs) and coding joints (CJs) requires the non-homologous end joining (NHEJ) mechanism. Specifically, coding joint formation requires the nuclease Artemis to open the sealed coding ends [1]. In Artemis-deficient lymphocytes, coding ends but not signal ends accumulate, and both B and T cells arrest at an early stage, underscoring the immunological importance of Artemis [2].
Since V(D)J recombination occurs only in jawed vertebrates, its origin has long attracted attention in the fields of immunology and evolution. A previous study revealed that the characteristics of RSSs resemble the terminal inverted repeats (TIRs) found at the ends of prokaryotic insertion elements [3]. Additionally, the RAG complex exhibits transposase activity in vitro, leading to the hypothesis that vertebrate V(D)J recombination originated from a RAG-like (RAGL) DNA transposon [4]. The discovery of ProtoRAG in the lancelet Branchiostoma belcheri, a basal chordate, provided direct evidence for this transposon hypothesis [5]. Following the discovery of ProtoRAG, the origin of the RAGL transposon was traced back to eukaryotic unicellular organisms, and it is believed to have descended directly from the bacterial transposon Transib [6]. Similar to the activity of the RAG complex in V(D)J recombination, the RAGL transposase encoded by ProtoRAG excises its TIRs and generates hairpin-tipped DNA ends in vitro [5]. Processing of these hairpin ends on the host genome after excision by the RAGL transposase is important for host genome stability. However, how these hairpin ends are processed during ProtoRAG-mediated transposition remains unclear.
Artemis belongs to the sensitivity to nitrogen mustard 1 (SNM1) gene family, which is characterized by the metallo-beta-lactamase (MBL) and β-CASP domains [7]. Three mammalian members of this family, SNM1A, SNM1B (also known as Apollo) and Artemis (also known as SNM1C/DCLRE1C), have been identified and shown to play various roles in DNA metabolism [8]. SNM1A and SNM1B contribute predominantly to interstrand crosslink (ICL) repair, while Artemis primarily participates in non-ICL repair pathways, including processing of 5′ and 3′ overhangs as well as hairpin DNAs [1]. Structural analysis revealed that a second zinc ion-binding site in the β-CASP domain is unique to Artemis, which helps explain its predominantly endonucleolytic activity, in contrast with the exonuclease activity of the closely related SNM1A and SNM1B nucleases [9]. Moreover, when processing the DNA hairpin ends generated by the RAG complex in V(D)J recombination, Artemis interacts with DNA-PKcs (encoded by the PRKDC gene) and is phosphorylated [1,10]. Mutations in some conserved residues, such as D17, H33, H35, D37, H38, H115, D136, D165, H254, H319 and E324, can abolish the function of the Artemis protein and lead to radiosensitive severe combined immune deficiency (RS-SCID) [7,11,12]. Thus, identifying the functional residues in Artemis remains important for revealing its activation mechanism and providing new strategies for the diagnosis of RS-SCID.
To gain more insight into the evolution of the V(D)J recombination mechanism from the perspective of DNA repair, we revisited the nuclease Artemis, an important partner of the RAG complex for V(D)J recombination, and found that lancelet Artemis is truly responsible for DNA hairpin opening after ProtoRAG-mediated transposition. Interestingly, we found that lancelet Artemis exhibits allele polymorphism among individuals and identified the R58 residue that is crucial for its activity. By revealing the highly conserved and functionally important residues of Artemis among species, we further demonstrated that the conserved LYCS motif in Artemis β6 is important for its function in processing various DNA ends.
Results
Artemis originated from the common ancestor of eukaryotes and is highly conserved during vertebrate evolution
Artemis has been previously identified in fungi, invertebrates and vertebrates [13]. To further reveal its origin, we used the TBLASTN program to search for Artemis homologs using fungal Artemis as the query sequence and identified additional Artemis homologs in protists and plants (Figs 1A, S1 and S1 Table). Most Artemis proteins contain an N-terminal MBL domain and β-CASP domain and a C-terminal region (Fig 1B). The N-terminus has a critical catalytic function, whereas the C-terminus plays a role in functional regulation [7,14]. Previously, structural analysis of Artemis revealed a new zinc-binding motif containing two histidine and two cysteine residues, which was proposed to be an adaptation that allows Artemis to process several different DNA substrate structures, such as DNA overhangs and DNA hairpins [9]. Here, we found that this zinc-binding motif and other motifs essential for the activity of human Artemis, such as motifs I–IV and motifs A–C, are also conserved in Artemis from protists to mammals (Fig 1A).
(A) The Bayesian tree was constructed with MrBayes based on the MBL and β-CASP domains of Artemis, SNM1A and SNM1B. The numbers indicate the posterior probability. The right panel displays a multiple sequence alignment of Artemis proteins, alongside mouse SNM1A and SNM1B. Conserved residues are highlighted with a red background. Black stars indicate the conserved residues composing motifs I–IV and motifs A–C in Artemis. (B) Intron phase composition of Artemis from invertebrate and vertebrate representatives. On the left, colored rectangles in light orange, light blue and light yellow represent the three distinct intron phases. On the right, colored rectangles in orange, light green and gray correspond to the MBL, β-CASP and C-terminus domains of the Artemis protein, respectively. Species abbreviations and accession numbers of Artemis proteins are shown in S1 Table.
Based on a multiple sequence alignment of the MBL and β-CASP domains of Artemis, SNM1A and SNM1B, we then constructed a Bayesian tree using CPSF73 from archaea and yeast as outgroups [15] (Figs 1A and S1). CPSF73 is a member of the β-CASP family and is involved in RNA processing. Consistent with a previous study [13], the derived tree suggests three major clades of SNM1 proteins: the SNM1A, SNM1B and Artemis clades. Among them, SNM1A is the most ancient clade (Fig 1A). In the Artemis clade, the newly identified Artemis proteins in choanoflagellates, amoebozoans, cryptomonads and palpitomonas represent ancient homologs. Since Artemis is involved in processing different DNA substrate structures in the NHEJ repair pathway, while SNM1B plays a role in telomere maintenance in association with the shelterin complex [16,17], the emergence of Artemis and SNM1B may have benefited the maintenance of genomic stability in eukaryotes.
As intron phase is a feature of gene structure that can reveal gene evolution among species [18], we then analyzed the intron phases of Artemis homologs among various species and found that some arthropod and echinoderm Artemis genes are encoded by a single exon, such as Artemis in Dermacentor silvarum and Anneissia japonica (S1 Table). The intron phases of vertebrate Artemis are highly conserved, while those of invertebrate Artemis are variable (Fig 1B and S1 Table). In addition, our analysis revealed that exons 1–6, 7–13, and 14 encode the MBL domain, β-CASP domain, and the C-terminus, respectively, in all vertebrate Artemis proteins. In contrast, these domains are encoded by varying exons in invertebrate Artemis genes (Fig 1B), suggesting that vertebrate Artemis may have undergone purifying selection due to its important functions.
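As a minimal illustration of what intron phase means computationally, the sketch below derives the phase of each intron from the coding lengths of consecutive exons: the phase is the number of coding bases preceding the intron, modulo 3. The function name `intron_phases` and the example exon lengths are hypothetical; they are not taken from S1 Table or from the analysis pipeline used in this study.

```python
from itertools import accumulate


def intron_phases(cds_exon_lengths):
    """Return the phase (0, 1 or 2) of every intron in a gene.

    cds_exon_lengths: coding-sequence length (in bases) of each exon,
    in transcript order. UTR bases are assumed to be excluded already.
    """
    # Cumulative coding length at the end of every exon except the last,
    # because an intron follows each exon but the final one.
    boundaries = list(accumulate(cds_exon_lengths))[:-1]
    # Phase 0: intron lies between codons; phase 1 or 2: intron splits a codon.
    return [position % 3 for position in boundaries]


# Hypothetical three-exon gene: introns fall after 100 and 150 coding bases.
example_phases = intron_phases([100, 50, 33])
print(example_phases)
```

Under this definition, a single-exon gene, such as those noted above for some arthropods and echinoderms, yields an empty list of phases.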
Lancelet Artemis is responsible for joining ProtoRAG-cleaved DNA ends
Similar to the detection of CJ and SJ formation in V(D)J recombination, we have previously designed two GFP reporter constructs to test the cleavage activity of the lancelet B. belcheri (Bbe) BbeRAG1L/2L transposase complex [5]. One detects the formation of the host DNA joint (HDJ) and the other detects the formation of the blunt TIRs joint (TTJ) (Figs 2A and S2A). Mechanistically, BbeRAG1L/2L recognizes the TIR sequences in the substrates, leading to the cleavage of two TIRs and the removal of the PolyA sequence. Subsequently, the flanking sequences of the substrates are resealed by the NHEJ repair mechanism, which leads to the formation of HDJ and restores the normal expression of the GFP protein in the cells. The efficiency of HDJ and TTJ formation can be further quantified using flow cytometry and PCR assays (Figs 2A and S2A). To test the involvement of Artemis in processing the DSBs generated by the BbeRAG1L/2L complex, endogenous Artemis was knocked out in 293T cells (293TArtemis−/−) (Fig 2B), and the two previously designed constructs were used to test the function of Artemis in HDJ formation after cleavage by BbeRAG1L/2L. Flow cytometry assays showed no GFP expression in 293TArtemis−/− cells when the BbeRAG1L/2L complex was co-transfected with the HDJ construct (Fig 2C). Consistently, PCR assays could not detect the products of HDJ formation in 293TArtemis−/− cells (Fig 2D). In contrast, transient expression of human Artemis in 293TArtemis−/− cells rescued the deficiency of HDJ formation (Fig 2C). These results suggest that Artemis is indispensable for the repair of BbeRAG1L/2L-cleaved DNA hairpin ends. Furthermore, ligation-mediated polymerase chain reaction (LM-PCR) could not detect a significant difference in the generation of TIR ends between 293T and 293TArtemis−/− cells (Fig 2D), suggesting that the deficiency of Artemis did not affect TIR cleavage by BbeRAG1L/2L. It is worth noting that both flow cytometry and PCR assays revealed a slight decrease in TTJ formation in 293TArtemis−/− cells (S2B and S2C Fig).
(A) Schematic diagrams of the GFP reporter assay and PCR assay used to measure BbeRAG1L/2L-mediated DNA excision and recombination. HDJ indicated host DNA joint, which is formed when the flanking DNA hairpins were opened and ligated. TIRE, TIR end. After successful recombination, GFP was normally expressed and quantification of the GFP-positive cells were performed by flow cytometry. The HDJ and TIRE products were also detected by PCR using primer P1, P2 and P3, P4, P5, respectively. Unfilled and filled triangles indicated the 5′TIR and 3′TIR sequences, respectively. P1-P5 indicated the PCR primers and their sequences were presented in S2 Table. (B) Artemis was knocked out through CRISPR/Cas9 in 293T cells. The sgRNA sequence was presented in S2 Table. (C) Artemis is essential for the host’s direct joining after the cleavage of BbeRAG1L/2L in 293T cells. (D) PCR assays revealed defective HDJ formation but not TIR cleavage by BbeRAG1L/2L in 293TArtemis−/− cells. URS, un-recombination substrate. 3TIRE, 3′ TIR end. 5TIRE, 5′ TIR end. (E) Artemis alleles were cloned from five lancelet individuals using two distinct primer pairs. The U3 and L3 primer pair, designed within the CDS, was used to clone the Artemis coding sequences from cDNAs. U4 and L4 primer pair was used to clone the last exon of lancelet Artemis from genomic DNAs and confirm the cloning results of U3 and L3 paired primer. All primer sequences were listed in S2 Table. Bbe#A-E represented different lancelet individuals. Numbers in parentheses indicated number of clones that we have sequenced. R58 or Q58 indicated Artemis allele that encoded the arginine or glutamine at residue 58. Ticks in parentheses indicated that the obtained Artemis allele sequences were confirmed by genomic PCR assays using U4 and L4 primer pair. (F) The polymorphism frequencies (%) for SNPs of two lancelet Artemis alleles and two Gapdh alleles from five individuals. 
Notably, only one Gapdh allele sequence was obtained from Bbe#E after we sequenced 10 clones. (G) Divergent activity of lancelet Artemis in HDJ formation after BbeRAG1L/2L-mediated cleavage of substrates in 293TArtemis−/− cells. (H) Schematic diagrams of the bacterial colony assay. pTIR104 and PJH290 are the substrates for BbeRAG1L/2L and mouse RAG1/2, which can detect the repaired sequences of HDJ and CJ, respectively. (I) Sequence alignment of the HDJ products recovered from the bacterial colony assays revealed that BbeArtemisL2 promoted palindromic nucleotides addition in 293TArtemis−/− cells. P, palindromic nucleotides addition. Plus and minus signs indicate the number of nucleotides added or cleaved in the putative products, respectively. (J–L) Cleavage of DNA hairpin substrates (J), 3′DNA overhang substrates (K) and single strand DNA substrates (L) by BbeArtemisL1, BbeArtemisL2 and BbeArtemisL6 proteins in manganese ion buffer in vitro. NC, negative control by adding dialysis buffer instead of Artemis protein. Sequences used for various DNA substrate preparation were shown in S2 Table. The asterisk indicated Cy5 fluorophores were labeled. Raw data can be found in Supporting information (S1 Raw Images and S1 Data files). All flow cytometry data represented the mean ± SD of three biological replicates. P-values were analyzed with Student t test. ns, not significant.
The above observations suggested that human Artemis deficiency might specifically inhibit the repair of DNA hairpin ends formed during ProtoRAG transposition. To reveal whether lancelet Artemis is involved in this process, we cloned BbeArtemis from cDNAs of lancelet B. belcheri using primers designed from its 5′- and 3′-UTRs. Seven sequences exhibiting 92.1–97.4% sequence identity were obtained from seven individuals and named BbeArtemisL1-L7 (S2D Fig). All BbeArtemis proteins contain the MBL and β-CASP domains at their N-terminus and an unexpected EK repeat region at their C-terminus (S2E Fig). Since there is only one gene copy of Artemis in the genome of lancelets (S2F Fig), we then cloned two alleles of BbeArtemis from each individual using primers designed within the conserved coding sequence and obtained Artemis allele sequences from five lancelet individuals (Fig 2E). We also cloned the last exon of BbeArtemis from the genomic DNAs of the five lancelet individuals to confirm the cDNA cloning results (Fig 2E). Sequence comparison indicated that the two alleles from one individual show polymorphism and splicing dynamics (S3A–S3F Fig). The average SNP frequency of lancelet Artemis alleles was about 4.8%, comparable to that of the lancelet diploid genome (4.39%) [19], but higher than that of lancelet Gapdh alleles, which is about 1.53% (Fig 2F). Moreover, some splicing dynamics may result in early termination of translation and functional inactivation of BbeArtemis (S3A–S3F Fig). Thus, the allele polymorphism of lancelet Artemis may result in functional variation among individuals.
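To make the notion of an allelic SNP frequency concrete, the following sketch computes the percentage of mismatched positions between two pre-aligned allele sequences. It is only one plausible way to obtain such a figure: the exact procedure used in this study is not described here, and the function name `snp_frequency`, the choice to skip gap columns, and the example sequences are all assumptions made for illustration.

```python
def snp_frequency(allele_a, allele_b, gap="-"):
    """Percentage of mismatched bases between two aligned allele sequences.

    Both sequences must come from the same alignment (equal length).
    Columns containing a gap in either allele are skipped (assumption),
    so indels do not count as SNPs.
    """
    if len(allele_a) != len(allele_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    mismatches = 0
    for base_a, base_b in zip(allele_a.upper(), allele_b.upper()):
        if base_a == gap or base_b == gap:
            continue  # ignore indel columns
        compared += 1
        if base_a != base_b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Hypothetical 10-base alleles that differ at a single position.
print(snp_frequency("ACGTACGTAC", "ACGTACGAAC"))
```

Averaging this value over the allele pairs of several individuals would give a per-gene figure of the kind compared between Artemis and Gapdh above.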
To reveal how BbeArtemis is involved in ProtoRAG-mediated transposition and whether the allele polymorphism can affect its activity, seven full-length Artemis sequences (BbeArtemisL1 to L7) were used for the HDJ GFP reporter assays. The results showed that only BbeArtemisL2 and BbeArtemisL6 could promote HDJ formation (Fig 2G). To further confirm the functional difference among BbeArtemis proteins, we utilized another construct that detects coding joint (CJ) formation after cleavage by mouse RAG (MmuRAG1/2) in 293TArtemis−/− cells (S2G Fig). Similar to the formation of HDJ, when MmuRAG1/2 recognizes the RSSs on the substrates and initiates cleavage, the PolyA sequence is removed. The coding joints are then ligated, allowing normal expression of the GFP reporter. Consistently, the GFP reporter assay demonstrated variable activity of BbeArtemis proteins in mediating CJ formation in 293TArtemis−/− cells (S2H Fig). Furthermore, the efficiency of CJ formation was higher than that of HDJ formation. This suggests that the MmuRAG1/2-mediated GFP reporter assay may be more sensitive in detecting the activity of Artemis than the assay based on the BbeRAG1L/2L complex.
To show features of the repaired sequences, we then used another two constructs named pTIR104/PJH290 to recover the HDJ or CJ constructs from 293TArtemis−/− cells overexpressing BbeArtemisL2, BbeArtemisL6 or human Artemis, respectively. As indicated in Fig 2H, if the flanking ends of the constructs were ligated after BbeRAG1L/2L or MmuRAG1/2-mediated cleavage, the chloramphenicol resistance gene CAT (Chloramphenicol acetyltransferase) within substrates can successfully express. After transforming the recovered HDJ or CJ constructs into E. coli DH5α, the successfully ligated constructs could grow on chloramphenicol-containing medium plates. By sequencing the recovered plasmids, short palindromic nucleotides were observed in the recombination products when co-expressed with BbeArtemisL2, BbeArtemisL6 or HsaArtemis (Figs 2I, S2I and S2J), further supporting that lancelet Artemis can open DNA hairpins during ProtoRAG transposition and induce nucleotide additions.
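The link between hairpin opening and palindromic nucleotide additions can be sketched in a few lines. In the general model, when a hairpin is nicked a few bases away from its tip, the bases between the nick and the tip remain attached to the opposite strand and form a self-complementary extension of the coding end. The code below illustrates this model only; the function names, the `nick_offset` parameter, and the example sequence are hypothetical and do not reproduce the analysis behind Fig 2I.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def open_hairpin(coding_end, nick_offset):
    """Model off-tip opening of a hairpin-sealed coding end.

    coding_end: top-strand sequence (5'->3') ending at the hairpin tip.
    nick_offset: number of bases from the tip, on the bottom strand,
        at which the nick occurs; 0 means opening exactly at the tip.
    Returns (extended_top_strand, p_nucleotides).
    """
    if not 0 <= nick_offset <= len(coding_end):
        raise ValueError("nick_offset must lie within the coding end")
    coding_end = coding_end.upper()
    # Bottom-strand bases between the tip and the nick stay joined to the
    # top strand; they are the reverse complement of its last bases.
    tail = coding_end[len(coding_end) - nick_offset:]
    p_nucleotides = reverse_complement(tail)
    return coding_end + p_nucleotides, p_nucleotides


# Hypothetical coding end nicked two bases away from the tip.
extended, p_nt = open_hairpin("GGATC", 2)
print(extended, p_nt)
```

In this toy case the last two bases of the coding end (TC) are followed by their reverse complement (GA), giving the palindrome TCGA at the junction, which is the kind of signature scored as "P" in junction alignments.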
In addition to the GFP reporter and bacterial colony assays ex vivo, we prepared Cy5-labeled DNA hairpin substrates and purified lancelet Artemis proteins from 293T cells for the in vitro cleavage assays (S2K and S2L Fig). Results showed that BbeArtemisL2 and BbeArtemisL6 are more efficient than BbeArtemisL1 in processing DNA hairpin, 3′DNA overhangs and single strand DNA (ssDNA) in manganese buffer in the absence of DNA-PKcs in vitro (Fig 2J–2L), further confirming divergent nuclease activity of Artemis among lancelet individuals.
The C-terminus and residue 58 are responsible for the divergent activity of lancelet Artemis
As observed above, distinct BbeArtemis proteins differ in function. To reveal the key domain or residues that may affect the activity of lancelet Artemis, we used the PAML [20] program to analyze positively selected sites based on the nucleotide sequences of lancelet Artemis that we had obtained. We found that residues under positive selection pressure were concentrated in the C-terminus of lancelet Artemis (Fig 3A), similar to primate Artemis, which has positively selected residues in its C-terminus [21]. Then, several truncated mutants lacking the C-terminus of lancelet Artemis and human Artemis were constructed (Fig 3B). As the GFP reporter assays showed, both BbeArtemisL2A and BbeArtemisL6A, two mutants with deletion of both the C2 region and the repeat sequences, exhibited higher activity than the full-length BbeArtemis in HDJ formation in 293TArtemis−/− cells (Fig 3B and 3C). However, the efficiency of HDJ formation decreased in BbeArtemisL2B and BbeArtemisL6B, two mutants with deletions of the C1, repeat and C2 regions (Fig 3B and 3C). Since the dynamic EK repeat region is unique to BbeArtemis, to elucidate its function in BbeArtemis activity, we further constructed vectors with deletion of the repeat sequence of BbeArtemisL1, BbeArtemisL2, and BbeArtemisL6 (Fig 3B) and then performed the GFP reporter assay. We first examined the expression of these deletion mutants and found that the expression level of BbeArtemisL2_ΔR was higher than that of BbeArtemisL2, whereas BbeArtemisL6_ΔR showed lower expression compared to BbeArtemisL6 (S3G Fig). BbeArtemisL2_ΔR was then found to exhibit higher endonuclease activity than BbeArtemisL2, whereas BbeArtemisL6_ΔR showed the opposite result in 293TArtemis−/− cells (Fig 3D). Thus, the dynamic EK repeat region of BbeArtemis alleles is hypothesized to play a regulatory role, potentially modulating protein stability and enzymatic activity. 
Notably, the EK repeat regions within various BbeArtemis alleles are enriched with “AG” dinucleotides, which are recognized as canonical splice acceptor sites. Our analysis of BbeArtemis splice variants further indicated that these “AG” dinucleotides potentially serve as splice acceptors (S4 Fig), suggesting that the dynamic EK repeats may play a role in contributing to the diversity of gene expression.
(A) Codons under positive selection pressure in lancelet Artemis with a posterior probability (P) identified by Bayes empirical Bayes (BEB) analysis. ( *P > 0.95, **P > 0.99). Coordinates correspond to the BbeArtemisL1. PSS, positive selection site. (B) Diagrams showing the truncated mutants of lancelet and human Artemis. (C) Activity of the truncated lancelet and human Artemis in HDJ formation after the cleavage of BbeRAG1L/2L in 293TArtemis−/− cells. FL, full length Artemis. ΔR indicated the repeat sequence was deleted. (D) BbeArtemisL2_ΔR demonstrated increased endonuclease activity relative to BbeArtemisL2, whereas BbeArtemisL6_ΔR exhibited decreased endonuclease activity compared to BbeArtemisL6 in 293TArtemis−/− cells. (E) Multiple sequence alignment showed variant residues (indicated in red background) among the BbeArtemisL1A, BbeArtemisL2A and BbeArtemisL6A. The numbers in red indicated residues that are identical in BbeArtemisL2 and BbeArtemisL6 while different in BbeArtemisL1. (F) The table indicated the replaced residues in BbeArtemisL2A and BbeArtemisL1A. D17A mutation in BbeArtemisL2A was used as the negative control. (G) GFP reporter assay showed that residue 58 is important for the activity of BbeArtemisL2A in HDJ formation after the cleavage of BbeRAG1L/2L in 293TArtemis−/− cells. (H) Quantification of GFP-positive cells revealed that substitution of Q58 by arginine in BbeArtemisL1A increased its activity in HDJ formation after the cleavage of BbeRAG1L/2L in 293TArtemis−/− cells. (I) Divergent activity of BbeArtemisL8-L12 in HDJ formation after the cleavage of BbeRAG1L/2L. BbeArtemisL8-L11 were obtained from five lancelet individuals Bbe#8-#12, respectively. (J) BbeArtemisL8-L12 showed distinct activity in CJ formation in 293TArtemis−/− cells. Raw data can be found in Supporting information (S1 Raw Images and S1 Data files). All flow cytometry data represented the mean ± SD of three biological replicates. P-values were analyzed with Student t test. 
ns, not significant.
Although the C-terminal repeat and the C2 region may inhibit the activity of lancelet Artemis, deletion of the C-terminal repeat and the C2 region of BbeArtemisL1 could not rescue its deficiency in promoting HDJ formation in 293TArtemis−/− cells (Fig 3B and 3C). To further reveal why BbeArtemis proteins show variable activities, multiple sequence alignment of the BbeArtemisL1A, BbeArtemisL2A and BbeArtemisL6A sequences was performed. The result showed that 21 residues differ among these sequences (Fig 3E). Four residues, L24, R58, R357, and D469, are identical in BbeArtemisL2 and BbeArtemisL6 but different in BbeArtemisL1 (Fig 3E). Independent substitution of these four amino acids in BbeArtemisL2A demonstrated that the R58Q mutation decreased its activity in 293TArtemis−/− cells (Fig 3F and 3G). In contrast, substitution of Q58 with arginine in BbeArtemisL1A increased its activity in HDJ formation (Fig 3F and 3H), suggesting that R58 is essential for the activity of lancelet Artemis. To confirm this observation, we then cloned another five Artemis sequences from individuals Bbe#8-#12, which were named BbeArtemisL8-L12. GFP reporter assays showed that, except for BbeArtemisL9, the other four lancelet Artemis proteins promoted both HDJ and CJ formation in 293TArtemis−/− cells (Fig 3I and 3J). The R58 residue, which was demonstrated to be critical for the activity of lancelet Artemis, is replaced by glutamine in BbeArtemisL9 (S5 Fig), further confirming that R58 is vital for the activity of lancelet Artemis. Interestingly, we obtained only R58/R58 or R58/Q58 allele pairs, but never Q58/Q58, from any one individual (Fig 2E), suggesting that a certain level of Artemis activity is retained in each lancelet individual.
Functional evolution of Artemis from invertebrates to vertebrates
To reveal whether individual polymorphisms of Artemis commonly exist in the early vertebrates, we then cloned Artemis from the jawless vertebrate lamprey (Lampetra japonicum, Lja) and the bony fish zebrafish (Danio rerio, Dre), two basic vertebrate representatives. Eight sequences (LjaArtemisL1–L8) with 97.9–99.7% identity were cloned from nine lampreys sourced from the Songhua River (S6A Fig). Similarly, four Artemis protein sequences (DreArtemisL1–L4), exhibiting 99.1–99.8% identity due to codon degeneracy, were successfully cloned from eight laboratory-cultured zebrafish (S6B Fig). The average SNP rate of LjaArtemis was calculated to be 0.63%, which is notably lower than that observed in BbeArtemis (Fig 4A). We subsequently employed the MmuRAG1/2-mediated GFP reporter assay to investigate whether the Artemis proteins from lamprey and zebrafish exhibit divergent activities, as observed with lancelet Artemis. The results revealed that neither zebrafish nor lamprey Artemis proteins showed functional differences (Fig 4B and 4C). Bacterial colony assays further revealed that both lamprey and zebrafish Artemis were involved in DNA hairpin opening and could add short palindromic nucleotides in the recombination products, like lancelet and human Artemis (S6C and S6D Fig). Then, we purified distinct Artemis proteins from 293T cells for in vitro nuclease assays (Fig 4D). The cleavage results revealed that both lamprey and human Artemis are more efficient than lancelet Artemis in processing DNA hairpin substrates (Fig 4E). However, lancelet Artemis can process ssDNA and 3′DNA overhang substrates as efficiently as human Artemis (Fig 4F and 4G). The GFP reporter assay further validated that the activity of lancelet Artemis in joining the DNA hairpin ends generated by BbeRAG1L/2L or MmuRAG1/2 cleavage was lower than that of lamprey and zebrafish Artemis in 293TArtemis−/− cells (Figs 4H and S6E). 
However, no significant difference in joining MmuRAG1/2-cleaved DNA hairpin ends was observed among Artemis proteins from lamprey, zebrafish and humans in 293TArtemis−/− cells (Fig 4H). Thus, vertebrate Artemis exhibits greater activity in DNA hairpin opening and displays less individual polymorphism compared to lancelet Artemis.
(A) SNP rate (%) of LjaArtemis alleles among different individuals. (B) Activity of LjaArtemisL1–L7 in CJ formation in 293TArtemis−/− cells. (C) DreArtemisL1–L4 showed no significant difference in activity in CJ formation after cleavage by MmuRAG1/2 in 293TArtemis−/− cells. (D) Analysis of protein purity by SDS-PAGE and Coomassie brilliant blue (CBB) staining. (E–G) Comparison of the nuclease activity of Artemis proteins on DNA hairpin substrates (E), ssDNA substrates (F) and 3′ DNA overhang substrates (G) in manganese ion buffer in vitro. The red asterisk indicates the cleavage product of BbeArtemisL6. The H35A mutant of human Artemis was used as a negative control. (H) Vertebrate Artemis proteins are significantly more effective than invertebrate Artemis and depend on DNA-PKcs in HDJ formation after cleavage by BbeRAG1L/2L in 293TArtemis−/− cells. (I) BbeArtemisL6 was phosphorylated in 293T cells. PPI, phosphatase inhibitor. L-PP, Lambda protein phosphatase. Anti-P-Ser, Anti-Phospho-Serine. (J) LjaArtemisL1, DreArtemisL1 and human Artemis were phosphorylated in 293T cells. (K) The DNA-PKcs (PRKDC) gene was knocked out through CRISPR/Cas9 in 293T cells (293TPRKDC−/−) and 293TArtemis−/− cells (293TDKO). The sgRNA sequence is presented in S2 Table. (L) The activity of BbeArtemisL6 as well as vertebrate Artemis is DNA-PKcs-dependent ex vivo. Raw data can be found in Supporting information (S1 Raw Images and S1 Data files). All flow cytometry data represent the mean ± SD of three biological replicates. P values were analyzed with Student t test. ns, not significant.
Given that DNA-PKcs is indispensable for the activity of mammalian Artemis in opening DNA hairpins in vivo [1], we further explored the functional conservation of Artemis across species. We first performed sequence analysis and found that the ABCDE autophosphorylation sites [22] and the three sites at which Artemis interacts with DNA-PKcs [23] are highly conserved across different species (S6F Fig). Then, to determine whether Artemis from different species can be phosphorylated in mammalian cells, we treated 293T cells transfected with distinct Artemis constructs with lambda protein phosphatase or phosphatase inhibitors. We observed that all the tested Artemis proteins could be phosphorylated in 293T cells (Fig 4I and 4J). Next, 293TArtemis−/− cells were treated with NU7441, a specific inhibitor of DNA-PKcs, which impaired both HDJ and CJ formation in cells expressing distinct Artemis proteins (Figs 4H and S6E). In addition, we generated PRKDC knockout cells (293TPRKDC−/−) and PRKDC and Artemis double knockout cells (293TDKO) (Fig 4K). As shown in Fig 4L, CJ formation was completely inhibited in 293TPRKDC−/− and 293TDKO cells even when distinct Artemis constructs were transfected, suggesting the importance of DNA-PKcs in the activation of both invertebrate and vertebrate Artemis ex vivo.
The highly conserved residues in Artemis β6 are essential for its activity
As described above, allele polymorphisms associated with activity variations were identified in lancelet Artemis across different individuals. Earlier studies have also revealed that missense mutations and substitutions of conserved residues in human Artemis can impair its enzymatic activity, potentially resulting in SCID [7,24–27]. To identify additional residues essential for the function of human Artemis by comparative analysis, we prepared a multiple sequence alignment of Artemis and found that L31, L59, Y60, C61, S62, C116, R138, E213, C256, C272, S320 and S321 are evolutionarily conserved (Fig 5A). We then used the GFP reporter assay to test whether substituting these conserved residues with alanine affected the activity of human Artemis. The H35A mutant was used as a negative control [7]. The results demonstrated that replacing any of six residues (L31, L59, Y60, E213, C256 and C272) with alanine in human Artemis did not affect protein expression but significantly decreased its activity in opening DNA hairpins generated by the mouse RAG1/2 complex in 293TArtemis−/− cells (Figs 5B and S7A). Independent replacement of these residues in lamprey Artemis further confirmed their functional importance (S7B Fig).
(A) Multiple sequence alignment showed evolutionarily conserved residues of Artemis from protists to mammals. Species abbreviation was shown in S1 Table. (B) Quantification of GFP-positive cells represented the activity of HsaArtemis mutants in CJ formation in 293TArtemis−/− cells. (C) SNPs that are predicted to be related to RS-SCID in human Artemis. Data are obtained from ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/). (D) Several mutations of human Artemis found in ClinVar decreased its activity in CJ formation after cleavage of MmuRAG1/2 in 293TArtemis−/− cells. (E and F) Cleavage of DNA hairpin substrates (E) and 3’DNA overhang substrates (F) by human Artemis and its mutants in manganese ion buffer in vitro. H35A mutant of human Artemis was used as a negative control. (G) The substitution of evolutionarily conserved residues in LjaArtemisL1 decreased its endonuclease activity in processing DNA hairpin substrates with manganese ion buffer in vitro. NC, negative control by adding dialysis buffer instead of Artemis proteins. (H) Replacing the conserved residues of BbeArtemisL6 decreased its activity in HDJ formation after the cleavage of BbeRAG1L/2L. (I) The predicted structures of wild type, L59R, Y60R, C61R and S62R mutants of human Artemis revealed that the β6 structure and DNA binding pockets of human Artemis were disrupted in these mutants. The structures of Artemis mutants were predicted using SWISS-MODEL and visualized by PyMOL. Yellow spheres represented zinc ions at the active site and cyan sticks indicated the mutated residues. β6 structure in human Artemis was encircled with a red dashed line. More structural details were also presented in S7F Fig. (J) The activity of HsaArtemis mutants in CJ formation was reduced when residues composing the LYCS motif were replaced by arginine independently. (K) The improved free energy of Artemis mutants was calculated using Aggrescan3D (https://biocomp.chem.uw.edu.pl/A3D2/). 
(L and M) Substitution of L59, Y60, C61 or S62 by arginine in human Artemis abolished its 3’DNA overhang (L) and DNA hairpin (M) processing activity in vitro. Raw data can be found in Supporting information (S1 Raw Images and S1 Data files). All flow cytometry data represented the mean ± SD of three biological replicates. P values were analyzed with Student t test. ns, not significant.
To further investigate whether mutations of these conserved residues occur in disease or are associated with RS-SCID, we searched the ClinVar database for SNPs of human Artemis (Fig 5C) and found that the C61R, C256Y and E324G mutations may be pathogenic (Figs 5D and S7C). In vitro nuclease assays demonstrated that mutations of C61, C256 or E324 in human Artemis abolished its endonuclease activity on both DNA hairpin and 3′ DNA overhang substrates (Fig 5E and 5F). In addition, mutation of C61, C256 or E324 in lancelet, lamprey and zebrafish Artemis also decreased their endonuclease activity on DNA hairpin substrates (Figs 5G, 5H and S7D), indicating that these residues are functionally conserved and potentially disease-related in Artemis.
Recent studies have highlighted the importance of C256 and C272 in the new zinc-binding motif of human Artemis [9,28]. Additionally, mutations of E213 and E324 have been reported to affect Artemis activity [11,12]. To elucidate the effect of the C61R mutation on Artemis function, we used structural modeling. This revealed that mutating C61 to arginine or histidine, but not to alanine or leucine, blocks the DNA binding pocket of human Artemis (Figs 5I and S7E). Using the reporter assay, we confirmed that the C61R and C61H mutations, in contrast to the C61A and C61L mutations, completely abolished the activity of human Artemis (Fig 5B, 5D and 5J). These results indicate that a positively charged residue at position 61 may affect the protein stability and DNA binding of Artemis.
Given that the substitution of R58 can affect the activity of lancelet Artemis (Fig 3G and 3H), we also replaced Y58 with arginine or glutamine in human Artemis. However, no decrease in activity was observed (Fig 5J). Since position 58 lies near C61, which has been proven to be essential, we further studied the function of residues L59, Y60 and S62, which are highly conserved in Artemis β6 during the evolution of metazoans (Fig 5A). According to the structural data of human Artemis [9,28], the residues in the LYCS motif are tightly packed within the hydrophobic core of the enzyme. Thus, to explain the effects of this LYCS motif on the stability of Artemis, we further applied Aggrescan3D for analysis [29]. As shown in Fig 5K, the calculated change in the protein’s free energy upon mutation was positive, suggesting that the mutated proteins are less stable than the wild type. Furthermore, structural simulation of these Artemis mutants was performed with SWISS-MODEL, using human Artemis (PDB: 6WNL) as the template [9]. The results revealed that the structure of β6 and the DNA binding pockets of Artemis were disrupted in these mutants (Figs 5I and S7F). GFP reporter assays and in vitro nuclease assays showed that these mutations indeed decreased the activity of human Artemis (Fig 5J, 5L and 5M). Thus, mutations of the LYCS motif may perturb the protein’s stability and consequently disrupt its DNA binding capacity, which is crucial for its function in V(D)J recombination and DNA repair.
Discussion
NHEJ is one of the prominent pathways of DNA double-strand break (DSB) repair. Artemis, a nuclease recruited by the DNA-PK complex to DSB sites in the NHEJ pathway, can process various DNA substrates at the boundaries between single-stranded and double-stranded DNA, generating compatible DNA ends and promoting DNA ligation. In addition, Artemis is indispensable for the opening of DNA hairpins in V(D)J recombination, which is the hallmark of the vertebrate immune system. Functional characterization of ProtoRAG revealed a similar mechanism between ProtoRAG transposition and V(D)J recombination, including the formation of DNA hairpins after RAGL transposase-mediated DNA nicking [5]. Similar to the RAGL transposon, which was recently found to have emerged as early as unicellular eukaryotes [6], Artemis also originates from the eukaryotic ancestor and is widely distributed in eukaryotes (Fig 1A and S1 Table). Like the repair mechanism in RAG1/2-mediated V(D)J recombination, host DNA rejoining in ProtoRAG-mediated transposition is dependent on BbeArtemis. Moreover, the activity of Artemis from lancelet, lamprey and zebrafish is dependent on DNA-PKcs, which is also widely distributed in invertebrates, fungi, plants, and protists [30,31]. These observations collectively indicate the ancient nature of the Artemis-mediated DNA repair mechanism throughout evolution.
Besides its conserved roles in DNA repair, Artemis shows several evolutionary adaptations (S8A Fig). First, as shown in S9 Fig, the majority of Artemis orthologs, including those from fungi such as DhaArtemis, possess a C-terminus, as previously reported [13]. The C-terminus of Artemis has been demonstrated to play a pivotal regulatory role during V(D)J recombination [14,32]. Cross-species analysis also indicated an inhibitory effect of the BbeArtemis C-terminus on its ex vivo endonuclease activity, underscoring the functional conservation of the C-terminus among different species. Notably, with the exception of lancelet Artemis, no repeated sequences were observed in other Artemis proteins. Since the EK repeats in BbeArtemis may affect protein stability, enzymatic activity and the splicing process, these EK repeats might play a unique role in the functional evolution of lancelet Artemis. Second, although the 3′ DNA overhang and ssDNA processing activities of Artemis are equivalent between lancelet and human, the DNA hairpin opening activity is increased in vertebrates, possibly to better ensure the efficiency of RAG-mediated V(D)J recombination. Third, lancelet Artemis shows sequence polymorphism and activity divergence. Interestingly, we could not find Q58/Q58 homozygous individuals, possibly because insufficient Artemis activity in repairing the flanking ends during ProtoRAG transposition would result in genome instability in lancelets. Thus, we assume that individual diversity of Artemis may balance genome stability against the survival of the ProtoRAG transposon in lancelets, as we have found for lancelet YY1 (Yin Yang 1). As reported, YY1 can inhibit the transcription of BbeRAG1L and BbeRAG2L and promote the repair of the host genome by interacting with the TIRs of ProtoRAG [33]. Fourth, lampreys rely on variable lymphocyte receptors (VLRs) for their adaptive immune responses, unlike the Igs and TCRs found in other vertebrates [34]. 
Previous research has identified hairpin-forming transposons, such as hAT transposons [35], within the VLRB genomic loci [36]. Whether these transposons are involved in the formation of VLR loci or the rearrangement of mature VLRs is still unknown. Since transposons such as Transib and hAT are widely distributed in eukaryotes and thought to have evolutionary relationships with RAGL transposons [35,37–39] (S8B Fig), ancient Artemis-like proteins may also contribute to hairpin opening at the flanking DNA after transposition of these hairpin-forming transposons.
Through genetic mapping and positional cloning of SCID-related genes, several genes have been confirmed to cause SCID when mutated [40]. Deficiency of Artemis can cause a radiation-sensitive type of SCID known as RS-SCID, which responds poorly to allogeneic hematopoietic cell transplantation [12]. Therefore, discovering new mutations in Artemis that affect its activity could help diagnose and treat Artemis deficiency-related SCID. Here, we discovered that some evolutionarily conserved residues, such as L31, L59, Y60, C61 and S62, are important for the endonuclease activity of human Artemis. Among them, the C61R mutation is recorded in the ClinVar database and abolishes the endonuclease activity of human Artemis, suggesting an association with RS-SCID. By analyzing the sequence near C61, we further revealed that the LYCS motif is highly conserved during evolution and is crucial for the protein stability, DNA binding and enzymatic activity of Artemis proteins. By analyzing the sequence surrounding the c.181 T > C mutation (C61R) identified in the ClinVar database, we found that it is suitable for base editing using cytosine base editors (CBEs) (S10A Fig). We further analyzed mutations in Artemis that have been demonstrated to cause SCID [27] and found that some of them may also be suitable for editing by adenine base editors (ABEs), another gene-editing tool that converts A to G (S10A Fig). Thus, revealing more functional residues of Artemis would promote the development of suitable gene-editing therapies for RS-SCID patients.
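The relationship between the c.181 T > C nucleotide change, the C61R amino acid change and the direction of cytosine base editing can be checked with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and because the text does not state which cysteine codon encodes C61 in human Artemis, both cysteine codons (TGT and TGC) are tested. Whether a usable editing window exists around the site depends on the local sequence shown in S10A Fig and is not modeled here.

```python
# Subset of the standard genetic code needed for this check.
CODON_TABLE = {"TGT": "C", "TGC": "C", "CGT": "R", "CGC": "R"}


def codon_position(cds_pos):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    codon_number = (cds_pos - 1) // 3 + 1
    offset = (cds_pos - 1) % 3 + 1
    return codon_number, offset


def substitute(codon, offset, ref, alt):
    """Replace the base at a 1-based codon position, checking the reference base."""
    if codon[offset - 1] != ref:
        raise ValueError(f"expected {ref} at position {offset} of {codon}")
    return codon[: offset - 1] + alt + codon[offset:]


# c.181 falls on the first base of codon 61.
codon_number, offset = codon_position(181)

for wild_type in ("TGT", "TGC"):  # assumption: either cysteine codon may encode C61
    mutant = substitute(wild_type, offset, "T", "C")   # pathogenic T > C change
    corrected = substitute(mutant, offset, "C", "T")   # C-to-T conversion, as made by a CBE
    print(
        f"codon {codon_number}: {wild_type} ({CODON_TABLE[wild_type]}) -> "
        f"{mutant} ({CODON_TABLE[mutant]}) -> {corrected} ({CODON_TABLE[corrected]})"
    )
```

For either cysteine codon, a T > C change at the first codon position yields an arginine codon, and a C-to-T conversion at the same position restores cysteine, which is consistent with the use of a CBE rather than an ABE for this particular mutation.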
In addition to causing RS-SCID, defects in Artemis may lead to telomere dysfunction and neoplastic transformation [41,42], which are related to carcinogenesis. Here, we searched the COSMIC database [43], an expert-curated database of somatic mutations, and found that S62 and L59, two residues identified in this study as essential for the activity of Artemis, are mutated in melanoma cells [44] and lung carcinoma cells [45–49], respectively. Additionally, Artemis is highly expressed in human rectal cancer cells, and its deficiency increases colorectal cancer cell sensitivity to DNA-damaging agents [50]. Interestingly, we also observed that higher Artemis expression was associated with lower overall survival in several cancer types, such as COAD (colon adenocarcinoma), KIRP (kidney renal papillary cell carcinoma) and LIHC (liver hepatocellular carcinoma) (S10B–S10E Fig). Given its strong association with carcinogenesis and its multiple functions in the HR, NHEJ, and MMEJ repair pathways [51], targeting Artemis in cancer therapy appears to be a rational approach for increasing the efficacy of radiotherapy or chemotherapy. Inhibitors of Artemis identified by pharmacological assays have been reported to decrease the proliferation of patient-derived primary acute lymphoblastic leukemia cells [52]. More evidence from structure-function studies, assisted by computer virtual screening and pharmacological validation, may help to discover novel inhibitors for clinical usage.
Materials and methods
Animals, cells and antibodies
Chinese lancelets (B. belcheri) were captured from the shore near Zhanjiang city, China. Japanese lampreys (L. japonica) were captured from the Songhua River in Jilin province. TU strain zebrafish (D. rerio) were cultured in indoor recirculating aquaculture systems.
The 293T cell lines from ATCC were maintained in DMEM (Gibco) supplemented with 10% FBS at 37 °C under 5% CO2. Transfections were performed using jetPRIME (cat.: 114–15, PolyPlus-transfection Bioparc.) according to the manufacturer’s instructions.
Antibody reagents were purchased from the indicated manufacturers and diluted to appropriate concentrations for blotting analysis: Anti-Flag (1:5,000, cat.: 66008-2-Ig, Proteintech), Anti-β actin (1:10,000, cat.: 60008-1-Ig, Proteintech), Anti-GAPDH (1:50,000, cat.: 10494-1-AP, Proteintech), Anti-Artemis (1:200, cat.: 13381, CST), Anti-Phosphoserine (1:125, cat.: ab9332, Abcam), DNA-PKcs mouse mAb (1:1,000, cat.: 12311S, CST) and Anti-DNA-PKcs (phospho S2056) rabbit mAb (1:5,000, cat.: ab124918, Abcam).
Gene knockout by CRISPR/Cas9 technology
HEK-293T cells were transfected with px458 sgRNA expression plasmids using jetPRIME. At 24 h after transfection, GFP-positive single-cell clones were sorted using a BD FACSAria II cell sorter. Single-cell clones were then cultured for about 2 weeks and verified by western blotting with Artemis rabbit mAb or DNA-PKcs (also known as PRKDC) mouse mAb. The HsaArtemis_sgRNA and HsaDNA-PKcs_sgRNA sequences used are presented in S2 Table.
Fluorescent reporter assays
293T or 293TArtemis−/− cells were cultured in 24-well plates and transfected with the indicated substrate vectors (pTIRG8, pTIRG8-ivt or pCJ) and the indicated protein expression vectors per well using jetPRIME. At 4 h after transfection, 1 μM NU7441 (cat.: S2638, Selleck) was added as indicated. After 48 h, cells were digested with trypsin and collected by centrifugation, then washed once with PBS and resuspended in PBS. GFP expression was analyzed using a Beckman FACS Calibur. To detect HDJ, TTJ and TIR ends, plasmids were recovered and analyzed by PCR. Primers used for detection of HDJ, TTJ and TIR ends are presented in S2 Table.
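The figure legends report flow cytometry results as the mean ± SD of three biological replicates, compared with Student t test. As a minimal sketch of that summary, the Python snippet below computes the mean, the sample SD and an equal-variance t statistic for two groups. The GFP-positive percentages are hypothetical placeholders, not data from this study, and the two-sided, equal-variance form of the test is an assumption, since the text does not specify which variant was used.

```python
from math import sqrt
from statistics import mean, stdev


def student_t(group_a, group_b):
    """Equal-variance (pooled) two-sample t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2
                  + (n_b - 1) * stdev(group_b) ** 2) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical GFP-positive percentages from three biological replicates each.
wild_type = [30.0, 32.0, 34.0]
mutant = [10.0, 11.0, 12.0]

for name, values in (("wild type", wild_type), ("mutant", mutant)):
    print(f"{name}: {mean(values):.1f} ± {stdev(values):.1f} (mean ± SD, n={len(values)})")

t_stat, df = student_t(wild_type, mutant)

# 2.776 is the two-sided 5% critical value of the t distribution with 4 degrees
# of freedom, which is the case for two groups of three replicates.
CRITICAL_T_DF4 = 2.776
print(f"t = {t_stat:.2f}, df = {df}, significant at 0.05: {abs(t_stat) > CRITICAL_T_DF4}")
```

With only three replicates per group, the test has four degrees of freedom, so a difference must be large relative to the replicate scatter to reach significance; an exact P value would require the t distribution from a statistics package.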
Gene cloning and plasmid construction
Artemis orthologs were identified in the lancelet (B. belcheri, Bbe), zebrafish (D. rerio, Dre) and Japanese lamprey (L. japonica, Lja) genomes. Based on these sequences, BbeArtemis, LjaArtemis, DreArtemis and HsaArtemis were cloned from their cDNAs using the indicated primers. The primer pairs used are shown in S2 Table. All Artemis genes were cloned into the pEZ-3× Flag vector (cat.: EX-NEG-M12, iGene Biotechnology) for protein expression. Artemis gene mutants were prepared using the QuikChange Lightning Multi Site-Directed Mutagenesis Kit (cat.: #210513, Agilent).
Bacterial colony assay
The substrate vector pTIR104 was co-transfected with the BbeRAG1L, BbeRAG2L and Artemis expression vectors into 293TArtemis−/− cells seeded in a six-well culture plate. After 48 h, the recombinant plasmids were extracted from the transfected cells by the alkaline lysis method and transformed into Escherichia coli DH5α. LB plates containing kanamycin (50 μg/mL) and chloramphenicol (20 μg/mL) were used to select positive clones. These positive clones were then analyzed by Sanger sequencing. The primer used for sequencing is presented in S2 Table.
Protein purification
For Flag-tagged Artemis expression and purification, the pEZ-Artemis plasmids were transfected into 293T cells. The cells were collected 24 h after transfection with the indicated vectors and washed twice with pre-chilled PBS. Cells were lysed with lysis buffer (1% TritonX-100, 25 mM Tris (pH 8.0), 500 mM NaCl, 2 mM EDTA and protease inhibitor cocktail (cat.: 4693132001, Roche)). After centrifugation at 12,000 g for 10 min at 4 °C, the supernatants of the cell lysates were immunoprecipitated with anti-Flag resins (cat.: A2220, Sigma-Aldrich) for 2 h at 4 °C. The resins were gently washed five times with wash buffer (1% TritonX-100, 25 mM Tris (pH 8.0), 500 mM NaCl, 2 mM EDTA and 1 mM PMSF). Samples were then eluted with elution buffer (25 mM Tris (pH 8.0), 500 mM NaCl, 1 mM DTT) containing 5 μg/μl 3 × Flag peptides (cat.: F4799, Sigma-Aldrich) according to the manufacturer’s instructions and dialyzed against dialysis buffer (25 mM Tris (pH 7.5), 150 mM KCl, 10% glycerol) by ultrafiltration. The concentration of the purified protein was quantified using the Detergent Compatible Bradford Protein Assay Kit (cat.: P0006C-1, Beyotime Biotechnology), and the protein was stored at −80 °C.
In vitro cleavage assay
The DNA cleavage substrates [1] were labeled with Cy5 fluorophores (S2 Table). The 10 μl cleavage reactions, containing reaction buffer (25 mM Tris (pH 7.0), 10 mM KCl, 10 mM MnCl2, 0.25 mM ATP, 0.5 mg BSA, 1 mM DTT), 100 nM DNA substrates and 150 nM Artemis proteins, were incubated at 37 °C for 2 h (DNA hairpin) or 30 min (ssDNA and DNA overhangs). The reactions were stopped by adding 10 μl stop buffer (98% formamide and 2% EDTA) and incubating at 100 °C for 5 min. The reactions were then loaded onto 20% urea TBE (Tris-borate-EDTA) acrylamide gels. After electrophoresis, the gels were imaged using Bio-Rad QualityOne.
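The stated concentrations and volumes translate directly into molar amounts per reaction. The short Python sketch below, with hypothetical helper names, converts the 100 nM substrate and 150 nM protein in a 10 μl reaction into picomoles and gives the substrate concentration after the 10 μl of stop buffer is added. It is only a unit-conversion aid and assumes that the volumes are additive.

```python
def picomoles(conc_nM, volume_ul):
    """Amount in pmol: nM x ul gives femtomoles, so divide by 1000."""
    return conc_nM * volume_ul / 1000.0


def diluted_conc(conc_nM, volume_ul, added_ul):
    """Concentration after adding a volume of diluent (volumes assumed additive)."""
    return conc_nM * volume_ul / (volume_ul + added_ul)


REACTION_UL = 10.0   # cleavage reaction volume
STOP_UL = 10.0       # stop buffer volume
SUBSTRATE_NM = 100.0
PROTEIN_NM = 150.0

substrate_pmol = picomoles(SUBSTRATE_NM, REACTION_UL)
protein_pmol = picomoles(PROTEIN_NM, REACTION_UL)

print(f"DNA substrate per reaction: {substrate_pmol:.1f} pmol")
print(f"Artemis protein per reaction: {protein_pmol:.1f} pmol")
print(f"Protein:substrate molar ratio: {protein_pmol / substrate_pmol:.1f}")
print(f"Substrate after stop buffer: {diluted_conc(SUBSTRATE_NM, REACTION_UL, STOP_UL):.0f} nM")
```

Each reaction therefore contains 1 pmol of labeled substrate and 1.5 pmol of protein, a 1.5-fold molar excess of enzyme over substrate.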
Structural prediction of human Artemis
Structures of human Artemis mutants were modeled by submitting the corresponding sequences to the SWISS-MODEL server (https://swissmodel.expasy.org/), utilizing the human Artemis X-ray crystal structure (PDB: 6WNL) as a reference template. The generated models underwent energy minimization in Discovery Studio to refine their structures. Subsequently, electrostatic potential calculations and structural visualizations were conducted using PyMOL, using its default settings.
Phylogenetic analysis of SNM proteins
The Bayesian tree of SNM proteins was constructed with MrBayes3.2.1 using the amino acid sequences aligned by MAFFT (https://mafft.cbrc.jp/alignment/software/). Sequences were obtained from NCBI (S1 Table). The numbers at the nodes indicated the posterior probability.
Supporting information
S1 Fig. Phylogenetic analysis of Artemis, SNM1A and SNM1B across eukaryotes.
Species abbreviation was shown in S1 Table.
https://doi.org/10.1371/journal.pbio.3003056.s001
(TIF)
S2 Fig. Lancelet Artemis participates in DNA hairpin resolution.
(A) Schematic diagrams of the GFP reporter assays and PCR assays used to measure BbeRAG1L/2L-mediated TIR excision and recombination. TTJ indicates the TIR-TIR joint, which is formed when the TIR ends are ligated directly. (B) Artemis deficiency slightly decreased the formation of TTJ in 293T cells. (C) TIR-TIR joints were formed in 293TArtemis−/− cells. TTJ, TIR-TIR joint. (D) The protein sequence identity of seven lancelet Artemis. R, repeat sequence composed of glutamic acid (E) and lysine (K) in the C-terminus of lancelet Artemis. FL, full length. aa, amino acid. The numbers in jade-green rectangles represent the percentage of protein sequence identity, excluding the repeated sequence. (E) Diagram showing domains of BbeArtemisL1, BbeArtemisL2 and BbeArtemisL6 proteins. MBL, metallo-β-lactamase domain. R, the EK repeated sequences. (F) Only one Artemis gene copy is present on chromosome 5 of Branchiostoma floridae. (G) Schematic diagrams of the GFP reporter assay mediated by mouse RAG1/2 in 293TArtemis−/− cells. (H) BbeArtemisL2 and BbeArtemisL6 were much more effective than BbeArtemisL1 in promoting CJ formation in 293TArtemis−/− cells. (I) HsaArtemis added palindromic nucleotides in the HDJ products in 293TArtemis−/− cells. P, palindromic nucleotides. Plus and minus signs indicate the numbers of nucleotides added or cleaved in the products. (J) BbeArtemisL6 added palindromic nucleotides in the CJ products in 293TArtemis−/− cells. (K) Purified water was the most favorable buffer for 20 bp hairpin formation during in vitro annealing. NEBuffer 1.1–3.1 and CutSmart buffer were from New England Biolabs (NEB). (L) SDS-PAGE and Coomassie brilliant blue staining were used to verify the high purity of lancelet Artemis proteins. CBB, Coomassie brilliant blue. Raw data can be found in Supporting information (S1 Raw Images and S1 Data files).
https://doi.org/10.1371/journal.pbio.3003056.s002
(TIF)
S3 Fig. Schematic representation of the two lancelet Artemis alleles and their splicing isoforms from different individuals.
(A) BbeArtemis contains 15 exons. (B–F) Two lancelet Artemis alleles and their splicing isoforms were cloned from five individuals. FS, frameshift. The two lancelet Artemis alleles are presented in different colors. Numbers in red indicate the number of clones obtained for each sequence. (G) BbeArtemisL2_ΔR exhibited higher expression levels compared to BbeArtemisL2, while BbeArtemisL6_ΔR showed lower expression levels compared to BbeArtemisL6. Raw data can be found in Supporting information (S1 Raw Images).
https://doi.org/10.1371/journal.pbio.3003056.s003
(TIF)
S4 Fig. A splice variant of allele G utilized the “AG” dinucleotides in the repeat sequence as the splice acceptor site.
The splicing variant exhibited a deletion spanning the entirety of exons 12–14 and extending into exon 15, resulting in partial loss of exon 15’s coding sequence.
https://doi.org/10.1371/journal.pbio.3003056.s004
(PDF)
S5 Fig. Multiple sequence alignment of Artemis from different lancelet individuals.
Conserved residues were colored in red. Figure is shown by ESPript 3.0.
https://doi.org/10.1371/journal.pbio.3003056.s005
(PDF)
S6 Fig. Lamprey and zebrafish Artemis open DNA substrates dependent on DNA-PKcs ex vivo.
(A) Residues that differ among lamprey Artemis proteins derived from nine lamprey individuals. The red numbers represent the number of lampreys that share the same Artemis protein sequence. Red residues represent amino acids that differ from LjaArtemisL5. (B) Residues that differ among zebrafish Artemis proteins derived from eight zebrafish. The red numbers represent the number of zebrafish individuals that share the same Artemis protein sequence. Red residues indicate the amino acids that differ among the four DreArtemis proteins. (C and D) LjaArtemisL1 (C) and DreArtemisL1 (D) added palindromic nucleotides in the HDJ products in 293TArtemis−/− cells. P, palindromic nucleotides. N, non-template nucleotide addition. Plus and minus signs indicate the number of nucleotides added and cleaved in the putative products, respectively. (E) Vertebrate Artemis proteins were much more effective than those of invertebrates, in a DNA-PKcs-dependent manner, in HDJ formation in 293TArtemis−/− cells. NU7441, a specific inhibitor of DNA-PKcs. (F) ABCDE autophosphorylation sites and Artemis interaction sites of DNA-PKcs are evolutionarily conserved. Accession numbers of DNA-PKcs protein sequences are presented in S1 Table. Raw data can be found in Supporting information (S1 Data files). All flow cytometry data represent the mean ± SD of three biological replicates. P-values were analyzed with Student t test. ns, not significant.
https://doi.org/10.1371/journal.pbio.3003056.s006
(TIF)
S7 Fig. Evolutionarily conserved and functional amino acids in Artemis.
(A) The expression levels of HsaArtemis mutants were comparable with that of wild-type human Artemis in 293T cells. (B) LjaArtemis mutants showed decreased activity in CJ formation in 293TArtemis−/− cells. (C) Replacement of evolutionarily conserved residues in human Artemis does not decrease its expression in 293T cells. (D) The activity of DreArtemisL1 mutants was decreased in the formation of CJ following cleavage by MmuRAG1/2. (E) The predicted structures of the C61A, C61H and C61L mutants of HsaArtemis showed that only the binding pocket of the C61H mutant was blocked. (F) The predicted structures of wild type, L59R, Y60R, C61R and S62R mutants of HsaArtemis showed that the β6 sheets were disrupted in these mutants. Raw data can be found in Supporting information (S1 Raw Images and S1 Data files).
https://doi.org/10.1371/journal.pbio.3003056.s007
(TIF)
S8 Fig. Correlation of Artemis and RAG.
(A) Distribution of RAG1L, RAG2L and Artemis across eukaryotes. The increased DNA hairpin opening activity and limited gene diversity of Artemis in vertebrates suggested its adaptation to the decreased transposase activity but increased recombinase activity of RAG. The species tree was inferred by TimeTree (http://www.timetree.org/). (B) Distribution of Artemis, hAT, Transib and DNA-PKcs in eukaryotes.
https://doi.org/10.1371/journal.pbio.3003056.s008
(TIF)
S9 Fig. Multiple sequence alignment of Artemis revealed that most of the Artemis proteins possess a C-terminus.
Conserved residues were colored in red. Figure is shown by ESPript 3.0. Species abbreviation was shown in S1 Table.
https://doi.org/10.1371/journal.pbio.3003056.s009
(PDF)
S10 Fig. Clinical implication of Artemis.
(A) Mutations in Artemis that cause RS-SCID could potentially be corrected by CBEs and ABEs. Adenines and cytosines in red indicate mutations from guanine to adenine and from thymine to cytosine, respectively, within the human Artemis gene in RS-SCID patients. (B) Artemis expression levels in different cancer types from the TCGA database analyzed by the TIMER web platform (http://cistrome.dfci.harvard.edu/TIMER/). (*p < 0.05, ** p < 0.01, *** p < 0.001). (C–E) Kaplan-Meier survival curves of COAD (C), KIRP (D) and LIHC (E) with high and low Artemis expression analyzed by the GEPIA web platform (http://gepia2.cancer-pku.cn/#analysis). COAD, colon adenocarcinoma; KIRP, kidney renal papillary cell carcinoma; LIHC, liver hepatocellular carcinoma.
https://doi.org/10.1371/journal.pbio.3003056.s010
(TIF)
S1 Table. Species abbreviation and intron phase composition of Artemis.
https://doi.org/10.1371/journal.pbio.3003056.s011
(XLSX)
S2 Table. Primers used for gene cloning and various DNA substrates preparation.
https://doi.org/10.1371/journal.pbio.3003056.s012
(XLSX)
S1 Data. Excel spreadsheet containing, in separate sheets, the underlying numerical data for Figs 2C, 2G, 3C, 3D, 3G–3J, 4B, 4C, 4H, 4L, 5B, 5D, 5H, 5J, S2B, S2H, S6E, S7B and S7D.
https://doi.org/10.1371/journal.pbio.3003056.s013
(XLSX)
S1 Raw Images. Unprocessed original images for Figs 2B, 2D, 2G, 2J–2L, 3G, 3H, 4D–4G, 4I–4K, 5E–5G, 5L, 5M, S2C, S2L, S2K, S3G and S7A–S7D.
‘×’ indicates irrelevant sample.
https://doi.org/10.1371/journal.pbio.3003056.s014
(PDF)
Acknowledgments
We thank Dr. Pierre Pontarotti for his comments on the evolution of Artemis in this study.
References
- 1. Ma Y, Pannicke U, Schwarz K, Lieber MR. Hairpin opening and overhang processing by an Artemis/DNA-dependent protein kinase complex in nonhomologous end joining and V(D)J recombination. Cell. 2002;108(6):781–94. pmid:11955432
- 2. Rooney S, Sekiguchi J, Zhu C, Cheng HL, Manis J, Whitlow S, et al. Leaky Scid phenotype associated with defective V(D)J coding end processing in Artemis-deficient mice. Mol Cell. 2002;10(6):1379–90. pmid:12504013
- 3. Sakano H, Hüppi K, Heinrich G, Tonegawa S. Sequences at the somatic recombination sites of immunoglobulin light-chain genes. Nature. 1979;280(5720):288–94. pmid:111144
- 4. Agrawal A, Eastman QM, Schatz DG. Transposition mediated by RAG1 and RAG2 and its implications for the evolution of the immune system. Nature. 1998;394(6695):744–51. pmid:9723614
- 5. Huang S, Tao X, Yuan S, Zhang Y, Li P, Beilinson HA, et al. Discovery of an active RAG transposon illuminates the origins of V(D)J recombination. Cell. 2016;166(1):102–14. pmid:27293192
- 6. Tao X, Huang Z, Chen F, Wang X, Zheng T, Yuan S. The RAG key to vertebrate adaptive immunity descended directly from a bacterial ancestor. Natl Sci Rev. 2022;9(8):nwac073. https://doi.org/10.1093/nsr/nwac073 pmid:36060303
- 7. Poinsignon C, Moshous D, Callebaut I, de Chasseval R, Villey I, de Villartay J-P. The metallo-beta-lactamase/beta-CASP domain of Artemis constitutes the catalytic core for V(D)J recombination. J Exp Med. 2004;199(3):315–21. pmid:14744996
- 8. Yan Y, Akhter S, Zhang X, Legerski R. The multifunctional SNM1 gene family: not just nucleases. Future Oncol. 2010;6(6):1015–29. pmid:20528238
- 9. Karim MF, Liu S, Laciak AR, Volk L, Koszelak-Rosenblum M, Lieber MR, et al. Structural analysis of the catalytic domain of Artemis endonuclease/SNM1C reveals distinct structural features. J Biol Chem. 2020;295(35):12368–77. pmid:32576658
- 10. Niewolik D, Schwarz K. Physical ARTEMIS:DNA-PKcs interaction is necessary for V(D)J recombination. Nucleic Acids Res. 2022;50(4):2096–110. pmid:35150269
- 11. de Villartay J-P, Shimazaki N, Charbonnier J-B, Fischer A, Mornon J-P, Lieber MR, et al. A histidine in the beta-CASP domain of Artemis is critical for its full in vitro and in vivo functions. DNA Repair (Amst). 2009;8(2):202–8. pmid:19022407
- 12. Schuetz C, Neven B, Dvorak CC, Leroy S, Ege MJ, Pannicke U, et al. SCID patients with ARTEMIS vs RAG deficiencies following HCT: increased risk of late toxicity in ARTEMIS-deficient SCID. Blood. 2014;123(2):281–9. pmid:24144642
- 13. Bonatto D, Brendel M, Henriques JAP. In silico identification and analysis of new Artemis/Artemis-like sequences from fungal and metazoan species. Protein J. 2005;24(6):399–411. pmid:16323046
- 14. Niewolik D, Peter I, Butscher C, Schwarz K. Autoinhibition of the nuclease ARTEMIS Is mediated by a physical interaction between its catalytic and C-terminal domains. J Biol Chem. 2017;292(8):3351–65. pmid:28082683
- 15. Phung DK, Rinaldi D, Langendijk-Genevaux PS, Quentin Y, Carpousis AJ, Clouet-d’Orval B. Archaeal β-CASP ribonucleases of the aCPSF1 family are orthologs of the eukaryal CPSF-73 factor. Nucleic Acids Res. 2013;41(2):1091–103. pmid:23222134
- 16. Lenain C, Bauwens S, Amiard S, Brunori M, Giraud-Panis M-J, Gilson E. The Apollo 5′ exonuclease functions together with TRF2 to protect telomeres from DNA repair. Curr Biol. 2006;16(13):1303–10. pmid:16730175
- 17. Freibaum BD, Counter CM. hSnm1B is a novel telomere-associated protein. J Biol Chem. 2006;281(22):15033–6. pmid:16606622
- 18. Sánchez D, Ganfornina MD, Gutiérrez G, Marín A. Exon-intron structure and evolution of the Lipocalin gene family. Mol Biol Evol. 2003;20(5):775–83. pmid:12679526
- 19. Huang S, Chen Z, Yan X, Yu T, Huang G, Yan Q, et al. Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes. Nat Commun. 2014;5:5896. pmid:25523484
- 20. Yang Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 1997;13(5):555–6. pmid:9367129
- 21. Demogines A, East AM, Lee J-H, Grossman SR, Sabeti PC, Paull TT, et al. Ancient and recent adaptive evolution of primate non-homologous end joining genes. PLoS Genet. 2010;6(10):e1001169. pmid:20975951
- 22. Liu L, Chen X, Li J, Wang H, Buehl CJ, Goff NJ, et al. Autophosphorylation transforms DNA-PK from protecting to processing DNA ends. Mol Cell. 2022;82(1):177-189.e4. pmid:34936881
- 23. Watanabe G, Lieber MR, Williams DR. Structural analysis of the basal state of the Artemis:DNA-PKcs complex. Nucleic Acids Res. 2022;50(13):7697–720. pmid:35801871
- 24. Noordzij JG, Verkaik NS, van der Burg M, van Veelen LR, de Bruin-Versteeg S, Wiegant W, et al. Radiosensitive SCID patients with Artemis gene mutations show a complete B-cell differentiation arrest at the pre-B-cell receptor checkpoint in bone marrow. Blood. 2003;101(4):1446–52. pmid:12406895
- 25. Moshous D, Pannetier C, de Chasseval R, le Deist F, Cavazzana-Calvo M, Romana S, et al. Partial T and B lymphocyte immunodeficiency and predisposition to lymphoma in patients with hypomorphic mutations in Artemis. J Clin Invest. 2003;111(3):381–7. pmid:12569164
- 26. Kobayashi N, Agematsu K, Sugita K, Sako M, Nonoyama S, Yachie A, et al. Novel Artemis gene mutations of radiosensitive severe combined immunodeficiency in Japanese families. Hum Genet. 2003;112(4):348–52. pmid:12592555
- 27. Felgentreff K, Lee YN, Frugoni F, Du L, van der Burg M, Giliani S, et al. Functional analysis of naturally occurring DCLRE1C mutations and correlation with the clinical phenotype of ARTEMIS deficiency. J Allergy Clin Immunol. 2015;136(1):140-150.e7. pmid:25917813
- 28. Yosaatmadja Y, Baddock HT, Newman JA, Bielinski M, Gavard AE, Mukhopadhyay SMM, et al. Structural and mechanistic insights into the Artemis endonuclease and strategies for its inhibition. Nucleic Acids Res. 2021;49(16):9310–26. pmid:34387696
- 29. Kuriata A, Iglesias V, Pujols J, Kurcinski M, Kmiecik S, Ventura S. Aggrescan3D (A3D) 2.0: prediction and engineering of protein solubility. Nucleic Acids Res. 2019;47(W1):W300–7. pmid:31049593
- 30. Lees-Miller JP, Cobban A, Katsonis P, Bacolla A, Tsutakawa SE, Hammel M, et al. Uncovering DNA-PKcs ancient phylogeny, unique sequence motifs and insights for human disease. Prog Biophys Mol Biol. 2021;163:87–108. pmid:33035590
- 31. Elías-Villalobos A, Fort P, Helmlinger D. New insights into the evolutionary conservation of the sole PIKK pseudokinase Tra1/TRRAP. Biochem Soc Trans. 2019;47(6):1597–608. pmid:31769470
- 32. Malu S, De Ioannes P, Kozlov M, Greene M, Francis D, Hanna M, et al. Artemis C-terminal region facilitates V(D)J recombination through its interactions with DNA Ligase IV and DNA-PKcs. J Exp Med. 2012;209(5):955–63. pmid:22529269
- 33. Liu S, Yuan S, Gao X, Tao X, Yu W, Li X. Functional regulation of an ancestral RAG transposon ProtoRAG by a trans-acting factor YY1 in lancelet. Nat Commun. 2020;11(1):4515. pmid:32908127
- 34. Pancer Z, Amemiya CT, Ehrhardt GRA, Ceitlin J, Gartland GL, Cooper MD. Somatic diversification of variable lymphocyte receptors in the agnathan sea lamprey. Nature. 2004;430(6996):174–80. pmid:15241406
- 35. Zhou L, Mitra R, Atkinson PW, Hickman AB, Dyda F, Craig NL. Transposition of hAT elements links transposable elements and V(D)J recombination. Nature. 2004;432(7020):995–1001. pmid:15616554
- 36. Das S, Rast J, Li J, Kadota M, Donald J, Kuraku S. Evolution of variable lymphocyte receptor B antibody loci in jawless vertebrates. Proc Natl Acad Sci U S A. 2021;118(50):e2116522118. pmid:34880135
- 37. Hencken CG, Li X, Craig NL. Functional characterization of an active Rag-like transposase. Nat Struct Mol Biol. 2012;19(8):834–6. pmid:22773102
- 38. Feschotte C, Pritham EJ. DNA transposons and the evolution of eukaryotic genomes. Annu Rev Genet. 2007;41:331–68. pmid:18076328
- 39. Kapitonov VV, Jurka J. RAG1 core and V(D)J recombination signal sequences were derived from Transib transposons. PLoS Biol. 2005;3(6):e181. pmid:15898832
- 40. Picard C, Bobby Gaspar H, Al-Herz W, Bousfiha A, Casanova J-L, Chatila T, et al. International Union of Immunological Societies: 2017 Primary Immunodeficiency Diseases Committee report on inborn errors of immunity. J Clin Immunol. 2018;38(1):96–128. pmid:29226302
- 41. Dudásová Z, Chovanec M. Artemis, a novel guardian of the genome. Neoplasma. 2003;50(5):311–8. pmid:14628082
- 42. Yasaei H, Slijepcevic P. Defective Artemis causes mild telomere dysfunction. Genome Integr. 2010;1(1):3. pmid:20678254
- 43. Tate JG, Bamford S, Jubb HC, Sondka Z, Beare DM, Bindal N, et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 2019;47(D1):D941–7. pmid:30371878
- 44. Hintzsche JD, Gorden NT, Amato CM, Kim J, Wuensch KE, Robinson SE, et al. Whole-exome sequencing identifies recurrent SF3B1 R625 mutation and comutation of NF1 and KIT in mucosal melanoma. Melanoma Res. 2017;27(3):189–99. pmid:28296713
- 45. Campbell JD, Alexandrov A, Kim J, Wala J, Berger AH, Pedamallu CS, et al. Distinct patterns of somatic genome alterations in lung adenocarcinomas and squamous cell carcinomas. Nat Genet. 2016;48(6):607–16. pmid:27158780
- 46. ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. Nature. 2020;578(7793):82–93. pmid:32025007
- 47. Satpathy S, Krug K, Jean Beltran PM, Savage SR, Petralia F, Kumar-Sinha C, et al. A proteogenomic portrait of lung squamous cell carcinoma. Cell. 2021;184(16):4348-4371.e40. https://doi.org/10.1016/j.cell.2021.07.016 pmid:34358469
- 48. Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2(5):401–4. pmid:22588877
- 49. Gao J, Aksoy B, Dogrusoz U, Dresdner G, Gross B, Sumer S, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal. 2013;6(269):pl1. https://doi.org/10.1126/scisignal.2004088 pmid:23550210
- 50. Liu H, Wang X, Huang A, Gao H, Sun Y, Jiang T, et al. Silencing Artemis enhances colorectal cancer cell sensitivity to DNA-damaging agents. Oncol Res. 2018;27(1):29–38. pmid:29426373
- 51. Moscariello M, Wieloch R, Kurosawa A, Li F, Adachi N, Mladenov E, et al. Role for Artemis nuclease in the repair of radiation-induced DNA double strand breaks by alternative end joining. DNA Repair (Amst). 2015;31:29–40. pmid:25973742
- 52. Ogana HA, Hurwitz S, Hsieh C-L, Geng H, Müschen M, Bhojwani D, et al. Artemis inhibition as a therapeutic strategy for acute lymphoblastic leukemia. Front Cell Dev Biol. 2023;11:1134121. pmid:37082620