I-PfoP3I: A Novel Nicking HNH Homing Endonuclease Encoded in the Group I Intron of the DNA Polymerase Gene in Phormidium foveolarum Phage Pf-WMP3

Homing endonucleases encoded in a group I self-splicing intron in a protein-coding gene have not been reported in cyanophage genomes; only some free-standing homing endonucleases are known. In this study, we describe a nicking DNA endonuclease, I-PfoP3I, encoded in a group IA2 intron in the DNA polymerase gene of the T7-like cyanophage Pf-WMP3, which infects the freshwater cyanobacterium Phormidium foveolarum. The Pf-WMP3 intron splices efficiently in vivo and self-splices in vitro during transcription. I-PfoP3I belongs to the HNH family and has an unconventional C-terminal HNH motif. In vitro, I-PfoP3I nicks the intron-minus Pf-WMP3 DNA polymerase gene more efficiently than the Pf-WMP4 DNA polymerase gene, which lacks any intervening sequence, indicating the variable capacity of I-PfoP3I. I-PfoP3I cleaves 4 nt upstream of the intron insertion site on the coding strand of EXON 1 in both the intron-minus Pf-WMP3 and the Pf-WMP4 DNA polymerase genes. Using an in vitro cleavage assay and scanning deletion mutants of the intronless target site, the minimal recognition site was determined to be a 14 bp region downstream of the cut site. I-PfoP3I requires Mg2+, Ca2+ or Mn2+ for nicking activity. Phylogenetic analysis suggests that the intron and homing endonuclease gene elements might have been inserted into the Pf-WMP3 genome individually after differentiation from Pf-WMP4. To our knowledge, this is the first report of a group I self-splicing intron encoding a functional homing endonuclease in a protein-coding gene in a cyanophage genome.


Introduction
Group I introns are self-splicing RNA sequences that are inserted into genes of a diverse range of bacteriophages of Gram-negative bacteria, Gram-positive bacteria and cyanobacteria. Most introns have been encountered in phages of the Myoviridae or Siphoviridae families, such as Escherichia coli phage T4 [1], Bacillus subtilis phage SPO1 [2], marine cyanomyovirus S-PM2 [3,4,5] or Xanthomonas campestris phage phiL7 [6]. Although group I introns were first described in the T7-like enteric bacteria phages WI and W31 (Podoviridae family) in 2004 [7], they have not been widely reported in T7-like phages since, especially in T7-like phages of cyanobacteria.
Many group I introns contain reading frames encoding a homing endonuclease gene (HEG), which is described as a selfish genetic element [8]. Homing endonucleases (HEases) cleave one strand (nick) or both strands (double-strand break, DSB) at or close to the intron insertion site (IIS), generating strand breaks in homologous alleles that lack the intervening sequence (IVS). Subsequently, the strand breaks are repaired by homologous recombination using the allele that contains the HEG as a template [9]. As a result, group I introns are transferred into a new site [10]. However, unlike typical intron-encoded HEases, I-HmuI and I-HmuII can cleave both intron-plus and intronless versions of their cognate genes. These two nicking HEases are encoded in group I introns in the DNA polymerase genes of B. subtilis phages SPO1 and SP82 [11].
HEases are divided into five families based on conserved nuclease active-site core motifs, catalytic mechanisms, biological distributions and wider relationships to non-homing nuclease systems. These are the LAGLIDADG, HNH, His-Cys box and GIY-YIG families, and the PD-(D/E)-XK family, which is represented by one HEase, I-Ssp6803I [12,13,14]. HEases recognize extremely specific target sites. Cleavage by HEases is therefore rare, which makes them candidates for genome engineering and gene therapy through highly efficient gene targeting in mammalian cells [15,16]. However, HEases tolerate a variety of sequence variations within their recognition sequences [9].
Pf-WMP3 and Pf-WMP4 are two closely related T7-like cyanophages that infect the freshwater cyanobacterium P. foveolarum and were isolated from Lake Weiming [17,18]. In this article, we report the identification of a group I intron in the DNA polymerase (DNAP) gene of Pf-WMP3. This intron was initially found by DNA sequencing as an IVS (Figure 1). The intron was spliced in vivo and in vitro and inhibited growth when expressed in E. coli. A fully functional HEase (denoted I-PfoP3I) of the HNH family is encoded in this intron. I-PfoP3I nicked the intron-minus Pf-WMP3 DNAP gene and the Pf-WMP4 DNAP gene, which does not contain any intervening sequence, indicating the variable capacity of I-PfoP3I. The recognition site of I-PfoP3I covers base pairs 2 to 15 downstream of the cut site on the coding strand. The endonuclease activity of I-PfoP3I purified from E. coli was independent of any added divalent cations, but apo-I-PfoP3I (I-PfoP3I without metal co-factors) required one of the metal ions Mg2+, Ca2+ or Mn2+ to resume endonuclease activity. Phylogenetic analysis suggests that the intron and the HEG elements might have been inserted into the Pf-WMP3 genome individually after differentiation from Pf-WMP4.

Results
In vivo Splicing of the Group I Intron in Pf-WMP3 DNAP Gene
The intron of the Pf-WMP3 DNAP gene was tested for in vivo splicing from the primary transcript by nonquantitative RT-PCR. The specific primers P3f1 and P3r1 gave products of 850 bp from genomic DNA (Figure 1). As shown in Figure 2A, RNA isolated from cells contained both unspliced and spliced DNAP mRNAs. Additionally, the RT-PCR product was cloned into the pEASY-T1 cloning vector (TRANS) and sequenced (Figure 2B). Comparison with the DNAP gene of Pf-WMP3 confirmed that the predicted location of this intron was correct (Figure S1). There is no inverted repeat in the terminal regions of the intron sequence, suggesting that the IVS is not an insertion element or a transposon [19]. The IVS was also shown to be removable from the Pf-WMP3 DNAP precursor mRNA that was being translated in E. coli (BL21/pETP3DNAP[+int], data not shown).

In vitro Splicing of the Group I Intron in Pf-WMP3 DNAP Gene
To examine whether the intron can be excised in vitro from the primary transcript, the intron sequence including flanking exons was cloned into pET-28a(+) in the proper orientation, placing the DNAP[+int] gene under the control of the T7 promoter. RNA was transcribed with T7 RNA polymerase from XhoI-linearized plasmid pETP3DNAP[+int] (Materials and methods). This produced an RNA transcript with a 709-base 5′ exon, a 672-base intron and a 780-base 3′ exon. PCR was done using cDNA (from RNA generated by in vitro transcription) with the P3f3 and P3r2 primers, which recognize the flanking exons (Figure 1). The PCR product was 312 bp in size, corresponding to the size of the ligated exons (Figure 2C). This result indicated that the intron was able to self-splice in vitro during transcription [20,21]. Additionally, sequencing of the 312-bp PCR product confirmed correct exon ligation (data not shown).
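The expected product sizes follow from simple arithmetic on the lengths reported in this article. The sketch below is illustrative only: the function name `amplicon_sizes` is hypothetical, and it assumes that the flank lengths reported for primers P3f3 and P3r2 in the in vitro transcription and splicing assay (184 bp upstream and 128 bp downstream of the intron) are measured to the outer ends of the primers.

```python
# Lengths reported in the text (bases).
EXON1_LEN = 709    # 5' exon in the in vitro transcript
INTRON_LEN = 672   # group I intron
EXON2_LEN = 780    # 3' exon in the in vitro transcript


def amplicon_sizes(upstream_flank, downstream_flank, intron_len=INTRON_LEN):
    """Return (spliced, unspliced) RT-PCR product sizes in bp.

    Assumption: flank lengths already include the primer sequences.
    """
    spliced = upstream_flank + downstream_flank
    unspliced = spliced + intron_len
    return spliced, unspliced


transcript_len = EXON1_LEN + INTRON_LEN + EXON2_LEN
spliced, unspliced = amplicon_sizes(184, 128)
print(f"precursor transcript: {transcript_len} bases")
print(f"spliced product: {spliced} bp, unspliced product: {unspliced} bp")
```

Under these assumptions the spliced product is 312 bp, matching the reported band, whereas an unspliced template would give a 984-bp product.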

Prediction of the Secondary Structure of the Intron
Group I introns share conserved secondary structure elements that are necessary for ribozyme activity [22]. The secondary structure of the Pf-WMP3 intron was predicted with the Mfold program [23] and corrected manually according to conventions for group I introns, using the introns of WI and W31 [7] as models (Figure 2D). The intron folds into a typical group I intron structure with all conserved stem-loops P1 through P10 except P2. The secondary structure and the characteristic helical elements P7.1 and P7.2, linked by a G-U-A sequence, assign the intron to subgroup IA2 [7,24]. The conserved secondary structure elements are necessary for proper folding and excision. The open reading frame (ORF) is predicted to start in the P6a region and to span 486 bp up to the P7 region, contributing to the key structure elements P6a, P6, P7, P7.1 and P7.2. The intron has a 4-bp P10 pairing between the sequence around the start of the 3′ exon and a sequence near the 5′ end of the intron, which promotes the alignment between the 3′ and 5′ splice sites required for exon ligation. The characteristic sequence elements, a terminal exonic uracil at the 5′ splice site that pairs with a guanosine in the P1 stem and a guanosine at the 3′ end of the intron, are typical of most group I introns.
The intron also has a typical ribosome binding site (RBS), located 6 to 12 bp upstream of the start codon. Like introns in DNAP genes of T7-like bacteriophages WI and W31 [7], the RBS of Pf-WMP3 intron may reduce overall expression of the HEase compared with the product DNAP (in whose transcript the HEG is embedded).

The Pf-WMP3 Intron Retards the Growth Rate of E. coli
Like some other introns, such as the 26S rRNA intron from Tetrahymena thermophila [25] and the 23S rRNA introns from Coxiella burnetii [26], the Pf-WMP3 intron significantly decreased the growth rate of E. coli relative to controls when expressed in these cells. We monitored the growth rates of E. coli strains transformed with pETP3DNAP[+int] and pETP4DNAP spectrophotometrically for 5 h. As shown in Figure 3, E. coli expressing the intron had a significantly retarded growth rate compared to the control after 0.05 mM IPTG was added.

The Pf-WMP3 Intron Encodes a Nicking DNA Endonuclease
As shown in Figure 1, the Pf-WMP3 DNAP gene is interrupted by a 672-bp intron located between Pro363 and Asn364. This intron contains an ORF encoding a 161-amino-acid-residue putative HEase of the HNH family, as identified with the protein BLAST tool at NCBI [27] using default parameters. According to the suggested nomenclature for HEases [28], we named the ORF I-PfoP3I (Intron-encoded HEase, P. foveolarum phage Pf-WMP3, I). I-PfoP3I is inserted into the stem of P6a, which is the same location where the T7-like phages WI and W31 encode HEases. There are eight subsets of proteins containing the HNHc domain. Subset 2 consists mostly of phage proteins that are intron-encoded site-specific endonucleases with the HNHc domain closer to the N-terminal end of the protein, in contrast to the other subsets [29]. Figure 4A and B show the conserved HNH motif of I-PfoP3I and five other subset 2 HEases. The introns containing I-PfoP3I, I-HmuI, I-HmuII and I-BasI are inserted in exactly the same position of the respective DNAP genes, but in very widely divergent phages.
To address the question of whether the Pf-WMP3 intron ORF encodes a functional endonuclease, I-PfoP3I was expressed with a His6 affinity tag at the C-terminal end. The expressed protein product was consistent with its predicted size (18.5 kDa) (Figure 5).
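As a rough plausibility check on the predicted size, the mass of a protein can be approximated from its length. The sketch below is not the method used in this study; it assumes the common rule of thumb of about 110 Da per amino acid residue, assumes that the 161 residues do not include the tag, and counts only the six histidines of the tag (any linker residues contributed by the vector are ignored). The function name `estimate_mass_kda` is hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0   # rule-of-thumb average residue mass (assumption)
HIS_RESIDUE_MASS_DA = 137.14  # histidine residue mass (free amino acid minus water)


def estimate_mass_kda(n_residues, his_tag_len=0):
    """Approximate protein mass in kDa from residue counts."""
    mass_da = n_residues * AVG_RESIDUE_MASS_DA + his_tag_len * HIS_RESIDUE_MASS_DA
    return mass_da / 1000.0


# 161-residue ORF plus a His6 tag, as described in the text.
estimate = estimate_mass_kda(161, his_tag_len=6)
print(f"estimated mass: {estimate:.1f} kDa")
```

With these assumptions the estimate comes out close to the 18.5 kDa stated above; an exact value would require the actual amino acid composition.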
Plasmids pETP3DNAP[+int], pETP3DNAP[−int] and pETP4DNAP were used as substrates to detect supercoiled plasmid DNA cleavage by I-PfoP3I. pETP4DNAP contains the DNAP gene from Pf-WMP4, which, like Pf-WMP3, was isolated from Lake Weiming [17]. Both phages infect the freshwater cyanobacterium P. foveolarum and they are closely related at the level of protein sequence and genome architecture [18]. However, the DNAP gene from Pf-WMP4 does not contain any IVS. As shown in Figure 6A, a small amount of nicked product was generated from plasmids pET28a and pETP3DNAP[+int] by I-PfoP3I at 200 mM after 20 min. I-PfoP3I nicked the intron-minus Pf-WMP3 DNAP gene more efficiently than the Pf-WMP4 DNAP gene. The supercoiled form of plasmid pETP3DNAP[−int] was completely converted to other forms by I-PfoP3I at 20 mM after 20 min, while pETP4DNAP was completely converted to other forms by I-PfoP3I at 200 mM after 20 min.
As shown in Figure 6B, PCR products of the Pf-WMP3 DNAP gene (intron-plus or intron-minus) and the wild-type Pf-WMP4 DNAP gene were used as substrates (10 nM). One strand of both the intron-minus Pf-WMP3 DNAP gene and the wild-type Pf-WMP4 DNAP gene was cleaved by purified I-PfoP3I at 1000 nM after 20 min. No cleavage activity was detected on the intron-plus Pf-WMP3 DNAP gene when incubated with I-PfoP3I under the same conditions. No activity was detected using purified protein derived from cells transformed with the expression vector pET28a(+) without insert (Figure 6B, Lane Control Protein). Substrates 5′ end-labeled on both strands showed a cleavage product about 1089 nt or 1053 nt in size, indicating that I-PfoP3I introduced a nick in the sense strand of EXON 1 of the target DNA. No cleavage of the antisense strand was detected under the same conditions [30].
To characterize the effect of metal ions on the endonuclease activity of apo-I-PfoP3I, purified I-PfoP3I was first treated with EDTA to remove the endogenous ions bound to the enzyme expressed in E. coli. About 1 mM EDTA remained in the protein solution, chelating any residual metal ions to eliminate metal contamination. We found that apo-I-PfoP3I did not cleave plasmid or PCR product DNA (Figure 7A, B, C). However, the endonuclease activity of apo-I-PfoP3I was restored in the presence of one of the metal ions Mg2+, Ca2+ or Mn2+. The lowest Mg2+ concentration that digested DNA completely was 1000-fold higher than the residual concentration of EDTA. Ca2+ and Mn2+ were able to activate apo-I-PfoP3I at 10 mM. Co2+ and Zn2+ caused precipitation in the assay system (data not shown). To test whether exogenous metal ions would enhance or inhibit endonuclease activity, enzyme that had not been treated with EDTA was incubated with Mg2+ over a concentration range of 0-125 mM. Cleavage analyses indicate that the activity of metal-bound I-PfoP3I expressed in E. coli was independent of any added divalent cations. Lower concentrations of Mg2+ had no effect on the nuclease activity of I-PfoP3I, while higher concentrations of Mg2+ were progressively detrimental to enzyme activity. When the Mg2+ concentration reached 125 mM, that is, about 10^5-fold over I-PfoP3I, the Mg2+ ion completely inhibited the endonuclease activity (Figure 7D). The metal-bound enzyme precipitated in the presence of Mn2+, Zn2+ or Ca2+ (data not shown).

Figure 3. Effect of Pf-WMP3 intron on E. coli growth. E. coli cells expressing the cloned partial Pf-WMP3 DNAP gene with intron or an irrelevant control RNA (pETP4DNAP) were induced with (0.05 mM) or without IPTG and assayed spectrophotometrically for growth at 37°C over 5 h. E. coli growth assays were performed three times and the averaged optical density was used to construct the growth curve. When the error bar cannot be seen, the deviation is less than the size of the symbol. doi:10.1371/journal.pone.0043738.g003

Mapping of DNA Cleavage Site Introduced by I-PfoP3I
As shown in the endonuclease activity assays, the breakpoints introduced by I-PfoP3I were located on the coding strands of both the Pf-WMP3 and Pf-WMP4 DNAP genes. Nucleotide sequencing was used to determine the precise cleavage sites of I-PfoP3I. Both substrates were cleaved on the coding strand 4 nt upstream of the IIS despite considerable differences in the nucleotide sequence surrounding the cleavage site (Figure 8A, B). The Pf-WMP3 intron is inserted at the same site as the introns in SPO1, SP82 and Bastille according to the corresponding amino acid sequence alignment for related genes (Figure 8C). The fact that the Pf-WMP3 and Pf-WMP4 DNAP genes were cleaved at the same site indicates that I-PfoP3I binds homologous stretches of the respective DNAP genes.
To map the approximate size of the recognition site, we focused our investigation on a 30 bp region surrounding the I-PfoP3I cleavage site in the intronless Pf-WMP3 DNAP gene. We used a PCR-based mutagenesis strategy to introduce site-directed deletions of 1 to 81 bp into the region, leaving short wild-type flanking sequences either upstream or downstream of the IIS (Figure 9A). We presumed that deletion of base pairs within the recognition site would greatly reduce or eliminate cleavage of the substrate by I-PfoP3I [14]. PCR products containing deletions extending from positions −3 to +11 (with respect to the IIS) were cleaved with reduced efficiency (Figure 9B). The results indicate that the sequence necessary for full cleavage activity by I-PfoP3I comprises at least 14 bp (Figure 9C). As shown in Figure 9D, plasmids pT1-23 and pT1-25 were used as substrates to detect supercoiled plasmid DNA cleavage by I-PfoP3I. pT1-23 contained the minimal recognition sequence and pT1-25 contained the same sequence with the 14 bp sequence deleted (Figure 9A). I-PfoP3I nicked pT1-23 efficiently, confirming the short recognition site.
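The logic of a scanning deletion experiment can be sketched in a few lines. The example below is a simplified illustration, not the analysis performed in this study: the deletion intervals and cleavage outcomes are hypothetical placeholders chosen to be consistent with a site spanning 3 bp upstream to 11 bp downstream of the IIS, and the function name `required_positions` is an assumption. Coordinates are relative to the IIS with no position 0.

```python
def required_positions(deletions):
    """Infer positions needed for full cleavage from deletion scan results.

    deletions: iterable of (start, end, impaired) tuples with inclusive
    coordinates relative to the IIS (no position 0). `impaired` is True
    when the deletion reduced or abolished cleavage.
    A position is called required if it lies in an impairing deletion
    and in no deletion that was tolerated.
    """
    impaired_pos, tolerated_pos = set(), set()
    for start, end, impaired in deletions:
        positions = {p for p in range(start, end + 1) if p != 0}
        (impaired_pos if impaired else tolerated_pos).update(positions)
    return sorted(impaired_pos - tolerated_pos)


# Hypothetical scan results (placeholders, not data from Figure 9).
scan = [
    (-19, -10, False),
    (-9, -4, False),
    (-3, -1, True),
    (1, 6, True),
    (7, 11, True),
]
site = required_positions(scan)
print(f"required window: {site[0]} to +{site[-1]}, {len(site)} bp")
```

A real analysis would also have to account for the new sequence junctions that each deletion creates, which this sketch ignores.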

Phylogenetic Relationships of DNAP Gene Group I Introns and their HNH Proteins
The trees (Figure 10) demonstrated that most introns and HNH proteins in DNAP genes of phages infecting the same host appeared to be more closely related. However, the HNH proteins of Pf-WMP3 and phiL7 were closely related, although these phages are biogeographically and morphologically distant. Cyanophage Pf-WMP3 of the Podoviridae family, which infects the freshwater cyanobacterium P. foveolarum, was found in Beijing, China. The lytic phage phiL7 of the Siphoviridae family, which infects Xanthomonas campestris pv. campestris, was isolated in the laboratory in Taiwan [6]. The introns in Pf-WMP3 and phiL7 are inserted in similar positions of their respective DNAP genes, and the HEases have a similar location of the HNH motif in the C-terminal half. It is possible that their HNH motifs and their DNA recognition sequences are related.

Discussion
A Group I Intron in the Genome of Cyanophage Pf-WMP3
Self-splicing group I introns are rarely found in T7-like phages. In this study, we show that the DNAP gene, which encodes an enzyme essential for the replication of phage DNA, carries a group I intron that is efficiently spliced in vivo and in vitro. The intron is inserted in the DNAP gene at site 674 (E. coli numbering), homologous to the introns in SPO1, SP82 and Bastille, all of which are closely related phages infecting B. subtilis. Two other group I introns, inserted in the DNAP genes of WI and W31, are at site 881 [7]. The group I intron in Xanthomonas campestris phage phiL7 is inserted at site 676 [6], indicating that there are at least three sites within these genes that can contain intron insertions (Figure 8C). As with other group I introns [31,32], the insertion site is located in a highly conserved region of functional importance within the coding sequence. According to the crystal structure of a bacteriophage T7 DNA replication complex, the insertion site is in the finger subdomain of DNAP [33]. It is worth noting that the first three nucleotides of the intron are UAA, which can also serve as the stop codon of EXON 1 if the intron does not splice efficiently due to inexact deletion. This might be lethal, because EXON 1 displayed DNA exonuclease activity without any synthesis activity in vitro (data not shown).
Several hypotheses could explain the growth inhibition caused by the Pf-WMP3 intron. For example, the HEase activity of I-PfoP3I, with its relatively short recognition sequence, might cleave essential E. coli genes (the DNA PolI gene would be a particularly likely candidate). Also, when proteins are hyperexpressed from plasmids, cessation of cell growth is a common observation, even when the proteins are not toxic (beta-galactosidase, for example). Further work is needed to test the effect of group I introns on the fitness of the host organism.

A Functional HEase Encoded by this Intron
A BLASTP search of the protein database using the I-PfoP3I amino acid sequence revealed only one highly similar protein, with maximum similarity in the C-termini, which contain the HNH motif, but with conservation extending throughout, including the N-termini (which presumably contain the DNA binding regions). Interestingly, this presumptive HEase is from phage phiL7, whose intron is inserted into the homologous region of its DNAP gene. The E values of the subsequent BLASTP hits are much lower. The next two have good alignments with the HNH motif at their C-termini. But the very next one, which is from Natromonas, aligns its N-terminal HNH region with the C-terminal motif in I-PfoP3I, as do almost all the remaining hits on the list. The proteins in this list are referred to as "putative" HEases or "hypothetical" proteins because their biochemical activities have not been determined. The co-crystal structure of I-HmuI, which includes the HNH motif in the N-terminal part (Figure 4A), displays a two-domain arrangement with N-terminal catalytic and C-terminal DNA-binding domains (although significant specific DNA contacts are made near the N-terminus), leading to the proposal that these phage endonucleases have a two-domain structure [34].
Apo-I-PfoP3I required one of the metal ions Mg2+, Ca2+ or Mn2+ for endonuclease activity, indicative of a relatively relaxed divalent metal requirement (Figure 7C). However, the fact that I-PfoP3I expressed and purified from E. coli cleaved DNA substrates independently of any added divalent cations indicates that endogenous metal is sufficient for promoting the activity. A higher concentration of Mg2+ is progressively detrimental to the enzyme activity (Figure 7D).
From the results presented in Figure 6, we suggest that I-PfoP3I possesses nicking activity in vitro, like the DNA endonucleases I-HmuI, I-HmuII and I-BasI encoded in the introns of phages SPO1, SP82 and Bastille respectively. All of these HEases belong to the HNH endonuclease family [2,35]. I-PfoP3I was unable to cleave intron-plus DNA, indicating that acquisition of the intron disrupts the target site. Although the plasmid pETP3DNAP[+int] became relaxed circular after exposure to the enzyme (Figure 6A), the nick could have occurred anywhere in the plasmid, not necessarily at the normal cleavage site of the enzyme. The recognition sequence for this enzyme is short (Figure 9) and a secondary cleavage site could be recognized at very high enzyme concentrations. I-HmuI, I-HmuII and I-BasI cleave the template strand of the homologous alleles [30,34]. I-PfoP3I produced a nick in the coding strand, like the HNH HEase I-TwoI, which is encoded in the nrdE gene of Staphylococcus aureus phage Twort [31]. The incision that each of the five HNH HEases generates is 5′ of the IIS, independent of which strand is cleaved [31]. Cleavage of the recipient DNA by the intron-encoded HEase allows the intron to spread to cognate intronless genes and to persist in a host gene. As shown in Figure 8D, we compared the nucleotide sequences flanking the cleavage site of the Pf-WMP3 and Pf-WMP4 DNAP genes to those of HEases residing within group I introns of DNAP genes, such as I-HmuI, I-HmuII and I-BasI. I-HmuI cleaves 3 nt downstream of the IIS on both the SPO1 and SP82 DNAP genes in a region with few differences between them [11]. In contrast, I-PfoP3I cleaves 4 nt upstream of the IIS on both the intron-minus Pf-WMP3 DNAP gene and the Pf-WMP4 DNAP gene in a region with more differences between these two DNAP genes, indicating tolerance of multiple substitutions within the target sequence.
From the results in Figure 9, we suggest that the recognition site of I-PfoP3I covers base pairs 2 to 15 downstream of the cut site (i.e. the recognition site covers 3 bp upstream and 11 bp downstream of the IIS).
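The two descriptions of the recognition site are equivalent once the cut site is placed 4 nt upstream of the IIS. A minimal sketch of the conversion is given below; the function name `cut_to_iis` is hypothetical, and it assumes the usual convention that positions relative to the IIS skip zero (−1 is the last exon 1 nucleotide before the IIS, +1 the first nucleotide after it).

```python
CUT_UPSTREAM_OF_IIS = 4  # the nick lies 4 nt upstream of the IIS (from the text)


def cut_to_iis(pos_downstream_of_cut, cut_upstream=CUT_UPSTREAM_OF_IIS):
    """Convert a 1-based position downstream of the cut site into a
    coordinate relative to the IIS (no position 0)."""
    if pos_downstream_of_cut < 1:
        raise ValueError("position must be 1 or greater")
    offset = pos_downstream_of_cut - cut_upstream
    # Offsets of 0 or below fall upstream of the IIS; shift to skip zero.
    return offset - 1 if offset <= 0 else offset


# Recognition site: base pairs 2 to 15 downstream of the cut site.
site = [cut_to_iis(p) for p in range(2, 16)]
upstream = [p for p in site if p < 0]
downstream = [p for p in site if p > 0]
print(f"{site[0]} to +{site[-1]}: "
      f"{len(upstream)} bp upstream, {len(downstream)} bp downstream of the IIS")
```

Under this convention, positions 2 to 15 downstream of the cut map to −3 through +11, that is, 3 bp upstream and 11 bp downstream of the IIS, 14 bp in total.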

The Intron-HEG Element Might Insert in Pf-WMP3 Genome Individually after Differentiation from Pf-WMP4
Phylogenetic analysis suggests that these intron-HEG elements have been transferred horizontally among phages infecting similar hosts, indicating these elements can continue to persist into new populations or species via horizontal transfer [36]. Pf-WMP4 is closely related to Pf-WMP3 in genome sequence, size and structure. However, Pf-WMP4 genome does not contain any IVS, suggesting that the intron-HEG element in Pf-WMP3 genome might be obtained after differentiation of these phages. Although Pf-WMP3 and Xanthomonas campestris phage phiL7 contain close IISs and closely related HNH proteins, the group I introns of these two phages are distantly related (Figure 10). This might favor the model that the chimeric mobile element was formed by group I introns and HEGs individually targeting the same set of highly conserved DNA sequences for insertion and cleavage respectively [37].
In cyanophages, HEases have so far been reported only as free-standing HEases, such as F-CphI found in S-PM2 [4] and some similar HEases. All of these HEases are encoded adjacent to an intron-containing psbA gene, which encodes the D1 core component of the photosynthetic reaction center PSII (photosystem II) [5]. Apart from these HEases, this is the first report of the presence of a functional HEase encoded in a group I self-splicing intron in a protein-coding gene of a cyanophage genome.

... onto BG11 agar plates. Plaque formation was used to isolate phages [39].

Isolation of Pf-WMP3 and Pf-WMP4 Phage DNA
To obtain a template for PCR reactions, genomic DNA was isolated from Pf-WMP3 and Pf-WMP4 according to previously published methods [40,41]. Briefly, after the addition of MgSO4 (final concentration 20 mM) to lysates, phage particles were precipitated using polyethylene glycol grade 6000 (PEG 6000) and then further purified by sucrose density gradient. Purified phage particles were broken with SDS and proteinase K. DNA was extracted with phenol-chloroform and precipitated with NaOAc and ethanol. The purified DNA was then resuspended in sterile H2O and stored at −20°C.

Plasmid Construction
The P3f2 and P3R primer sites flank the intron sequence, 709 bp upstream and 780 bp downstream respectively (Figure 1). Pf-WMP3 genomic DNA was used as a template for PCR with the primer pair P3f2 and P3R. The PCR product was cloned into the pEASY-T1 cloning vector (TRANS) to yield plasmid pTP3DNAP[+int], and the orientation of the cloned P3DNAP[+int] gene was confirmed by DNA sequencing. This intron sequence was then subcloned into the pET28a(+) vector (Novagen) and the resulting plasmid was termed pETP3DNAP[+int].
The intronless version of the Pf-WMP3 DNAP gene was amplified using an overlapping extension PCR technique [42]. The PCR product of the Pf-WMP3 DNAP[−int] gene was digested with restriction enzymes BamHI and SalI and then ligated into a pET28a(+) vector to yield plasmid pETP3DNAP[−int]. The PCR product of the wild-type Pf-WMP4 DNAP gene, which lacks any IVS, was digested with restriction enzymes BamHI and SalI and then ligated into a pET28a(+) vector to yield plasmid pETP4DNAP.
I-PfoP3I gene was obtained from Pf-WMP3 intron using primers P3f4 and P3r3 and the PCR product was digested with restriction enzymes NcoI and XhoI and then cloned into a pET28a(+) vector to generate plasmid pETI-PfoP3I which was used for overexpression of I-PfoP3I.

In vivo Splicing Assay
P. foveolarum was grown in BG11 medium at 28°C to an OD600 of 1.0 and infected with Pf-WMP3 at a multiplicity of 8 per cell. RNA was isolated at various times after infection using TRNzol Reagent (TIANGEN). RNase-free DNaseI (TaKaRa) was used to remove contaminating DNA. The total RNA was then incubated with sequence-specific primer P3r1 (Figure 1). Reverse transcription was carried out using M-MLV Reverse Transcriptase (Promega) according to the manufacturer's recommendations. PCR with primers P3f1 and P3r1 was used to analyze the presence of spliced and unspliced products. Taq DNA polymerase was purchased from TIANGEN. The products were analyzed by electrophoresis in a 2% agarose gel and visualized with ethidium bromide.

In vitro Transcription and Splicing Assay
The P3f3 and P3r2 primers flank the intron 184 bp upstream and 128 bp downstream respectively (Figure 1). The pre-RNA for the in vitro splicing experiment was prepared by transcription using the Riboprobe® in vitro Transcription System-T7 (Promega). The XhoI-linearized pETP3DNAP[+int] (1 μg) was incubated with 40 mM Tris-HCl (pH 7.9), 10 mM NaCl, 6 mM MgCl2, 2 mM spermidine, 10 mM DTT, 40 u Recombinant RNasin Ribonuclease Inhibitor, 0.5 mM each of rATP, rGTP, rCTP and rUTP, and 20 u T7 RNA Polymerase in a total volume of 20 μl at 37°C for 1 h. Template DNA was digested with RNase-free DNaseI (Promega). Reverse transcription was performed using sequence-specific primer P3r2 and M-MLV Reverse Transcriptase (Promega) according to the manufacturer's recommendations. PCR was carried out using primers P3f3 and P3r2. The resulting products were analyzed by electrophoresis in a 2% agarose gel.

Prediction of Intron Secondary Structure
Prediction of the intron secondary structure was performed using Mfold default settings (http://www.bioinfo.rpi.edu/ applications/mfold/rna/form1.cgi) and modified by hand using published intron secondary structures as a reference and was drawn using Adobe Photoshop (version 7.0).

Expression and Purification of I-PfoP3I
E. coli BL21 (DE3) bacteria transformed with pETI-PfoP3I were grown in Luria-Bertani broth supplemented with 50 μg ml−1 kanamycin at 37°C until the density reached an OD600 of 0.6. Expression of His6 I-PfoP3I was induced by adding IPTG to a final concentration of 0.1 mM. After an additional 4 h of culture growth at 30°C, cells were harvested and disrupted by sonication in lysis buffer (50 mM NaH2PO4 (pH 8.0), 300 mM NaCl, 10 mM imidazole). The lysate was centrifuged at 10,000 × g for 30 min at 4°C to pellet the cellular debris. Ni-NTA resin (Novagen) was added to the soluble fraction and mixed gently for 30 min at 4°C. The resin was settled by low-speed centrifugation (1000 × g) for 10 s and then washed several times with wash buffer (50 mM NaH2PO4 (pH 8.0), 300 mM NaCl, 20 mM imidazole). The protein was eluted with elution buffer containing a high concentration of imidazole (50 mM NaH2PO4 (pH 8.0), 300 mM NaCl, 250 mM imidazole). Protein concentration was determined using the Bradford method [43].

EDTA Treatment of the Purified I-PfoP3I
To examine the effect of a single divalent metal ion on the enzymatic activity of I-PfoP3I, an apo-enzyme without any metal ion cofactor must first be prepared. The concentrated I-PfoP3I (~0.6 mg ml−1) was incubated with 1 M of the divalent metal chelating agent EDTA at room temperature for 1 h. The EDTA-treated I-PfoP3I was dialyzed against 1 l of 10 mM Tris-HCl buffer (pH 8.0) three times at 4°C overnight. A residual EDTA concentration of ~1 mM remained in the protein solution to ensure the absence of contaminating divalent metal ions. After dialysis, the apo-enzyme sample was concentrated to 1 ml (0.3 mg ml−1) with an Ultrafree-15 centrifugal filter (Millipore, Bedford, MA, USA).
The double-stranded DNA endonuclease activity of I-PfoP3I was assayed using purified PCR products of the Pf-WMP3 and Pf-WMP4 DNAP genes, with or without the intron sequence. Both strands were labeled with [γ-32P]ATP at the 5′ end. I-PfoP3I was incubated with 4000 counts per minute (cpm) of the fragments in 50 μl of assay buffer (50 mM Tris-Cl pH 8.0, 5 mM DTT) at room temperature for 20 min. Reactions were stopped by the addition of 5 μl of proteinase K (20 mg ml−1) and further incubated at 37°C for 1 h. After ethanol precipitation, samples (mixed 1:1 with TaKaRa RNA Loading Buffer) were loaded on a 6% acrylamide gel with 8 M urea and separated in 40 mM Tris-borate pH 8.0, 2 mM EDTA. Products were visualized with silver staining or autoradiography. Although silver staining of DNA gives images with higher background, it is a rapid (less than 30 min to visualize DNA after urea acrylamide gel electrophoresis), sensitive, reproducible and inexpensive alternative to radioactive, fluorescent and chemiluminescent detection approaches.
To exactly localize the cleavage sites introduced by I-PfoP3I, double-stranded DNA endonuclease activity products of intronminus Pf-WMP3 DNAP gene and Pf-WMP4 DNAP gene were sequenced directly using ABI 3730xl DNA Analyzer. Reverse primers P3r1 and P4r1 were used as sequencing primers respectively.

Characterization of the Effects of Divalent Metal Ions on Endonuclease Activity
Five divalent metal ions (Co2+, Mg2+, Mn2+, Ca2+, Zn2+) were used for these tests. The activity assays were carried out with apo-I-PfoP3I and each of the metal ions separately. Reactions without any divalent metal ions were used as controls.

Mapping of the Recognition Site
Targets used to determine the I-PfoP3I recognition site boundaries contained variants of the wild-type intronless sequence with 1 to 81 bp deleted from the putative recognition site. The parts upstream of the deleted site were amplified using universal forward primer P3f3 and primers 1r to 25r. The parts downstream of the deleted site were amplified using primers 1f to 25f and universal reverse primer P3R. Oligonucleotides are shown in Table S1. The final targets, containing altered target sites with deletions between positions −19 and +11 (with respect to the IIS), were created by amplification of the downstream and upstream parts using forward primer P3f3 and reverse primer P3R. Reactions were cycled 35 times at 94°C for 30 s, 58°C for 30 s and 72°C for 60 s, with a final extension at 72°C for 10 min. Amplification products were confirmed by sequencing (for variant target site sequences, see Figure 9A). Cleavage reactions were performed as above. Reaction products were separated on a 6% acrylamide gel with 8 M urea in 40 mM Tris-borate pH 8.0, 2 mM EDTA. Products were visualized with silver staining. Note that only one cleavage product was visualized, as the other product was less than 180 nt in size and ran off the gel during electrophoresis. Plasmids pT1-23 and pT1-25 (10 nM) were used as substrates for cleavage by I-PfoP3I at 20 nM. Cleavage reactions were performed as above. BamHI-digested pT1-23 and pT1-25 were used as controls. Samples were analyzed by electrophoresis in a 1.5% (wt vol−1) agarose gel stained with ethidium bromide.

Oligonucleotides
The following oligonucleotides were used: