Application of a Novel Strategy of Engineering Conditional Alleles to a Single Exon Gene, Sox2

Background The Conditional by Inversion (COIN) method for engineering conditional alleles relies on an invertible optimized gene trap-like element, the COIN module, for imparting conditionality. The COIN module contains an optimized 3′ splice site-polyadenylation signal pair, but is inserted antisense to the target gene and therefore does not alter transcription, until it is inverted by Cre recombinase. In order to make COIN applicable to all protein-coding genes, the COIN module has been engineered within an artificial intron, enabling insertion into an exon. Methodology/Principal Findings Therefore, theoretically, the COIN method should be applicable to single exon genes, and to test this idea we engineered a COIN allele of Sox2. This single exon gene presents additional design challenges, in that its proximal promoter and coding region are entirely contained within a CpG island, and are also spanned by an overlapping transcript, Sox2ot, which contains mmu-miR1897. Here, we show that despite disruption of the CpG island by the COIN module intron, the COIN allele of Sox2 (Sox2COIN) is phenotypically wild type, and also does not interfere with expression of Sox2ot and miR1897. Furthermore, the inverted COIN allele of Sox2, Sox2INV, is functionally null, as homozygotes recapitulate the phenotype of Sox2βgeo/βgeo mice, a well-characterized Sox2 null. Lastly, the benefit of the eGFP marker embedded in the COIN allele is demonstrated as it mirrors the expression pattern of Sox2. Conclusions/Significance Our results demonstrate the applicability of the COIN technology as a method of choice for targeting single exon genes.

Sox2 is a well-characterized and important example of a single exon gene. Sox2 pairs with tissue-specific partners [15] to impart and maintain pluripotency [16] and multipotency [14,17] during development and homeostasis [18]. Sox2 null embryos fail to form the epiblast and die at E5.5 [19]. However, even reduction in Sox2 levels to 25-30% relative to the wild type leads to pathological phenotypes in mice. These include neurodegeneration in the cortical region and hippocampus [20], hypoplasia of optic nerves and chiasmata and variable microphthalmia [21], failure of nasal placode induction [22], failure of taste buds to mature [23], malformation of the epithelium lining the conducting airways in the lung [24], enlargement of the lateral ventricles at E14.5 [25], and immature differentiation of cochlea hair follicles [26].
From a gene structure standpoint, Sox2 presents a complex locus rich in genetic elements, including an overlapping transcript [27], a putative microRNA [28], and a CpG island [29][30][31][32]. The combination of a well-conserved compact locus with overlapping transcripts and regulatory elements [33][34][35][36][37][38][39], together with the apparent need to maintain proper levels of Sox2 for organogenesis and homeostasis, underscores the difficulties associated with designing conditional alleles for Sox2. We hypothesized that a recently developed method for generating conditional alleles, Conditional by Inversion (COIN), might be a better choice than simple floxing of Sox2, and generated the corresponding conditional-null allele, Sox2 COIN. We show that this method is successful: the Sox2 COIN allele starts as wild type and is converted into a null by the action of Cre, at which point the expression of Sox2 is replaced by that of a marker, eGFP. This work indicates that the COIN method can be applied to single exon genes and provides a new design modality that can be adopted for other genes like Sox2.

Results
Generation of the Sox2 COIN Allele
Sox2 (ENSMUSG00000074637) is a single exon gene encoding a 319 amino acid protein. The Sox2 locus contains several features that render it complex from the standpoint of engineering modified alleles (Fig. 1A). To begin with, Sox2's proximal promoter and coding region comprise a CpG island [15]. Furthermore, the Sox2 exon is contained within the intron of a long non-coding RNA (ncRNA), termed Sox2 overlapping transcript (Sox2ot) or "non-protein coding RNA 43", which also contains mmu-miR1897 (miR1897) [40]. Sox2ot is transcribed from the same strand as Sox2, but its molecular and biological functions remain elusive. The Sox2ot transcript is expressed in mouse embryonic stem cells and in other tissues, including the nervous system, where Sox2 is also highly expressed [41], while an isoform of Sox2ot, Sox2dot, located around 500 base pairs upstream of Sox2, was detected exclusively in adult mouse brain [27]. Because of this complexity, Sox2 is a challenging locus for conditional mutagenesis, and therefore presents a stringent test for new methods of allele design, such as COIN.
The COIN method relies on an optimized gene trap-like element, referred to as the COIN module [42]. The COIN module comprises a 3′ splice region-reporter cDNA-polyadenylation region optimized to function as an efficient transcriptional block, and it is flanked by Lox71 and Lox66 sites in a mirror image configuration to enable Cre-mediated inversion [43]. In order to generate conditional-null alleles, the COIN module is placed in a position antisense to the target gene, either within a native intron or within an exon. The latter is made possible by embedding the COIN module within an artificial intron (the COIN module intron) and using that intron to split the target exon into two operational halves [42].
To generate Sox2 COIN, the COIN module intron was inserted directly into the single exon of Sox2 (Figure 1B), after the 30th nucleotide of Sox2's open reading frame, splitting the single Sox2 exon into two exons. The COIN module lies inert on the antisense strand of Sox2 and does not affect transcription. Upon Cre-mediated inversion into the sense orientation, the Sox2 COIN allele is converted into a null allele, Sox2 INV. This is accomplished by the COIN module abrogating transcription of full-length Sox2, effectively replacing it with expression of the COIN module's eGFP reporter. The expression of eGFP in place of Sox2 is controlled by Sox2's promoter and regulatory elements, and enables visual identification of the inversion event at the tissue and cellular level. The functionality of the allele was assessed in vivo in a series of experiments that tested whether Sox2 COIN is a truly wild type allele, and whether Sox2 INV/INV embryos recapitulate the null phenotype, while providing a useful marker that faithfully reproduces the expression profile of Sox2.
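The exon-splitting and inversion logic described above can be illustrated with a toy string model. This is only a minimal sketch under stated assumptions: the sequences are made-up placeholders (not the real Sox2 open reading frame or COIN module), the function names are hypothetical, and Lox sites, splice signals and the polyadenylation signal are not modeled.

```python
# Toy model of the COIN design logic. All sequences are placeholders and
# all names are hypothetical; this is not the actual targeting vector.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def insert_coin_intron(orf: str, module_sense: str, position: int = 30) -> dict:
    """Split the ORF after `position` nucleotides and place the module antisense."""
    if not 0 < position < len(orf):
        raise ValueError("position must fall inside the ORF")
    return {
        "exon_5p": orf[:position],
        "module": reverse_complement(module_sense),  # antisense before inversion
        "exon_3p": orf[position:],
        "module_orientation": "antisense",
    }


def cre_invert(allele: dict) -> dict:
    """Model Cre-mediated inversion by flipping the module to the other strand."""
    inverted = dict(allele)
    inverted["module"] = reverse_complement(allele["module"])
    inverted["module_orientation"] = (
        "sense" if allele["module_orientation"] == "antisense" else "antisense"
    )
    return inverted


def spliced_transcript(allele: dict) -> str:
    """Antisense module: the artificial intron is spliced out, restoring the ORF.
    Sense module: the transcript is trapped (5' exon followed by the module)."""
    if allele["module_orientation"] == "antisense":
        return allele["exon_5p"] + allele["exon_3p"]
    return allele["exon_5p"] + allele["module"]


placeholder_orf = "ATG" + "GCT" * 20 + "TGA"   # 66 nt stand-in for an ORF
placeholder_module = "AGGTACCATGGTGAGCAAGTAA"  # stand-in for the reporter module

coin_allele = insert_coin_intron(placeholder_orf, placeholder_module)
inv_allele = cre_invert(coin_allele)
```

In this model the COIN allele yields the intact placeholder ORF, whereas the inverted allele yields only the first 30 nucleotides followed by the module, which mirrors the replacement of Sox2 expression by the reporter.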

Sox2 COIN is Wild Type in Homozygosis
Offspring of Sox2 COIN/+ intercrosses were born in Mendelian ratios and no lethality was observed in embryos, newborn pups or adults (Table 1). Homozygous mice fed normally, showed no abnormal behavior and had normal weight in adulthood (data not shown). Macroscopic analysis of E14.5 Sox2 +/+, Sox2 COIN/+, and Sox2 COIN/COIN mice showed that the COIN module does not affect normal embryonic mouse development (Figure 2C). These phenotypic observations are further corroborated by the result that Sox2 COIN/βgeo2 E6.5 embryos were morphologically indistinguishable from Sox2 +/+ or Sox2 βgeo2/+ embryos derived from a Sox2 βgeo2/+ with Sox2 COIN/COIN cross (Figure 3B, F), where Sox2 βgeo2 is a null allele of Sox2 [19] (see below). Furthermore, examination of Sox2 mRNA (Figure 2F) and Sox2 protein (Figure 2H) expression levels showed no apparent difference between the three genotypic classes, Sox2 +/+, Sox2 COIN/+, and Sox2 COIN/COIN, demonstrating that the COIN module has no effect on the expression of Sox2. Thus, by all of these criteria (heritability, phenotype, and expression of mRNA and protein), Sox2 COIN behaves as a wild type allele.
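As an illustration of how agreement with Mendelian ratios can be checked, the following sketch applies a chi-square goodness-of-fit test to the 1:2:1 ratio expected from a heterozygote intercross. The counts are hypothetical and are not the data of Table 1; the function name is likewise an assumption, and the test itself is a standard choice rather than a method stated in the text.

```python
import math


def mendelian_chi_square(observed, expected_ratio=(1, 2, 1)):
    """Chi-square goodness-of-fit of three genotype counts to an expected ratio.

    Returns (statistic, p_value). The p-value uses the closed form
    exp(-x / 2), which holds only for 2 degrees of freedom (three classes).
    """
    if len(observed) != 3 or len(expected_ratio) != 3:
        raise ValueError("this sketch handles exactly three genotype classes")
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.exp(-statistic / 2)  # chi-square survival function, df = 2
    return statistic, p_value


# Hypothetical litter counts: +/+, COIN/+, COIN/COIN (not from Table 1).
hypothetical_counts = (24, 52, 24)
statistic, p_value = mendelian_chi_square(hypothetical_counts)
```

A large p-value (conventionally above 0.05) means the observed counts give no evidence of deviation from the expected ratio.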
The Expression of Sox2ot and miR1897 is Unaffected in Sox2 COIN/COIN Mice
To assess whether the COIN module affects Sox2ot RNA expression, we isolated RNA from E14.5 mouse embryos from different intercrosses and quantified Sox2ot RNA levels by TaqMan real-time PCR analysis (Figure 2G). No significant difference was detected in the expression of Sox2ot in Sox2 +/+, Sox2 COIN/+, and Sox2 COIN/COIN embryos, demonstrating that the COIN module has no effect on the expression of Sox2ot, at least prior to inversion. Identical observations were made for miR1897, which is embedded in Sox2ot (Figure 2H).
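For readers unfamiliar with relative quantification by real-time PCR, the sketch below shows the commonly used comparative Ct (2^-ΔΔCt) calculation. The text does not state which quantification method was used, so this method is an assumption; the Ct values are hypothetical, the function name is illustrative, and the formula assumes near-100% amplification efficiency for both assays.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Fold change of a target transcript relative to a calibrator sample,
    normalized to a reference (housekeeping) gene, by the 2^-ddCt method."""
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: target (e.g. Sox2ot) and reference (e.g. Gapdh)
# in a test genotype versus a wild type calibrator.
unchanged = relative_expression(24.0, 18.0, 24.0, 18.0)
halved = relative_expression(25.0, 18.0, 24.0, 18.0)
```

With these made-up inputs, identical ΔCt values in sample and calibrator give a fold change of 1, and a one-cycle delay of the target gives a fold change of 0.5.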

Sox2 COIN is Efficiently Inverted by Cre to Generate Sox2 INV
To assess whether we could trigger COIN inversion upon Cre expression, Sox2 COIN/+ adult mice were crossed with Tg(Sox2:CRE) transgenic mice to generate Sox2 INV/+ embryos and adult mice (Figure 2D, E). In contrast to the partial infertility phenotype that has been observed with Sox2 βgeo2/+ mice [19], Sox2 INV/+ adult mice exhibited no obvious phenotypes and transmitted the inverted allele in Mendelian ratios (Table 2), irrespective of whether the Sox2 INV allele was transmitted via the male or female germline (data not shown). More importantly, E14.5 Sox2 INV/+ embryos displayed vivid eGFP expression in the cerebral cortex, retina, olfactory bulb, hair follicles, olfactory epithelium and spinal cord (Figure 2E), mirroring what has been observed with X-gal stained E14.5 Sox2 βgeo2/+ embryos [44]. In addition, eGFP protein can be detected by Western blotting in protein extracts derived from Sox2 INV/+ embryos, and its presence appears to be accompanied by a reduction in the levels of Sox2 protein, similar to what has been observed in Sox2 βgeo2/+ embryos (Figure 2I).

Sox2 INV is a Null Allele of Sox2
Mice carrying a loss of function mutation in the Sox2 locus have been generated by the insertion of a βgeo cassette into the Sox2 locus (Sox2 βgeo [19], and Sox2 βgeo2 [44]). Upon homozygosis, both alleles yield Sox2-null embryos that fail to form the epiblast and die around implantation. To test whether Sox2 INV/INV embryos phenocopy Sox2 βgeo2/βgeo2 embryos, we performed Sox2 INV/+ intercrosses (Table 3). No Sox2 INV/INV offspring were born. More specifically, Sox2 INV/INV mutants died shortly after implantation (Figure 3I), phenocopying Sox2 βgeo/βgeo and Sox2 βgeo2/βgeo2 embryos. Only Sox2 +/+ and Sox2 INV/+ embryos reached the embryonic stage of E6.5 (Figure 3A, G). Histological examination of whole decidual swellings harvested at 6.5 dpc revealed that 25% of deciduas carried abnormal implants, which had no egg cylinder structure and lacked the epithelial cells typical of epiblast (Figure 3I). Instead, many trophoblast giant cells could be identified (Figure 3C, H, I). The same phenotype was observed in Sox2 INV/βgeo2 embryos (Figure 3H). These results demonstrate that Sox2 INV/INV embryos fail to develop an epiblast, similarly to the Sox2 βgeo/βgeo and Sox2 βgeo2/βgeo2 mutants. Thus, the inverted COIN cassette generates a true Sox2 null phenotype.

Discussion
We show here the application of COIN technology to generate a conditional-null allele for a single exon gene, Sox2. The COIN method was invented at least in part to overcome the challenges and limitations of traditional site-specific recombinase-based strategies such as Cre/Lox for designing conditional alleles [45]. These include the placement of Lox sites as well as the distance between them [46], defining critical exons (i.e. the exons of the gene that need to be deleted by Cre in order to bring about the desired allelic state) [47,48], and the lack of a unified strategy for including a reporter that can mark those cells that harbor the post-recombinase allele. To date, COIN has been successfully applied in generating conditional alleles for more than twenty-five protein-coding genes [42], but its applicability to single exon genes has not been tested.
Single exon genes present a design challenge for engineering conditional alleles by traditional methods such as simple floxing. First, the Lox sites should be placed in a position that does not affect the expression of the target locus, a design decision that can be complicated by the lack of specific knowledge of the exact position of promoters and regulatory elements. An additional design challenge is presented if a reporter that marks the conversion from 'wild-type' to null is desired. The COIN method addresses both of these challenges irrespective of gene structure by avoiding the placement of Lox and FRT sites, reporters, and other functional elements within regions upstream of the target gene's coding sequence [42]. Instead, COIN employs an 'exon-splitting' artificial intron to place an optimized module (the COIN module) within a coding exon, yet in the antisense strand. Perusal of the regulatory elements mapping within the single coding exon of human Sox2 suggests that the COIN intron was inserted at a position that does not result in disruption of any such elements (data not shown). As a result, the COIN module does not interfere with transcription and does not alter the expression of the modified gene. Although it is possible that introduction of the COIN module intron alters the kinetics of transcription [49][50][51], we have not examined this possibility at the single cell level; at the population level and at steady state, the level of Sox2 mRNA expressed from the Sox2 COIN locus does not appear different from that of wild type. Lastly, because in COINs the Lox and FRT sites are placed within the artificial intron, they do not disrupt promoters or regulatory sequences, and are also not incorporated into mRNA.
The particular choice of Sox2 to test the COIN method's applicability to single exon genes presented additional challenges in that the majority of Sox2's single exon is contained within a CpG island, and there is also an overlapping non-coding transcript, Sox2ot. Due to design constraints, specifically the need to place the COIN module as near the initiating ATG as possible, the COIN module intron was inserted into Sox2's exon in a manner that disrupts the CpG island. However, this had no apparent effect on Sox2 expression and had no apparent phenotypic consequences in the mouse. Sox2ot levels also remained unaltered, indicating that at least in the antisense position the presence of the COIN module has no effect on the expression of Sox2ot. This was evident from the normal phenotype of Sox2 COIN/βgeo2 embryos, in which only the COIN allele can generate wild type mRNA. This genotype should sensitize the embryo to any reduction in Sox2 levels, and thus provides a stringent comparison between the wild type allele and the COIN allele prior to inversion.
Equally important is the fact that after inversion of the COIN module, the resulting allele, Sox2 INV, phenocopies the previously generated null alleles upon homozygosis. In addition, the reporter embedded in the COIN module is expressed in a manner representative of Sox2 expression, thereby generating a tool to visualize Sox2 expression and to follow the conversion of the COIN allele into a null by Cre.
In addition to the conditional-null allele presented here, four other conditional-null alleles of Sox2 have been published [21,52,53,54]. All four rely on floxing of the single exon of Sox2, though the placement of the LoxP sites and selection cassettes (and their retention) varies among alleles. One of the main differences between these alleles and Sox2 COIN is that they do not incorporate a reporter that is activated after Cre acts on the allele. There is, however, a paucity of published data, such as expression analysis of Sox2, Sox2ot, and miR1897, to allow further comparisons between Sox2 COIN and the previously described conditional-null alleles. Given the increasing evidence for roles that Sox2 plays in a wide range of pathological and patho-physiological conditions, assays for the normal regulation of Sox2 expression need to be conducted in a variety of cell types. Overall, our results highlight the importance of the Conditional by Inversion technology as a method of choice in targeting intronless (single exon) genes, especially when the complexity of the locus and the desire for inclusion of a reporter are taken into consideration.

Gene Targeting
Targeted Sox2 COINneo/+ ES cells were generated using VelociGene™ methodology, essentially as described [55]. Briefly, the targeting vector was assembled on bacterial artificial chromosome (BAC) RP23_406a6, which encompasses the single protein-coding exon of Sox2 flanked by approximately 95 kb of upstream and 71 kb of downstream sequence. The COIN module is flanked by Lox71 and Lox66 sites in a mirror image orientation, thereby enabling inversion by Cre. For bacterial homologous recombination (BHR) and targeting, an FRT-flanked neo cassette was incorporated into the COIN intron. After targeting, neo is removed to give rise to the Sox2 COIN allele. The COIN module is antisense to Sox2, and hence it is predicted not to interfere with expression of Sox2. However, after inversion of the COIN module to the sense strand, transcription terminates around the polyadenylation region of the COIN module, and as a result expression of Sox2 is replaced by that of eGFP (Figure 1B).

Experimental Animals
Sox2 COINneo/+ mice were bred with Tg(ACTB:FLPe) mice (Flp-deleter mice) to excise the neo cassette and generate Sox2 COIN/+ Tg(ACTB:FLPe) mice. These were bred with C57BL/6 mice to bring the Sox2 COIN allele into the germline. Sox2 COIN/+ mice were in turn bred with Tg(Sox2:CRE) mice to generate Sox2 COIN-INV/+ mice.

Embryo Processing, Tissue Preparation and Histological Analysis
For staging of the embryos, midday of the day of the vaginal plug was considered embryonic day 0.5 (E0.5). E6.5 decidua and E14.5 embryos were collected and dissected in cold PBS. Tissues were fixed with 10% formalin for 24 hours at room temperature, washed several times with 1X PBS, and then placed in embedding cassettes. Paraffin sections (10 μm) were stained with Hematoxylin and Eosin (H&E) using standard procedures and mounted with xylene-based mounting medium. LacZ staining of E14.5 Sox2 βgeo2/+ embryos was performed following a standard protocol [19].

Imaging Analysis
Conventional bright field and fluorescence microscopy were performed under a Leica MZ16FA stereoscope. Embryos were dissected either in cold 1X PBS or in DMEM medium supplemented with 2 mM glutamine and 0.5 mM penicillin and streptomycin.

RNA Analysis
RNA was extracted from E14.5 mouse embryos and subjected to TaqMan real-time PCR analysis. Typically, Gapdh was used as a control housekeeping gene, although analysis was also performed using Cyclophilin and β-actin with similar results. For miR1897 analysis, miR16 and Sno135 were used as internal controls. All probes are hydrolysis probes with a 5′ FAM fluorophore and a 3′ quencher (BHQ) (Biosearch Technologies). Probe codes and sequences for each gene are: for mSox2, Applied Biosystems, Inc, TaqMan assay ID: Mm00488369_s1; for mSox2ot, Applied Biosystems, Inc, TaqMan assay ID: Mm01291217_m1; for Lac-Z, FRW: TTTCAGCCGCGCTGTACTGGA, RVS: TGTTGCCACTCGCTTTAATGATG; for eGFP: FRW:

Ethics Statement
All animals were handled in strict accordance with good animal practice as defined by the Animals Act 160/03.05.1991 applicable in Greece, revised according to the 86/609/EEC/24.11.1986 EU directive regarding the proper care and use of laboratory animals, and in accordance with the Hellenic License for Animal Experimentation at the BSRC "Alexander Fleming" (Prot. No. 767/28.02.07) issued after protocol approval by the Animal Research Committee of the BSRC "Alexander Fleming" (Prot. No. 2762/03.08.05).