Selection of Stable Reference Genes for Quantitative RT-PCR Comparisons of Mouse Embryonic and Extra-Embryonic Stem Cells

Isolation and culture of both embryonic and tissue-specific stem cells provide an enormous opportunity to study the molecular processes driving development. To gain insight into the initial events underpinning mammalian embryogenesis, pluripotent stem cells from each of the three distinct lineages present within the preimplantation blastocyst have been derived. Embryonic (ES), trophectoderm (TS) and extraembryonic endoderm (XEN) stem cells possess the developmental potential of their founding lineages and seemingly utilize distinct epigenetic modalities to program gene expression. However, the basis for these differing cellular identities and epigenetic properties remains poorly defined. Quantitative reverse transcription-polymerase chain reaction (qPCR) is a powerful and efficient means of rapidly comparing patterns of gene expression between different developmental stages and experimental conditions. However, careful, empirical selection of appropriate reference genes is essential to accurately measuring transcriptional differences. Here we report the quantitation and evaluation of fourteen commonly used reference genes between ES, TS and XEN stem cells. These included: Actb, B2m, Hsp70, Gapdh, Gusb, H2afz, Hk2, Hprt, Pgk1, Ppia, Rn7sk, Sdha, Tbp and Ywhaz. Utilizing three independent statistical analyses, we identify Pgk1, Sdha and Tbp as the most stable reference genes between each of these stem cell types. Furthermore, we identify Sdha, Tbp and Ywhaz as well as Ywhaz, Pgk1 and Hk2 as the three most stable reference genes throughout the in vitro differentiation of embryonic and trophectoderm stem cells respectively. Understanding the transcriptional and epigenetic regulatory mechanisms controlling cellular identity within these distinct stem cell types provides essential insight into cellular processes controlling both embryogenesis and stem cell biology.
Normalizing quantitative RT-PCR measurements using the geometric mean CT values obtained for the identified mRNAs offers a reliable method to assess differing patterns of gene expression between the three founding stem cell lineages present within the mammalian preimplantation embryo.


Introduction
During mammalian pre-implantation development a series of asynchronous divisions results in the formation of the blastocyst. At this stage of development three distinct cell types have emerged: the epiblast, trophectoderm and primitive endoderm, which give rise to the fetus, placenta and extraembryonic endoderm respectively [1][2][3]. To better define the developmental and transcriptional processes unique to each of these distinct lineages, in vitro cultured progenitor stem cells have been derived [4][5][6][7]. Analysis of ES, TS and XEN stem cell lines has revealed much about the cellular processes controlling mammalian development and demonstrated surprising differences in the epigenetic regulation of gene expression between these three lineages [7][8][9][10][11][12]. Identifying the biochemical factors underlying these differences remains an essential step to understanding the molecular processes driving development and better defining crucial aspects of mammalian stem cell biology.
Quantitative reverse transcription-polymerase chain reaction (qPCR) has emerged as a powerful technique to rapidly assess transcriptional differences between cell types and differing experimental conditions. However, accurate quantitative analysis is dependent upon proper, empirical selection of a suitable reference. Using published microarray data and a novel statistical algorithm (geNORM), Vandesompele and colleagues demonstrated that the geometric mean of three reference genes provided the most accurate and reliable means of normalizing qPCR expression data [13]. Subsequently, this experimental strategy has been validated, and additional algorithms have been written and utilized to identify the most suitable reference genes for a variety of experimental conditions [14][15][16][17][18][19][20][21][22][23][24][25].
In this study we sought to identify a list of genes most suitable for use as normalization controls in qPCR-based comparisons between ES, TS and XEN stem cells or their in vitro differentiated progeny. To help identify candidate genes we set two main criteria that the mRNAs would have to fulfill: 1) the transcripts needed to be expressed above background and easily detectable, and 2) candidate mRNAs needed to be stably expressed between each of the three stem cell lineages under investigation. To this end we surveyed the recent literature and compiled a short list of fourteen candidate genes, including Actb, B2m, Hsp70, Gapdh, Gusb, H2afz, Hk2, Hprt, Pgk1, Ppia, Rn7sk, Sdha, Tbp and Ywhaz [14][15][16][17][18][19][20][21][22][23][24][25][26][27][28].
These genes belong to diverse functional classes and should not be co-regulated, thus providing a non-biased method of normalizing qPCR expression data.
To evaluate the stability of our candidate genes we isolated RNA from three independent lines of varying genotypes for each of the three stem cell types. This RNA was quantified and seeded into five independent qPCR reactions measuring each of the candidate genes. Using the geNORM, NormFinder and BestKeeper algorithms, we identify the Pgk1, Sdha and Tbp transcripts as the most stably expressed reference genes between each of these stem cell types. To determine which of these candidates was most suitable for use during in vitro differentiation studies, we cultured ES and TS cells in the absence of crucial growth factors LIF and FGF4 respectively. Using three independent RNA samples isolated on Day 0 and Day 8, we identify Sdha, Tbp and Ywhaz as well as Ywhaz, Pgk1 and Hk2 as the three most stable reference genes through the in vitro differentiation of ES and TS cells. Our results suggest that normalization of qPCR data using the geometric means of the transcripts listed above will yield the most accurate quantification of gene expression between these three unique stem cell types.

Results
After a survey of the recent literature we curated a short list of fourteen commonly used reference genes and either designed new primers or pulled existing ones from references cited in the Materials and Methods. These genes are listed in Table 1 and represent several distinct functional classes so as to minimize the possibility of co-regulation. For each gene, at least two independent primer sets were tested, and the primer set exhibiting the greatest efficiency was selected. To conduct an accurate survey of candidate gene expression levels between ES, TS and XEN stem cells we isolated RNA from at least three independent stem cell lines, representing at least two different genotypes. We postulate that utilizing lines derived from diverse genotypes will more accurately identify stable reference genes to be used in future studies contrasting patterns of gene expression.
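The text above states that primer sets were chosen by efficiency but does not say how efficiency was measured. One common approach, assumed here purely for illustration, is to fit a standard curve of CT against log10 template input from a serial dilution and convert the slope to an efficiency with E = 10^(-1/slope) - 1. The function name and the dilution values below are hypothetical and are not taken from the study.

```python
import math

def amplification_efficiency(log10_input, ct_values):
    """Estimate qPCR efficiency from a standard curve (illustrative sketch).

    log10_input: log10 of relative template amount for each dilution point.
    ct_values:   measured CT for each dilution point, in the same order.
    Returns (slope, efficiency), where efficiency = 1.0 means perfect
    doubling of product every cycle.
    """
    n = len(log10_input)
    mean_x = sum(log10_input) / n
    mean_y = sum(ct_values) / n
    # Ordinary least-squares slope of CT versus log10(input).
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_input, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_input)
    slope = sxy / sxx
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, efficiency

# Hypothetical ten-fold dilution series for one primer set.
dilutions = [0.0, -1.0, -2.0, -3.0]
cts = [18.1, 21.5, 24.9, 28.2]
slope, eff = amplification_efficiency(dilutions, cts)
print(f"slope = {slope:.3f}, efficiency = {eff:.1%}")
```

Under this assumption, the primer set whose efficiency lies closest to 100% (slope near -3.32 for ten-fold dilutions) would be the one carried forward.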
Previous studies in our laboratory have utilized stem cell lines derived from Mus musculus castaneus × Mus musculus (C57Black6) F1 embryos. Polymorphisms between these genetic strains allow the examination of mono-allelic patterns of epigenetic marks and gene expression within loci regulated by genomic imprinting [29]. For ES, TS and XEN stem cell analysis we utilized lines derived from F1 embryos of reciprocal crosses between these strains (C57Black6 × Castaneous and Cast7 × Black6) [11,29,30]. For analysis of ES and TS cells we also utilized the previously described R1 ES and TS3.5 lines derived from 129 strain mice [31,32]. Each of these different lines demonstrated cellular morphology consistent with its cell type and expressed unique cohorts of transcription factors characteristic of its lineage [7,33] (Figure 1). Cell lines were cultured to 80% confluence, and RNA was isolated and seeded into five independent qRT-PCR reactions measuring our fourteen candidate genes. Results presented below are the combined analysis of all genetic backgrounds tested.
Of the candidate genes tested, Rn7sk demonstrated the most robust expression, with levels averaging 125-fold higher than the remaining candidates, all of which were readily detectable in each of the cell lines tested. To measure the relative stability of each of the candidate genes between the ES, TS and XEN lines, the CT values for the measured transcripts were compiled and run through the NormFinder, geNORM and BestKeeper software packages [13][14][15]. Each of these algorithms utilizes a slightly different method of estimating both the intra- and the intergroup expression variation, and allows the ranking of candidate genes based on the calculation of a "stability value". While there was variation amongst the midrange to least stable genes, all three software packages identified Pgk1, Sdha, Tbp and H2afz as the most consistently stable reference genes between ES, TS and XEN stem cells (Table 2). Similar to previous studies by Mamo et al., we observed that the classic "housekeeping genes" Actb, Hprt and, to a lesser extent, Gapdh were comparatively unstable and by our analysis would not be the best choice to normalize qPCR expression levels [19]. We next chose to make pair-wise comparisons between ES and TS, ES and XEN as well as TS and XEN to see which candidates emerged as the most stable in contrasts between any two cell types. A consensus of all three software packages can be seen in Table 3. As with the comparisons between all three lines, Pgk1, Sdha, Tbp and H2afz remained in the top five most stable genes, indicating that no one cell type was biasing our analysis and that these reference genes represent the best normalization controls for qPCR-based analysis of gene expression. Utilizing the geometric mean of Pgk1, Sdha and Tbp we normalized the CT values for each of the fourteen candidates and graphed their relative expression levels as described previously [13,34,35].
As can be seen in Figure 2, Rn7sk is expressed at a drastically higher level than any of the other candidates tested and therefore does not represent a viable reference gene. Similarly, analysis of Actb, B2m, Gapdh and Ywhaz all yielded significant differences in measurements of TS cell expression as compared to both ES and XEN cells, eliminating their candidacy. Our results indicate that normalizing quantitative RT-PCR measurements using the geometric mean CT values obtained for the Pgk1, Sdha and Tbp mRNAs offers the most reliable method to assess differing patterns of gene expression between the three founding stem cell lineages present within the mammalian preimplantation embryo.
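For intuition about the stability ranking discussed above, the following is a minimal sketch of the pairwise-variation measure M defined for geNORM [13]: for each gene, M is the average, over all other candidates, of the standard deviation across samples of the log2 expression ratio between the two genes. The sketch assumes 100% amplification efficiency, so the log2 ratio reduces to a CT difference, and it omits the stepwise exclusion of the least stable gene that the published tool performs. It is not the software used in this study, and the gene names and CT values are hypothetical.

```python
from statistics import stdev

def genorm_m_values(ct):
    """Compute a geNORM-style stability measure M for each gene.

    ct maps gene name -> list of CT values, one per sample, with every
    gene measured on the same samples in the same order.
    Lower M means more stable expression relative to the other genes.
    """
    genes = list(ct)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # log2(Q_j / Q_k) per sample; with 100% efficiency this is
            # simply CT_k - CT_j.
            log_ratios = [ck - cj for cj, ck in zip(ct[j], ct[k])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[j] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values

# Hypothetical CT values for three candidate genes across four samples.
example_ct = {
    "GeneA": [20.1, 20.4, 19.9, 20.2],
    "GeneB": [24.0, 24.2, 23.9, 24.1],
    "GeneC": [17.5, 19.8, 16.9, 21.0],
}
for gene, m in sorted(genorm_m_values(example_ct).items(), key=lambda kv: kv[1]):
    print(f"{gene}: M = {m:.3f}")
```

A gene whose CT values rise and fall in step with the other candidates receives a low M, while a gene that varies independently of them, like the hypothetical GeneC, receives a high M.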
We next sought to determine which of the candidate genes remained the most stable throughout the process of differentiation. We therefore chose to differentiate our ES and TS cell lines by removal of the key growth factors LIF and FGF4 respectively [6,36,37]. To this end ES cells were cultured in ES cell medium lacking LIF, allowed to form embryoid bodies on untreated plastic dishes and then plated on regular tissue culture plastic to differentiate into fibroblast-like cells. Similarly, TS cells were plated on tissue culture treated plastic at low density in medium lacking FGF4, which promoted the formation of TS giant-like cells. We chose not to investigate the process of XEN cell differentiation as reliable protocols for the induction of differentiation have not yet been established. In contrast to both ES and TS cell lines, when XEN cells are plated on plastic many cells simply senesce, while the remainder do not uniformly differentiate into one cell type, thus complicating our analysis.
RNA samples were collected from ES cells on Day 0, Day 4 (embryoid body) and Day 8 and seeded into five independent qPCR reactions measuring each of the fourteen candidate genes. Using an experimental design similar to that described above, we identify Sdha, Tbp and Ywhaz as the three most stable transcripts (Table 4). To examine relative changes in gene expression, we utilized the geometric mean of these three most stable candidates to normalize CT values and graphed the relative expression of all fourteen candidate genes through ES cell differentiation (Figure 3a). We then chose to examine the expression of the cell lineage marker fibroblast-specific protein-1 (FSP-1), which is active in fibroblasts but not in epithelium, mesangial cells or embryonic endoderm [38]. In accordance with previous studies this marker demonstrated increasing expression in differentiating cell cultures, indicating that our three candidate genes provided a valid reference point (Figure 3b) [39,40]. In contrast, transcripts encoding Pgk1, H2afz, Ppia (Cyclophilin) and Gapdh all demonstrate a significant down-regulation and therefore are not suitable reference genes for this experimental time course. Similar to results reported by Willems et al., who examined ES cell differentiation induced by both DMSO and retinoic acid, we also identify B2m and Hprt as among the most unstable transcripts [18]. Using similar methodologies, we identified the Ywhaz, Pgk1 and Hk2 transcripts as the most stable during TS cell differentiation (Table 5). After applying the geometric mean of these three candidates to normalize CT values we observed massive changes in transcripts encoding Actb, B2m and Rn7sk (Figure 4). Previous studies have identified increased actin mobilization as a key feature of trophectoderm stem cell differentiation, validating our identified reference genes [41].
Taken together, our data indicate that Sdha, Tbp and Ywhaz, as well as Ywhaz, Pgk1 and Hk2, represent the most stable of our fourteen candidate reference genes for use as qPCR normalization controls during ES and TS cell differentiation respectively.

Discussion
Analysis of gene expression using qPCR has become the cornerstone of nearly every facet of the biological sciences. However, despite numerous studies demonstrating the importance of careful selection and validation of appropriate reference genes, studies continue to emerge that utilize inappropriate methods of qPCR normalization [13][14][15][18][20][21][42]. A recent survey of the literature identified the use of either Actb or Gapdh alone to normalize expression data in the vast majority of qPCR-based studies, without any form of validation to ensure their experimental stability [13]. In this study we sought to identify the most stable and appropriate reference genes for studies contrasting patterns of gene expression between the three founding stem cell lineages present within the mammalian preimplantation embryo. From a list of fourteen commonly utilized reference genes we identify Pgk1, Sdha and Tbp as the most suitable reference genes and further find compelling evidence to suggest that both Actb and Gapdh are not suitable normalization controls for these experiments.
Figure 2. CT values for each measured transcript were normalized to the geometric mean of Pgk1, Sdha and Tbp, and then graphed as relative values using methods described [13,34,35]. Error bars represent the standard error of the mean. doi:10.1371/journal.pone.0027592.g002
Of the top three candidates to emerge from our analysis, two are components of pathways controlling cellular respiration. Pgk1 (phosphoglycerate kinase 1) catalyzes the seventh step of glycolysis, and Sdha encodes a subunit of succinate dehydrogenase (succinate-coenzyme Q reductase), an enzyme complex that is bound to the inner mitochondrial membrane and is an essential component of both the citric acid cycle and the electron transport chain [43,44]. One potential weakness of our top three candidates is that, although Pgk1 and Sdha are components of distinct pathways, both function in cellular respiration, leaving the possibility that an experimental condition that impacts metabolic processes would significantly alter these normalization controls. The third and fourth candidates to emerge from our analysis were Tbp and H2afz respectively. Tbp is a central component of the RNA polymerase II pre-initiation complex and H2afz is an essential component of chromatin structure which is hypothesized to play a role in chromosome organization and stability [45][46][47][48]. These two candidates are functionally distinct both from each other and from pathways controlling cellular respiration. As such, where experimental design permits, we would recommend normalizing CT values to the geometric mean of all four of these reference genes to improve experimental rigor. However, when we incorporated this strategy we did not observe any meaningful changes in relative gene expression (data not shown).
Figure 4. Relative values were determined using methods described previously [13,34,35] and graphed. Error bars represent the standard error of the mean. Note that the top third of the graph is in an exponential scale. doi:10.1371/journal.pone.0027592.g004
The first differentiation event during mammalian embryogenesis is the formation of the epiblast, trophectoderm and primitive endoderm, which go on to give rise to the three founding embryonic lineages. Stem cells derived from each of these lineages represent an excellent model system to study mammalian development and understand crucial aspects of stem cell biology necessary in developing regenerative therapies. Analysis of gene expression using qPCR will undoubtedly play a pivotal role in deciphering the cellular and molecular properties that define these different cell types. In these analyses, the identification of stable reference genes is an essential prerequisite to accurately interpreting experimental data. Using three independent, highly referenced and validated statistical methods, our analysis of fourteen potential candidate reference genes identifies Pgk1, Sdha and Tbp as the most stable reference genes with which to normalize qPCR data. We believe these three genes will serve as excellent reference controls for examining the basis of the differing developmental and epigenetic properties unique to embryonic, trophectoderm and extraembryonic endoderm stem cells.
Table 6. Description and sequences of the primers used in the analysis of both the candidate reference genes and lineage specific transcription factors.

Stem Cell Culture
Primary ES cells, TS cells, and XEN cells were derived from either 129 strain (R1 ES cells [31], TS3.5 [32]), B6 × CAST or CAST7 × B6 F1 embryos [11] as previously described [5][6][7][11]. For studies examining ES cell differentiation, sub-confluent cultures were dissociated with 1X trypsin (Accutase; Millipore, Billerica, MA) and plated on non-tissue culture treated petri dishes in ES cell medium lacking LIF for four days and subsequently plated on 10 cm tissue culture treated dishes to differentiate into fibroblast-like cells. To differentiate TS cells we followed methods described previously [32].

RNA Isolation and Reverse Transcription
Cultured cells were grown to 80% confluence, washed twice in warm PBS, and dissociated with 1X trypsin (Accutase; Millipore, Billerica, MA). Cells were spun down and washed once in cold PBS, and RNA was then isolated using Trizol (Invitrogen, Carlsbad, CA) according to the manufacturer's protocol. One µg of purified total RNA was treated with amplification grade DNaseI (Invitrogen) according to the manufacturer's protocol, and 250 ng of RNA was seeded into a reverse transcription reaction using the SuperScriptII system (Invitrogen) by combining 1 µl random hexamer oligonucleotides (Invitrogen), 1 µl 10 mM dNTP (Invitrogen) and 11 µl RNA plus water. This mixture was brought to 70°C for 5 minutes then cooled to room temperature. SuperScriptII reaction buffer, DTT (Invitrogen) and SuperScriptII were then added according to the manufacturer's protocol and the mixture was brought to 25°C for 5 minutes, 42°C for 50 minutes, 45°C for 20 minutes, 50°C for 15 minutes and then 70°C for 5 minutes.

Real-Time PCR Amplification
Real-time PCR analysis of mRNA levels was carried out using the DyNAmo Flash SYBR Green qPCR Mastermix (Fisher Scientific, Pittsburgh, PA) following the manufacturer's instructions. Reactions were performed on a StepOnePlus Real Time PCR system (Applied Biosystems, Foster City, CA). DNA primer information is available in Table 6.

Analysis of Real Time PCR Data
The measured CT (Cycle Threshold) values for each sample were compiled and the stability of each of the fourteen reference genes was analyzed using the geNORM, NormFinder and BestKeeper software tools, which have been described in detail elsewhere [13][14][15]. Once suitable reference genes were identified, the geometric mean CT values of the best three candidate genes were calculated for each individual sample and used to normalize expression levels using the ΔΔCT method described previously [13,34,35]. These normalized values were averaged and the standard error of the mean calculated and graphed using Excel.
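The normalization described above can be sketched as follows. This is an illustrative outline only, not the authors' spreadsheet: it takes the geometric mean of the reference-gene CT values within each sample, subtracts it from the target CT to give ΔCT, subtracts the mean ΔCT of a calibrator group to give ΔΔCT, and converts to fold change as 2^(-ΔΔCT), which assumes 100% amplification efficiency. The function names, the choice of calibrator group and all CT values are hypothetical.

```python
import math
from statistics import mean, stdev

def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def delta_ct(sample_ct, target, reference_genes):
    """Target CT minus the geometric mean CT of the reference genes."""
    reference = geometric_mean([sample_ct[g] for g in reference_genes])
    return sample_ct[target] - reference

def relative_expression(test_samples, calibrator_samples, target, reference_genes):
    """Return (mean fold change, SEM) of target in the test group relative
    to the mean delta-CT of the calibrator group."""
    calibrator = mean(delta_ct(s, target, reference_genes)
                      for s in calibrator_samples)
    folds = [2 ** -(delta_ct(s, target, reference_genes) - calibrator)
             for s in test_samples]
    sem = stdev(folds) / math.sqrt(len(folds)) if len(folds) > 1 else 0.0
    return mean(folds), sem

# Hypothetical CT values; each dict is one RNA sample.
references = ["Pgk1", "Sdha", "Tbp"]
es_samples = [
    {"Pgk1": 19.8, "Sdha": 22.1, "Tbp": 24.0, "Actb": 17.2},
    {"Pgk1": 20.0, "Sdha": 22.3, "Tbp": 24.2, "Actb": 17.5},
    {"Pgk1": 19.9, "Sdha": 22.0, "Tbp": 24.1, "Actb": 17.1},
]
ts_samples = [
    {"Pgk1": 20.1, "Sdha": 22.2, "Tbp": 24.3, "Actb": 16.0},
    {"Pgk1": 19.7, "Sdha": 22.0, "Tbp": 23.9, "Actb": 15.8},
    {"Pgk1": 20.0, "Sdha": 22.4, "Tbp": 24.1, "Actb": 16.2},
]
fold, sem = relative_expression(ts_samples, es_samples, "Actb", references)
print(f"Actb in TS relative to ES: {fold:.2f} +/- {sem:.2f} (SEM)")
```

Averaging the fold changes after the 2^(-ΔΔCT) conversion mirrors the statement that normalized values were averaged before the standard error was calculated; averaging ΔΔCT values first and converting afterwards is an equally common convention and gives slightly different error bars.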