Extraction-free whole transcriptome gene expression analysis of FFPE sections and histology-directed subareas of tissue

We describe the use of a ligation-based targeted whole transcriptome expression profiling assay, TempO-Seq, to profile formalin-fixed paraffin-embedded (FFPE) tissue, including H&E stained FFPE tissue, by directly lysing tissue scraped from slides without extracting RNA or converting the RNA to cDNA. The correlation of measured gene expression changes in unfixed and fixed samples using blocks prepared from a pellet of a single cell type was R2 = 0.97, demonstrating that no significant artifacts were introduced by fixation. Fixed and fresh samples prepared in an equivalent manner produced comparable sequencing depth results (+/- 20%), with similar %CV (11.5 and 12.7%, respectively), indicating no significant loss of measurable RNA due to fixation. The sensitivity of the TempO-Seq assay was the same whether the tissue section was fixed or not. The assay performance was equivalent for human, mouse, or rat whole transcriptome. The results from 10 mm2 and 2 mm2 areas of tissue obtained from 5 μm thick sections were equivalent, thus demonstrating high sensitivity and ability to profile focal areas of histology within a section. Replicate reproducibility of separate areas of tissue ranged from R2 = 0.83 (lung) to 0.96 (liver) depending on the tissue type, with an average correlation of R2 = 0.90 across nine tissue types. The average %CVs were 16.8% for genes expressed at greater than 200 counts, and 20.3% for genes greater than 50 counts. Tissue specific differences in gene expression were identified and agreed with the literature. There was negligible impact on assay performance using FFPE tissues that had been archived for up to 30 years. Similarly, there was negligible impact of H&E staining, facilitating accurate visualization for scraping and assay of small focal areas of specific histology within a section.


Introduction
Gene expression profiling of tissue is vitally important for understanding both normal and disease processes. Tissue can be prepared as snap frozen blocks or prepared as formalin fixed paraffin embedded (FFPE) tissue blocks, then sectioned and assayed. Frozen tissue blocks are [...]

[...] transcriptome, and buffers necessary for annealing, nuclease clean up, ligation, amplification, and library generation. For library purification, we used the NucleoSpin Gel and PCR Cleanup kit (Macherey-Nagel cat # 740609.50). Molecular biology grade light mineral oil was sourced from Sigma. Phosphate Buffered Saline (PBS), Ca2+ and Mg2+ free, was purchased from Thomas Scientific. Molecular biology grade water and TE were purchased from Invitrogen. Ethanol was sourced from Decon Laboratories. Neutral buffered formalin was purchased from VWR (16004-128). All reagents, tips, plates, and reservoirs were RNase free.

Tissue sources
All human tissue was sourced from the University of Arizona Cancer Center Biorepository. Prostate samples were sourced from the UACC Prostate Biorepository. Human samples were consented for research use after clinical testing and de-identified before receipt. Human samples were exempt from IRB approval as per the grant funding source (NCI 5R33CA183688-03 & NIEHS 1R43ES024107-01). The TempO-Seq assay does not sequence patient DNA or RNA, and instead only produces predetermined probe sequence data; therefore, it is not possible to use TempO-Seq sequencing readouts to identify patients. Mouse tissue was provided as a gift from Kathleen Scully and Pamela Itkin-Ansari of the Sanford Burnham Prebys Medical Discovery Institute (La Jolla, Ca.). Rat tissues were obtained from Tissue Acquisition and Cellular/Molecular Analysis Shared Resource at University of Arizona Cancer Center. Rats were euthanized with CO2 asphyxiation in accordance with American Veterinary Medical Association (AVMA) guidelines. The collection of animal samples was approved by IACUC (A3248-01).

Hematoxylin and eosin staining
Slides were deparaffinized using the Leica Bond Dewax solution by soaking in a Coplin staining jar for three minutes. Slides were washed 3 times in 100% ethanol, then either air dried, or continued through H&E staining with the following protocol: rehydrated in distilled water for three minutes; immersed in hematoxylin solution (Leica hematoxylin 560 diluted 1:6 in distilled water) for three minutes; washed three times in distilled water; immersed briefly in 0.1x PBS; washed three more times in distilled water; soaked in 70% Ethanol for two minutes; immersed in Alcoholic Eosin Y with Phloxine (Sigma HT110332) for three minutes; washed in 100% ethanol three times then air dried.

Cell lysates and FFPE pellets
MCF7 and MDA-MB-231 cells were obtained from ATCC and grown in RPMI supplemented with 10% FBS. Fresh lysates were prepared by washing cells with 1X Ca 2+ and Mg 2+ free PBS, then lysing in 1X TempO-Seq lysis buffer in PBS at 2,000 cells per μL. Lysates were incubated for 10 minutes at room temperature, followed by storage at -80˚C. For FFPE cell pellets, live cells were washed twice with 1X PBS, and then fixed with 1% formaldehyde in 1X PBS for 30 minutes at room temperature. Cells were then embedded by the TACMASR (University of Arizona Tissue Acquisition and Cellular/Molecular Analysis Shared Resource).
samples were stored as FFPE blocks from the year indicated until 2016, at which point a pathologist identified homogenous tumor regions. 5 μm thick sections were then cut and mounted, and slides were stored until 2018. For these experiments, 25 mm2 areas were cut from serial sections to represent biological replicates. Mouse tissue was provided as a gift from Kathleen Scully and Pamela Itkin-Ansari of the Sanford Burnham Prebys Medical Discovery Institute (La Jolla, CA). All mice were wildtype adults of the C57BL/6 genetic strain. Tissues were fixed in 10% neutral buffered formalin at 4˚C for 24 hours, then moved to 70% ethanol for 24 hours before embedding. 5 μm thick tissue sections were cut and mounted on Superfrost Plus Micro slides (VWR). Slides were dried overnight at room temperature before processing for FFPE TempO-Seq.
Rat tissues were obtained from Tissue Acquisition and Cellular/Molecular Analysis Shared Resource at University of Arizona Cancer Center. Rats were euthanized with CO2 asphyxiation in accordance with American Veterinary Medical Association (AVMA) guidelines. Tissues were fixed in 10% neutral buffered formalin for 24 hours before being transferred to 70% ethanol prior to embedding. For fixation time studies, samples were kept in 10% NBF for the specified time before moving to ethanol prior to embedding.

TempO-Seq assay
The TempO-Seq assay for FFPE samples relies on the standard TempO-Seq chemistry [7-11]. The assay (Fig 1) was modified for FFPE samples and was carried out following the protocol in the User Manual provided with the kits, without further modification, as described in the following text. An area of interest on a slide-mounted FFPE section (Fig 1A) is scraped from the slide and deposited directly into BioSpyder 1X FFPE lysis buffer (Fig 1B). The sample is then overlaid with molecular biology grade mineral oil and incubated at 95˚C for five minutes to dissolve the paraffin. This separates the paraffin from the lysate without the need for harsh chemicals. The FFPE lysate is further processed by addition of FFPE Protease reagent and incubation at 37˚C for 30 minutes. After a quick homogenization step (trituration by a pipette, or vortexing), the lysates can be frozen or used immediately in the remaining steps of the TempO-Seq FFPE assay [7-11].
As depicted in Fig 1B, a 2 μL aliquot of the processed lysate is then added to a microplate well containing a mix of annealing buffer and Detector Oligos (DOs) to measure each targeted gene. DO panels included in this study were designed against the whole transcriptome for human, mouse, and rat (commercially available assays from BioSpyder, Inc.). This mixture is then exposed to a ramp in temperature from 70˚C to 45˚C, followed by overnight incubation at 45˚C, which facilitates complete annealing of DOs to their target RNAs. The hybridization process is highly resistant to RNA fragmentation (as the DOs anneal to RNA sequences of <100 nt), which supports assay efficiency in the context of FFPE.
A nuclease mix is then added, which degrades unbound and incorrectly bound DOs. Finally, addition of a ligase mixture allows for ligation of correctly bound DOs into full-length probes. The enzymes are then inactivated by a 15 minute incubation at 80˚C, and the resulting ligated probes are amplified in a PCR step. The PCR primers allow indexing of individual samples, so that hundreds or thousands of samples can be multiplexed within the same sequencing library (Fig 1C).
The assays used were the human whole transcriptome assay [7] which measures 19,283 genes (21,111 probes); the mouse whole transcriptome assay which measures 23,580 genes (30,147 probes); and the rat whole transcriptome assay which measures 21,119 genes (22,253 probes). Each gene is measured by one or more probes formed by ligation of a DO pair, as previously described [7]. TempO-Seq probes for the whole transcriptome assays were designed to target only protein-coding genes (with few exceptions), so noncoding RNAs are not visible, nor do they take any sequencing reads in this approach. Custom TempO-Seq assays that measure noncoding RNA, splice junctions and variants, fusion genes, etc., have been designed and used in other work, but were not used for the experiments described in this manuscript.

Sequencing and data analysis
Purified libraries were run on the Illumina NextSeq 550 sequencing platform. All data analysis was done using the TempO-SeqR data analysis platform (BioSpyder, Inc.) as follows. After sample demultiplexing using the default Illumina sequencer and bcl2fastq settings, mapped reads were generated by TempO-SeqR alignment of demultiplexed FASTQ files from the sequencer to the ligated DO gene sequences using Bowtie, allowing for up to 2 mismatches in the 50-nucleotide target sequence.
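The mismatch tolerance described above can be illustrated with a small sketch. This is not the TempO-SeqR or Bowtie implementation; it is a naive Hamming-distance comparison, written here only to show what "up to 2 mismatches against a fixed target sequence" means for read counting. The probe names, the toy 10-nt sequences, and the function names are hypothetical.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def count_probes(reads, probes, max_mismatches=2):
    """Count reads per probe, tolerating up to max_mismatches substitutions.

    A read is counted only if exactly one probe of the same length lies
    within the mismatch limit; unmatched or ambiguous reads are skipped.
    """
    counts = Counter({name: 0 for name in probes})
    for read in reads:
        hits = [
            name
            for name, seq in probes.items()
            if len(seq) == len(read) and hamming(read, seq) <= max_mismatches
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
    return dict(counts)


# Toy 10-nt sequences for readability (assumption); the assay targets are 50 nt.
probes = {"GENE_A": "ACGTACGTAC", "GENE_B": "TTTTGGGGCC"}
reads = ["ACGTACGTAC", "ACGTACGTTT", "ACGTACCCTT", "TTTTGGGGCA"]
probe_counts = count_probes(reads, probes)
```

Because every read is compared against a predetermined set of probe sequences, the result is a table of counts per probe rather than a genome-wide alignment.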
For correlation analysis, genes with 20 or more raw counts were log2 transformed and plotted to derive coefficient of determination (R2) values. Differential expression was assessed by the TempO-SeqR software, which used the DESeq2 method for differential analysis of count data [12]. The count data are first normalized using the DESeq2 function estimateSizeFactors, which establishes size factors using the "median ratio method" described in [13]. DESeq2 then computes the probability of differential expression by comparing the relative count level for each condition and the dispersion of the respective counts using a negative binomial model. A user-selected adjusted p-value of <0.05 and baseMean depth >20 were used as the thresholds of significance for differential expression.
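As a rough illustration of the median ratio method, the sketch below computes per-sample size factors in plain Python. It is a simplified stand-in for the DESeq2 routine, not the DESeq2 code itself: each gene's geometric mean across samples serves as a reference, and a sample's size factor is the median of its count-to-reference ratios. The sample names and counts are invented for the example.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts maps sample name -> list of raw counts, with genes in the same
    order for every sample.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    # Only genes with non-zero counts in every sample have a usable
    # geometric mean.
    usable = [g for g in range(n_genes) if all(counts[s][g] > 0 for s in samples)]
    if not usable:
        raise ValueError("no gene has non-zero counts in every sample")
    # Log of the per-gene geometric mean across samples (the reference).
    log_ref = {
        g: sum(math.log(counts[s][g]) for s in samples) / len(samples)
        for g in usable
    }
    return {
        s: math.exp(median(math.log(counts[s][g]) - log_ref[g] for g in usable))
        for s in samples
    }


# Hypothetical data: rep2 was sequenced about twice as deeply as rep1.
raw = {"rep1": [10, 20, 40, 0], "rep2": [20, 40, 80, 5]}
factors = size_factors(raw)
normalized = {s: [c / factors[s] for c in raw[s]] for s in raw}
```

Dividing each sample's counts by its size factor puts the samples on a common scale, so a twofold difference in sequencing depth no longer appears as a twofold difference in expression.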
For PCA and correlation plots related to colon matched normal and cancer, a pathologist identified regions of cancerous and normal tissue on sections of colorectal cancer. Two within-donor biological replicates were taken from each tissue type from each patient, and lysed. The lysate was used to perform three technical replicate experiments, the results of which were then averaged. Plots represent direct comparisons of all data, without cutoffs, normalized for total read depth.
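A minimal sketch of the two operations mentioned above (normalizing for total read depth and averaging technical replicates) follows. The order of operations and the per-million scaling are assumptions made for illustration; the text does not specify either, and the counts are invented.

```python
def counts_per_million(counts):
    """Scale raw counts so that each sample sums to one million."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return [c * 1_000_000 / total for c in counts]


def average_replicates(replicates):
    """Average gene-wise across replicates (same gene order in each list)."""
    return [sum(values) / len(values) for values in zip(*replicates)]


# Three hypothetical technical replicates of one lysate, three genes each.
technical = [[10, 30, 60], [20, 60, 120], [5, 15, 30]]
averaged = average_replicates([counts_per_million(rep) for rep in technical])
```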
Raw sequencing data in form of FASTQ files, along with aligned gene counts for all samples used in this study are available through GEO (accession number GSE119630).

Reproducibility and sample types
To verify assay robustness and precision, we tested tissues from a variety of species and a broad range of tissue types. Data shown here include FFPE samples of human colorectal adenocarcinoma, prostate adenocarcinoma, and pancreatic cancer; rat brain, kidney and liver; and mouse breast, lung, and hindlimb muscle. We chose pancreas due to its relative abundance of endogenous RNases, breast and lung for their low cellularity, and muscle for potential difficulties in digestion by the lysis protocol. For each sample type, 10 mm2 areas from 5 μm thick sections were lysed, and 10% of the lysate (equivalent to 1 mm2 of tissue) was used as input in the whole transcriptome FFPE TempO-Seq assay with species-specific DOs.
Fig 1 legend (beginning not recovered): [...] demonstrating the mixed histology that would affect the data if the entire area were to be scraped and profiled. Areas identified from a stained tissue section can be scraped from an unstained, paraffinized adjacent section (right), or (if RNase-free reagents are used for staining) directly from a stained section (center right). The scraped areas in this case were ~1 x 5 mm, aligned with the focal histology of interest, and sufficient for gene expression profiling. (B) An area of interest is manually scraped from mounted FFPE sections. The tissue is added directly into 1X FFPE lysis buffer, overlaid with mineral oil, and then heated at 95˚C for 5 minutes. FFPE Protease is added and the sample is incubated and manually homogenized. The processed lysate is then ready for input directly into the annealing step of the TempO-Seq assay. (C) Schematic of the TempO-Seq detector oligo annealing and ligation process.

To gauge assay reproducibility, the same areas of adjacent 5 μm thick sections were independently processed, and gene expression patterns between replicates compared. It is worth noting that these replicates are biologically different: they represent different subsections of tissue from the same organ, with potentially different tissue composition (microvasculature, innervation, etc.), and are processed completely independently. These are within-donor biological replicates, in which some variance in expression is expected due to the heterogeneity of cellular composition of FFPE. This measures the repeatability of the assay independent of the variability between donors. Technical replicates, such as replicates of the same FFPE tissue lysate, can be used as another measure of assay repeatability, but do not address the variability of the lysis step or the variability resulting from FFPE tissue heterogeneity.
For these reasons, it is important to differentiate such "within-donor biological replicates" from "between-donor biological replicates," where samples from different individuals are measured, and where the range of biological variability within a disease can be seen.
Gene expression correlation among within-donor biological replicates for all sample types had R2 values greater than 0.8, regardless of species. Of the human samples, pancreatic tissue had the highest reproducibility across biological replicates (R2 = 0.916; Fig 2). Human colorectal and prostate cancers had R2 values of 0.872 and 0.885, respectively. Mouse breast, lung, and muscle had R2 values that exceeded 0.8 (0.891, 0.833 and 0.895, respectively). Rat brain, kidney, and liver all had R2 values that exceeded 0.9 (0.926, 0.949, and 0.959; Fig 2). Across all nine tissue types the average R2 was 0.903. Average %CVs for genes with a minimum of 10, 50, or 200 counts were 26.7%, 20.3%, and 16.8%, respectively (Table 1). Larger variance was observed in samples from tissues known to contain large amounts of RNases (pancreas), and in tissues with low cellularity and thus low RNA amounts (lung, breast).
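The replicate comparison can be sketched as follows, assuming that a gene is kept only when both replicates have at least 20 raw counts (the text gives the threshold but not how it is applied across replicates) and that R2 is the squared Pearson correlation of the log2 counts. The function names and toy counts are hypothetical; the nine R2 values are those reported above.

```python
import math


def r_squared(x, y):
    """Squared Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def replicate_r2(rep1, rep2, min_count=20):
    """R2 of log2 counts for genes at or above min_count in both replicates."""
    pairs = [
        (math.log2(a), math.log2(b))
        for a, b in zip(rep1, rep2)
        if a >= min_count and b >= min_count
    ]
    xs, ys = zip(*pairs)
    return r_squared(xs, ys)


# Within-donor replicate R2 values as reported in the text.
reported_r2 = {
    "human pancreas": 0.916, "human colorectal": 0.872, "human prostate": 0.885,
    "mouse breast": 0.891, "mouse lung": 0.833, "mouse muscle": 0.895,
    "rat brain": 0.926, "rat kidney": 0.949, "rat liver": 0.959,
}
mean_r2 = sum(reported_r2.values()) / len(reported_r2)
```

Averaging the nine reported values reproduces the stated mean of 0.903 after rounding to three decimals.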

Detected gene expression profiles match expectations from literature
While no cross-platform comparison can be expected to produce perfect agreement, results should broadly conform to the main parameters; e.g., genes that are highly expressed should be recognized as such independent of platform, and genes that are tissue-specific should not be detected in the wrong tissues. To verify this, we compared expression levels detected by FFPE TempO-Seq in pancreatic tissue with those reported in the Genotype-Tissue Expression (GTEx) database [14]. As shown in Table 2, while gene rankings were not exactly the same, genes recognized as highest expressers in TempO-Seq were also highly ranked in GTEx. Top ranking genes in TempO-Seq are shown along with their ranking in GTEx. The transcript ranked 10th in the TempO-Seq data was a mitochondrial target not ranked in GTEx.
Simultaneously, FFPE TempO-Seq recognized expression of pancreas-specific aquaporin AQP12 [15,16] in pancreatic tissues (Table 3), while the counts were zero in prostate or colon. The same observation can be made at the lower end of the expression range: pancreatic polypeptide PPY, which is annotated in GTEx as a very low expressing pancreas-specific gene, was correctly detected by FFPE TempO-Seq as a low abundance transcript and only detected in pancreas. Similar agreement with established annotation in GTEx can be seen in multiple mid to low-expressing genes, including CLDN10, CFC1, KLB, NPHS1, and TEX11, all of which are annotated as expressed in pancreas [14], but not in prostate or colon, and which were found to be tissue restricted in FFPE TempO-Seq as well (Table 3).

Expression variability in matched cancer vs. normal tissue samples
The reproducibility shown in Fig 2 validates the precision and reproducibility of the assay when used on separate samples from the same source (same patient or animal, within-donor biological reproducibility). However, this result does not address the expected variation among multiple individuals (what we define as "between-donor biological replicates"). To address this, we performed the assay on matched samples of human normal and cancerous colon tissue from five different patients. Two within-donor biological replicates of each tissue from each of the five patients were assayed. In the PCA plot (Fig 3), each patient is shown in one color, with normal samples shown as circles and cancer samples as triangles. As can be clearly seen, despite the expected diversity in gene expression, normal tissues from all five patients cluster together, with the small spread between them representing between-donor biological variability. The tight clustering of measurements reflects within-donor repeatability and technical repeatability of the assay. In contrast, cancers tend to follow their own trajectories, reflecting genomic instability, stages of progression, and varying pathways of oncogenesis and transcriptional phenotype, or subtypes of cancer. The PCA suggests four subgroups of cancer, with two patients within one of the subgroups, that are clearly distinguished from normal tissues.
To further illustrate and quantify assay reproducibility, correlation plots of within-donor biological samples of normal tissue and [...] are shown (Fig 4A and 4B [...] as exemplar data the difference between patient 5 normal and cancer tissue, showing the differentially expressed genes underlying the PCA results depicted in Fig 3, readily identified because of the high repeatability of within-patient measurements. For comparison, the largest difference between normal tissues among the patients (between patients 3 and 5) is depicted in Fig 4F. Finally, Fig 4G and 4H show the difference between the cancer profiles for patients 3 and 5, and patients 2 and 4, respectively, demonstrating how different these cancers are, consistent with the PCA. These plots demonstrate the high repeatability within donors that permits robust measurement of differential expression, and also show that the between-donor variability in normal tissue measurements is much less than the variability between cancers from different patients, consistent with expected biological differences and the PCA.

Focal input and sensitivity
Common gene profiling assays generally require RNA purification from large FFPE tissue samples (entire slides or multiple slides). TempO-Seq does not require extracted RNA; direct sample lysates can be used instead. Thus, while significant amounts of FFPE are required for RNA extraction, much smaller amounts of FFPE can be assayed as a lysate. The ability to assay very small tissue amounts would spare the use of rare and precious archival FFPE samples, enable profiling of small focal areas with specific pathologies, reduce input for tissues with very low cellularity, and allow profiling of small FFPE samples such as tissue from biopsies or prepared as tissue microarrays.
To evaluate the sensitivity and amount of tissue required for TempO-Seq, we tested areas as small as 2 mm2 from 5 μm thick tissue sections. The sensitivity measures are given in the format of tissue area/thickness because extraction of RNA from such small samples proved to be extremely technically difficult (with purification losses being prohibitive to analysis). Extractions performed on much larger tissue amounts (whole sections and higher) could not be extrapolated downwards in a meaningful manner to give a valid comparison to using lysate from a 2 mm2 area, because the percentage of RNA lost from smaller amounts of FFPE is much greater than from larger amounts of FFPE. The lysis buffer volume was scaled accordingly, so for this input, the amount of tissue in the 2 μL volume that is transferred into the assay was the same as for larger tissue excisions.
We excised both 2 mm2 and 10 mm2 areas from 5 μm thick mouse liver sections. The correlation of gene expression across biological replicates of the same area excision was similar for 2 mm2 and 10 mm2 areas, with R2 = 0.969 and 0.95, respectively (Fig 5A and 5B). The correlation between 10 mm2 and 2 mm2 inputs was also very good, with R2 of 0.969 (Fig 5C, average of three samples of each input). These data indicate that the TempO-Seq FFPE assay is highly sensitive and can handle very low input amounts.
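The scaling logic can be written as a one-line proportion. The sketch below assumes a target of 1 mm2 of tissue per 2 μL of lysate, a figure inferred from the 10 mm2 experiments (10% of the lysate, equivalent to 1 mm2) combined with the 2 μL assay input; the actual buffer volumes used are not stated in the text, and the function name is hypothetical.

```python
def lysis_volume_ul(area_mm2, area_per_input_mm2=1.0, input_volume_ul=2.0):
    """Lysate volume that keeps tissue per assay input constant.

    area_per_input_mm2 is the tissue area represented in one assay input of
    input_volume_ul microliters (an assumed target, see the lead-in).
    """
    if area_mm2 <= 0 or area_per_input_mm2 <= 0 or input_volume_ul <= 0:
        raise ValueError("all arguments must be positive")
    return area_mm2 / area_per_input_mm2 * input_volume_ul


# Under these assumptions a 10 mm2 scrape is lysed in five times the
# volume used for a 2 mm2 scrape, so each 2 uL aliquot holds equal tissue.
volume_large = lysis_volume_ul(10)
volume_small = lysis_volume_ul(2)
```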

Archival tissue
Fixation and paraffin-embedding of tissue allows for long term preservation and storage of samples while retaining useful morphological information. The process of fixation, embedding, and extraction can damage RNA, and long-term storage of such samples can make the damage progressively worse, making gene expression analysis difficult [4,14]. However, due to the nature of DO hybridization and ligation chemistry and the short length of RNA sequence that is targeted by each DO pair, TempO-Seq is highly resistant to this type of fragmentation, as well as to the presence of crosslinking.
To determine if storage time of FFPE blocks had a significant effect on performance of the TempO-Seq FFPE assay, we obtained archival human tumor FFPE samples from the University of Arizona Cancer Center Biorepository. Archival tissues, with their indicated year of harvest, were as follows: colorectal cancer (1986), hepatocellular carcinoma (1993), and two separate cases of kidney cancer (1994 and 1988). Blocks had been stored at room temperature, and in early 2016, were cut into 5 μm thick sections. Slides were stored for two years before 25 mm2 areas were scraped for TempO-Seq FFPE processing using the human whole transcriptome panel. The same area was cut from serial sections to produce biological replicates. On average, each sample generated 2.1 M mapped reads, which is sufficient for meaningful data analysis (only a 50 base pair region is sequenced and counted for each gene using TempO-Seq, compared to RNA-Seq in which identification of each gene requires sequencing and counting multiple fragments). We compared gene expression data between biological replicates from the same block as a readout of assay reproducibility. Each of the archival samples had R2 values of greater than 0.8, with the kidney harvested in 1994 having a within-donor biological replicate R2 of 0.925 (Fig 6). Although a controlled study demonstrating consistent gene expression profiles in the same samples over decades of storage is not feasible, these results demonstrate that the TempO-Seq FFPE assay can produce robust data from FFPE samples that are more than 30 years old.

Fixation time
The amount of time tissue is exposed to fixative correlates with tissue autolysis and damage caused by endogenous endonucleases. Furthermore, total time of fixation affects RNA integrity, which directly impacts cDNA synthesis from RNA derived from fixed tissues [3,4]. This factor can significantly confound gene expression analysis which relies on methods dependent on reverse transcription such as microarrays or RT-PCR. In addition, extended fixation time can lead to overfixation, affecting accessibility of RNA [4,17,18,19]. We tested whether fixation time had a notable impact on our assay by harvesting rat liver tissue and incubating in 10% neutral buffered formalin at 4˚C for 24, 96, 192, and 384 hours before embedding. 10 mm2 of tissue was scraped from 5 μm thick sections and used as input for the TempO-Seq FFPE assay using rat whole transcriptome DOs.
There was no negative effect on sequencing quality with additional fixation time beyond 24 hours. Gene expression correlation between biological replicates was high: R2 = 0.96 for 24 hours, 0.93 for 96 hours, 0.95 for 192 hours, and 0.98 for 384 hours (Fig 7). For all fixation times, the observed expression pattern clearly matched that expected for hepatocytes. These data collectively demonstrate that the TempO-Seq assay performs robustly even on samples that have been fixed for extended periods of time.

FFPE samples vs. fresh samples
Since fixation denatures RNA-binding proteins and disrupts secondary structure, TempO-Seq probes may interact with RNA in the context of fixed tissue differently than in fresh tissue, which could affect sensitivity and conclusions drawn from the samples. We compared the TempO-Seq FFPE assay to the standard assay designed for fresh lysates or purified RNA to determine whether processing of FFPE samples may impact biological conclusions. FFPE cell pellets were made from MCF-7 and MDA-MB-231 breast cancer cell lines, derived from luminal A and claudin low subtypes, respectively. 10 mm2 areas were excised from 5 μm thick sections and used as input into the FFPE assay. Cells from the same plate were lysed fresh in lysis buffer and used as input into the standard TempO-Seq assay [7], to minimize variables other than fixation.
We conducted differential gene expression analysis using DESeq2 between the two cell types for both FFPE and fresh assays. Differentially expressed genes were defined as genes with raw counts >20 and an adjusted p-value (padj) <0.05. A total of 4,461 genes were detected as differentially expressed in FFPE samples using these cutoffs, compared to 3,015 in fresh lysates. Thus, sensitivity was not reduced by fixation, as the FFPE assay actually detected more genes with lower levels of noise than the fresh lysate assay.
The log2 fold change between the two sample types showed a strong correlation (R2 = 0.970; Fig 8A). Literature and previous gene expression data for these two cell types agree with genes detected by TempO-Seq as differentially expressed in both sample types [7]. This shows that fixation does not significantly distort the underlying biological data.
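The cutoffs above amount to a simple filter over a differential expression results table. The sketch below assumes each result carries a mean count and an adjusted p-value, with untested genes reported as missing (None); the record layout, gene names, and values are hypothetical and are not taken from the study data.

```python
from typing import NamedTuple, Optional


class DEResult(NamedTuple):
    gene: str
    base_mean: float          # mean count across samples
    log2_fold_change: float
    padj: Optional[float]     # adjusted p-value; None if the gene was not tested


def significant(results, padj_cutoff=0.05, min_depth=20):
    """Keep genes above the count threshold with padj below the cutoff."""
    return [
        r for r in results
        if r.padj is not None and r.padj < padj_cutoff and r.base_mean > min_depth
    ]


# Hypothetical results table.
results = [
    DEResult("GENE_A", 350.0, 2.1, 0.001),   # passes both cutoffs
    DEResult("GENE_B", 12.0, 3.0, 0.0005),   # too few counts
    DEResult("GENE_C", 500.0, 0.2, 0.40),    # not significant
    DEResult("GENE_D", 80.0, -1.5, None),    # not tested
]
hits = [r.gene for r in significant(results)]
```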

Comparison to RNA-Seq
While these data show that the TempO-Seq direct lysis assay of FFPE is reproducible, precise, and sensitive, the question of accuracy remains: how well do these measures reflect biological reality? RNA-Seq is a method which depends on purification and reverse-transcription of RNA (both of which can introduce artifacts), and thus is not a perfect measure of biological reality. However, it has become the gold standard for measuring gene expression changes, and thus represents a sufficiently valid baseline measure for comparison.
We compared MCF-7 vs. MDA-MB-231 FFPE cell pellet results with previously published RNA-Seq data [7] which measured log2FoldChange differences between RNA purified from the same cell types (Fig 8B). These cell lines are well characterized, and the gene expression differences between them are expected and well understood. The data comparison was performed for genes with >20 counts whose expression was determined to be significantly different by DESeq2 (padj <0.05). The agreement of log2FoldChange measures is excellent (Fig 8B) for a cross-platform comparison (R2 = 0.84), especially when considering sample differences (whole cell FFPE lysate vs. purified, reverse transcribed RNA). By comparison, the SEQC study reported a Pearson correlation between RNA-Seq and Affymetrix microarrays of 0.89, which corresponds to an R2 of 0.79 [20]. A correlation of R2 = 0.849 was reported between RNA-Seq and Illumina

Stained tissue
Hematoxylin and eosin (H&E) staining of FFPE sections is a practice commonly used for histopathological interpretation of tissue samples and can be used to identify a wide variety of diagnostically relevant features including cellular organization, nuclear morphology, and lymphocytic invasion [23]. The H&E staining process requires deparaffinization and rehydration of tissue slides before staining. Samples that are rehydrated risk hydrolysis of RNA molecules and exposure to RNases, in addition to RNA degradation that can occur due to relatively high acidity of the staining process. Therefore, current practice is to prepare an H&E stained section and then process an adjacent unstained section using the H&E section as a guide. This is fine so long as there is sufficient material and the whole slide is being processed. However, if only a focal area is of interest because of its histology within the section, then marking slides accurately based on a serial H&E stained section can be problematic, particularly if the area is very small. Therefore, we pursued the possibility of profiling the H&E stained slide itself, so that the area scraped could be directly visualized and documented.
To test whether H&E staining would interfere with TempO-Seq, we used a set of 5 μm thick serial human prostate cancer slides. These slides were either processed directly, deparaffinized and processed, or deparaffinized then H&E stained and processed (Fig 9). RNase-free reagents were used for deparaffinization and staining. The same 5 mm2 area of homogenous tumor was scraped from each serial section and lysed, with 2 mm2 equivalent of the resulting lysate used as input into the human whole transcriptome TempO-Seq FFPE assay. Gene expression profiles between paraffinized and deparaffinized sections had a high correlation (R2 = 0.902), indicating that the method of paraffin removal had little effect on assay performance. The R2 value between deparaffinized and H&E stained was also high (R2 = 0.855), demonstrating that the assay still worked well with RNA exposed to the H&E chemistry. Gene expression signatures from H&E stained tissue also correlated well with unstained sections (R2 = 0.841). Overall, these data demonstrate that H&E stained FFPE tissue can serve as input for the TempO-Seq assay, with the note that we were careful to use RNase free reagents in the H&E staining process. This also further validates the ability of the assay to detect significantly degraded RNA within samples (Fig 9).

Discussion
The processing required for fixation and paraffin embedding of tissues tends to fragment nucleic acids, which presents significant obstacles to molecular analysis. This is particularly the case for RNA and measurement of gene expression. However, FFPE samples also preserve tissue morphology well over long periods of time, and are easy to handle and section, factors that make them extremely useful for pathology and long-term storage. Additionally, after the initial damage induced by the process itself, fixation and embedding provide significant protection from further damage and hydrolysis, without the need for expensive or cumbersome measures such as snap-freezing and keeping samples constantly frozen for years or decades. These advantages have led to accumulation of vast archives of annotated FFPE tissues which have until now been difficult or impossible to profile at the molecular level.
TempO-Seq [4,7-11] is a targeted, ligation-based assay designed to minimize complexities usually associated with gene expression measurements. The targeted approach provides several advantages: because it processes and counts only specific pre-determined probe sequences, it avoids cumbersome bioinformatics (the output of the assay is a simple table of counts for each gene in each sample) and reduces sequencing costs significantly (to 1/10th or less). The obvious downside is that all non-targeted sequences will be invisible (although a probe can be made for almost any target). Additionally, the assay can be performed in any lab, requiring no specialized equipment beyond a thermocycler and access to a sequencing instrument (commercially, or in most university core facilities). Critically for the purpose of FFPE evaluation, TempO-Seq does not rely on RNA extraction and reverse transcription, which makes it relatively insensitive to fragmentation, as the probes can successfully bind to RNA targets that are <100 nt in length.
In this study, we provide data demonstrating the quality and reproducibility (Fig 2 and Table 1) of the TempO-Seq assay of a variety of FFPE tissue samples across three different species (human, mouse, and rat). The gene expression data agree well with existing data from the literature (Tables 2 and 3), and the assay is highly sensitive, producing excellent gene expression readouts from tissue inputs as small as 2 mm2 areas of a 5 μm section (Fig 5). In comparison, most other methods require sacrifice of entire sections (or multiple sections) to obtain sufficient extracted RNA for a single attempt at measurement. This level of sensitivity is critical when dealing with precious and irreplaceable archival samples.
The assay is also insensitive to the time samples are stored (Fig 6), although some caveats apply. Namely, while we demonstrate the ability of the assay to obtain highly precise and reproducible gene expression information from decades-old samples, the data presented here do not address how well biological information is preserved in FFPE over such time periods. Further studies will be needed to confirm the full validity of such datasets, especially when compared to matched frozen tissues. One variable that can be eliminated from consideration is the time of fixation (Fig 7), which does not significantly affect the assay results.
Equivalency between recently fixed and fresh samples was demonstrated in Fig 8A, where the data showed an excellent correlation (R2 = 0.97). This demonstrates that fixation does not produce large data distortions or other immediate problems for further analysis. This is true not only within the TempO-Seq platform but also, as Fig 8B demonstrates, between differential expression measured using RNA-Seq and the TempO-Seq platform, producing a "between platform" R2 = 0.84. This is notable, particularly considering that typical between-platform correlations profile the same sample (e.g. aliquots of the same extracted RNA), whereas in this case we compared the RNA-Seq differential expression of RNA extracted from unfixed cells to the whole transcriptome TempO-Seq FFPE assay of cell pellets after they were fixed, embedded in paraffin, and then sectioned before assay. This combined cross-platform and cross-methodology consistency demonstrates that results from this assay are likely to reflect the true expression profile of any assayed sample.
While the cross-platform consistency is good, additional sources of variation besides the platform differences may contribute to the observed R2 value of 0.84. Two are immediately identifiable. First, the RNA-Seq data were derived using purified RNA from unfixed cells collected two years prior to growing and fixing the cells for our FFPE experiments, which means these are not identical samples. Second, purified RNA from unfixed cells will be different from that from fixed cells, and different in fragmentation state and quality from RNA present in an unpurified lysate of FFPE. Thus, the comparison includes both the variability of cross-platform differences and between-sample differences. A cross-platform comparison between FFPE TempO-Seq and RNA-Seq performed on RNA purified from FFPE would be more closely equivalent. However, our attempts to purify RNA from fixed cell pellets never reached the minimum quality required for good performance in RNA-Seq, leading us to conclude that use of high quality RNA isolated from unfixed cells for RNA-Seq provided a more relevant comparison to TempO-Seq assay of FFPE lysates than would use of low quality RNA isolated from FFPE.
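A between-platform comparison of differential expression, as discussed above, can be sketched as follows: compute a log2 fold change per gene on each platform, keep the genes measured on both, and correlate the two sets of fold changes. This is an illustrative sketch only. The pseudocount, the function names, and the toy counts are assumptions, and the normalization and gene filtering used for Fig 8B are not reproduced here.

```python
import math


def log2_fold_changes(treated, control, pseudocount=1.0):
    """Per-gene log2(treated / control) for genes present in both dicts."""
    shared = treated.keys() & control.keys()
    return {
        gene: math.log2((treated[gene] + pseudocount)
                        / (control[gene] + pseudocount))
        for gene in shared
    }


def cross_platform_r2(lfc_a, lfc_b):
    """Squared Pearson correlation of fold changes over shared genes.

    Returns (r_squared, number_of_shared_genes).
    """
    genes = sorted(lfc_a.keys() & lfc_b.keys())
    n = len(genes)
    if n < 2:
        raise ValueError("need at least two genes measured on both platforms")
    x = [lfc_a[g] for g in genes]
    y = [lfc_b[g] for g in genes]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant fold changes")
    return (sxy * sxy) / (sxx * syy), n


# Toy counts (hypothetical): gene -> count, for two conditions per platform.
tempo_lfc = log2_fold_changes({"A": 7, "B": 1, "C": 3},
                              {"A": 1, "B": 7, "C": 3})
rnaseq_lfc = log2_fold_changes({"A": 15, "B": 3, "C": 7, "D": 9},
                               {"A": 3, "B": 15, "C": 7})
r2, n_genes = cross_platform_r2(tempo_lfc, rnaseq_lfc)
print(f"R2 = {r2:.2f} over {n_genes} shared genes")
```

Comparing fold changes rather than raw counts removes platform-specific scaling from the comparison. It does not remove the between-sample differences described above.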
The observed sensitivity to assay small areas of FFPE (Fig 5) becomes especially valuable when coupled with data showing that TempO-Seq can be performed on H&E stained tissues (Figs 1A and 9), as long as the staining is performed using RNase-free reagents. In practice, this means individual tissue sections can be stained, and the staining used to determine precisely delimited areas of tissue to be profiled (e.g. separating epithelial cells from background, or stromal tissue from glands, etc.). Examples of H&E stained tissue are shown in Fig 1A, where the heterogeneity of the FFPE is evident (upper right panel) as well as the consistency of histology within the small scraped area (lower left and right panels). It is notable that such small areas can be profiled quickly and easily by hand, whereas existing alternatives that require the extraction of RNA are expensive, laborious, and typically require much more tissue (and where small amounts of tissue are assayed, such as with laser capture microdissection, they require specialized, expensive hardware and complex procedures). Thus, if 1 mm2 spatial resolution is sufficient, TempO-Seq provides a high sample throughput, highly repeatable, simple solution for profiling FFPE, even after archiving for a long period of time. This approach should enable investigators to obtain highly histology-specific gene expression data to delineate not only disease states but also the complex interactions between cell types and histologies within a tissue.
Measuring between-donor biological repeatability with the TempO-Seq assay revealed an important observation: the biological variability of normal colorectal tissue is quite low in comparison to variability between cancer samples, demonstrating that biological differences in disease phenotype between patients can be clearly seen above any assay variability. Normal samples clustered tightly and very differently from cancer, showing that the regularly seen "noise" in expression profiles due to biological differences between the normal tissue of donors is easily distinguishable from disease processes inherent to oncogenesis. Thus, using a larger cohort of patients, it should be possible to identify phenotypic subtypes or molecular signatures among colon cancer patients.
The combined sensitivity, robustness, and consistency of expression profiling permit the TempO-Seq FFPE assay to be used over a wide variety of applications which would not otherwise be possible. By enabling the assay of many samples and study designs which were previously very technically difficult (or sometimes outright impossible), we believe that whole transcriptome profiling using the TempO-Seq assay of FFPE samples will lead to significant advancements in many fields of biological science.