Non-invasive evaluation of the equine gastrointestinal mucosal transcriptome

Evaluating the health and function of the gastrointestinal (GI) tract can be challenging in all species, but is especially difficult in horses because of their size and the length of their GI tract. Isolation of mRNA of cells exfoliated from the GI mucosa into feces (i.e., the exfoliome) offers a novel means of non-invasively examining the gene expression profile of the GI mucosa. This approach has been utilized in people with colorectal cancer. Moreover, we have utilized this approach in a murine model of GI inflammation and demonstrated that the exfoliome reflects the tissue transcriptome. The ability of the equine exfoliome to provide non-invasive information regarding the health and function of the GI tract is not known. The objective of this study was to characterize the gene expression profile found in exfoliated intestinal epithelial cells from normal horses and compare the exfoliome data with the tissue mucosal transcriptome. Mucosal samples were collected from standardized locations along the GI tract (i.e., ileum, cecum, right dorsal colon, and rectum) from four healthy horses immediately following euthanasia. Voided feces were also collected. RNA isolation, library preparation, and RNA sequencing were performed on fecal and intestinal mucosal samples. Comparison of gene expression profiles from the tissue and exfoliome revealed a significant positive correlation of gene expression. Moreover, the exfoliome contained reads representing the diverse array of cell types found in the GI mucosa, suggesting that the equine exfoliome serves as a non-invasive means of examining the global gene expression pattern of the equine GI tract.


Introduction
Gastrointestinal (GI) disease is of considerable importance to horses and the horse industry, second only to old age as a cause of death [1]. Two decades ago, the cost of colic to the equine industry was $115 million annually, and with continued growth of the equine industry and increasing costs of health care, the staggering financial burden continues to grow [2]. Despite vast research efforts aimed at identifying preventative and treatment strategies for equine GI diseases, they remain a major cause of morbidity and mortality in the horse. The cause of colic in the horse varies considerably, including simple obstructive lesions, strangulating obstructive lesions, and inflammatory conditions. The pathophysiology of these conditions is often poorly understood, resulting in a decreased ability to manage and prevent disease. An important limitation to understanding the pathogenesis of GI disease and assessing GI health is the lack of non-invasive tools to assess cellular and molecular GI function. Magnetic resonance imaging (MRI) and computed tomography (CT) are frequently utilized in assessing the GI tract in human and small animal medicine, but animal size precludes use of these imaging modalities in horses. Abdominal ultrasonography is widely utilized to examine the equine GI tract and has greatly advanced our ability to accurately diagnose intestinal diseases. Sonographic assessment of the GI tract is limited by the acoustics of the gas-filled intestine [3]. Importantly, irrespective of species, imaging alone does not provide information regarding function of the GI tract at the cellular or molecular level. Currently, intestinal mucosal biopsy is the only available means to provide mechanistic and functional data, but several practical limitations to this approach exist. Endoscopic biopsies can only be obtained from the stomach, duodenum, or rectum. While these biopsies can have diagnostic utility, they do not provide a global view of the GI tract [4]. 
Samples from other anatomic sites can be acquired via surgical biopsies obtained through traditional open surgical or laparoscopic approaches. Surgical biopsies, however, have several disadvantages including surgical complications, limited ability to biopsy several anatomic locations, and most importantly, the inability to easily obtain longitudinal (sequential) data from individuals regarding intestinal function and health for the purpose of monitoring response to therapy.
Non-invasive coprological assays have been used commonly in people and other animals to diagnose GI disease [5][6][7]. For example, fecal calprotectin is used to diagnose non-steroidal anti-inflammatory drug (NSAID) enteropathy in people and inflammatory bowel disease (IBD) in dogs [5,8]. Similar markers of intestinal disease have not been well studied or validated in horses. Importantly, these are merely markers of inflammation; they do not provide mechanistic insight into the cause of the inflammation, which would better direct therapeutic interventions. Thus, great clinical and investigative needs exist for the development of non-invasive methods to characterize the health and function of the GI tract to more effectively identify, study, and manage equine intestinal disorders.
A potential strategy to address this limitation is the use of exfoliated intestinal epithelial cells found in feces. Approximately 1/3 of human colonic epithelial cells (up to 10^10 cells in an adult) are exfoliated and shed in the feces daily [9]. A technique to isolate and sequence the mRNA (host transcriptome) from exfoliated intestinal epithelial cells, termed the exfoliome, has been validated in the context of colorectal cancer and neonatal GI development in humans [10][11][12][13][14]. This technique has been utilized in a murine model of NSAID enteropathy, validating its ability to classify animals with GI inflammation [15]. This methodology provides a global view of GI health by assessing the mucosal transcriptome of host cells exfoliated into the GI lumen from the mucosa. To the authors' knowledge, comprehensive evaluation of the transcriptome of the equine GI tract has not been performed. Thus, the objectives of this study were to characterize the gene expression profile found in exfoliated intestinal epithelial cells from normal horses and to compare these data with the tissue mucosal transcriptome from specific locations along the equine GI tract.

Sample population and sample collection
Horses donated to Texas A&M College of Veterinary & Biomedical Sciences for euthanasia for reasons unrelated to GI disease were included in this study. All horses were administered xylazine (AnaSed®; 1.1 mg/kg I.V.) and ketamine (Ketaset®; 2.2 mg/kg I.V.) prior to euthanasia with potassium chloride (960 mEq I.V.). Feces were collected via rectal palpation immediately following euthanasia, homogenized in RNA Shield® (Zymo Research, Irvine, CA), and stored at -80°C until processed. A ventral midline incision was performed in routine fashion to gain access to the abdomen. The GI tract was exteriorized and, within 10 minutes of euthanasia, mucosal samples (1 cm x 1 cm) were collected from the ileum, cecum, right dorsal colon, and rectum. RNA isolation, library preparation, and RNA sequencing were performed similarly for fecal and intestinal mucosal samples. This study was approved by the Texas A&M University Institutional Animal Care and Use Committee (IACUC 2016-0301).

RNA isolation and sequencing from exfoliated cells
PolyA+ RNA was isolated from fecal samples as previously described [10,11,14]. Briefly, RNA was extracted using a commercially available kit (Active Motif, Carlsbad, CA), quantified (Nanodrop spectrophotometer; Thermo Fisher Scientific, Waltham, MA), and quality assessed (Bioanalyzer 2100; Agilent Technologies, Santa Clara, CA). Each sample was processed with the NuGen Ovation 3'-DGE kit (San Carlos, CA) to convert RNA into cDNA. Following cDNA fragment repair and purification, Illumina adaptors were ligated onto fragment ends and amplified to create the final library. Libraries were quantified using the NEBNext Library Quant kit for Illumina (NEB, Ipswich, MA) and run on an Agilent DNA High Sensitivity Chip to confirm sizing and the exclusion of adapter dimers.

RNA isolation from tissue
RNA was extracted from the mucosa of the ileum, cecum, right dorsal colon, and rectum using an E.Z.N.A.® Total RNA kit (Omega Bio-tek, Norcross, GA) following the manufacturer's protocol, including on-column DNase treatment. RNA quality was determined using the Nano6000 chip on a Bioanalyzer 2100 (Agilent Technologies). Sequencing libraries were made using 250 ng of RNA and the TruSeq RNA Sample Preparation kit (Illumina) following the manufacturer's protocol.

Data analysis
The datasets utilized for the current study are available via the NCBI BioProject database (accession number PRJNA575706; http://www.ncbi.nlm.nih.gov/bioproject/). Sequencing data were demultiplexed and assessed for quality using FastQC. Reads were aligned to the horse genome (EquCab 3.0) using Spliced Transcripts Alignment to a Reference (STAR) software with default parameters [16]. Differentially expressed genes were determined using edgeR based on the matrix of gene counts [17]. Gene pathway intersections and involvement were analyzed using QIAGEN's Ingenuity Pathway Analysis (IPA, QIAGEN, Redwood City, CA) by uploading gene lists with fold-change and false discovery rate-adjusted p-values. Statistical analysis was performed using R (v. 3.5.3) statistical software, with P < 0.05 considered significant.
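The text above does not say which false discovery rate procedure produced the values uploaded to IPA. Assuming a Benjamini-Hochberg adjustment (an assumption, chosen because it is a common default for RNA-seq gene lists), the calculation can be sketched as follows. The function name and the p-values are illustrative and are not taken from the study.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here only as an illustration; the source does not
    name the FDR method.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in adjusted]
```

With the P < 0.05 threshold stated above, all four hypothetical genes would remain significant after adjustment.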

Sample population
Four horses (3 geldings and 1 mare) were included in the study, with a mean age of 12 years (range, 4 to 25 years). There were 3 Quarter Horses and 1 Warmblood. Reasons for euthanasia included cervical vertebral osteomyelitis (n = 1), equine protozoal myeloencephalitis (n = 1), ocular squamous cell carcinoma (n = 1), and chronic navicular degeneration (n = 1).

RNA quality from exfoliome versus tissue
RNA quantity and quality were assessed via bioanalyzer for both tissue and feces. The mean RNA integrity number (RIN) from tissue was 8.9 (range, 7.5 to 10) and the mean RIN from fecal samples was 6.2 (range, 5.5 to 6.9). A representative bioanalyzer trace and virtual gel from RNA isolated from equine feces, showing the expected peaks of the 18S and 28S components of the eukaryotic ribosome, are shown in S1 Fig. Sequencing quality was assessed by FastQC as previously described [15,18]. Representative traces of per base sequence quality demonstrate excellent quality of tissue reads and high quality of exfoliomic reads, albeit more variable and of slightly poorer quality than tissue reads (Fig 1). As has been shown previously, the number of mapped reads was much smaller from the exfoliome than from the tissue [14,15]. Analysis of the RNA-Seq data revealed that there was a greater loss of reads from the exfoliome as compared with the tissue samples (Table 1). Despite this loss of reads, a similar number of genes was represented by each of the sample types, including the exfoliome (Table 1). The total counts per sample and log (2)

Comparison of tissue and exfoliomic data
Genes present in fewer than 2 samples or represented fewer than 10 times across all samples were removed. The intersections of genes represented in the exfoliome and genes represented in each of the tissue samples were calculated (Fig 2). These data indicate that greater than 94% of the genes present in any tissue sample were also represented in the exfoliome. Next, pathways represented by genes present only in the tissue samples and not present in the exfoliome were examined by uploading these genes into Qiagen® IPA software. The pathways represented by these genes are depicted in Table 2.
Interestingly, 105 genes were present in the exfoliome but not identified in any tissue samples (Fig 2). This gene list was analyzed with Qiagen IPA software to identify which pathways were present in the exfoliome but not the tissue samples. The top pathways identified are shown in Table 3.
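The filtering rule and the gene-set intersections described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: "present" is taken to mean a count greater than zero, and the gene names, sample order, and counts are hypothetical rather than study data.

```python
from typing import Dict, List, Set


def filter_genes(counts: Dict[str, List[int]],
                 min_samples: int = 2,
                 min_total: int = 10) -> Dict[str, List[int]]:
    """Drop genes seen in fewer than min_samples samples or with fewer than
    min_total reads summed across all samples."""
    kept = {}
    for gene, row in counts.items():
        n_detected = sum(1 for c in row if c > 0)
        if n_detected >= min_samples and sum(row) >= min_total:
            kept[gene] = row
    return kept


def detected_genes(counts: Dict[str, List[int]], sample_index: int) -> Set[str]:
    """Genes with at least one read in the given sample column."""
    return {g for g, row in counts.items() if row[sample_index] > 0}


def overlap_summary(tissue: Set[str], exfoliome: Set[str]) -> Dict[str, float]:
    """Sizes of the shared and unique gene sets, plus percent of tissue genes
    that also appear in the exfoliome."""
    shared = tissue & exfoliome
    pct = 100.0 * len(shared) / len(tissue) if tissue else 0.0
    return {
        "shared": len(shared),
        "tissue_only": len(tissue - exfoliome),
        "exfoliome_only": len(exfoliome - tissue),
        "pct_tissue_in_exfoliome": pct,
    }


# Hypothetical counts; columns are [exfoliome, ileum, cecum].
toy_counts = {
    "GENE_A": [12, 40, 35],  # kept, detected everywhere
    "GENE_B": [0, 25, 30],   # kept, tissue only
    "GENE_C": [9, 0, 0],     # removed: detected in only one sample
    "GENE_D": [3, 2, 1],     # removed: fewer than 10 reads in total
    "GENE_E": [15, 0, 7],    # kept, absent from ileum
}
filtered = filter_genes(toy_counts)
summary = overlap_summary(detected_genes(filtered, 1),   # ileum
                          detected_genes(filtered, 0))   # exfoliome
```

The "tissue_only" and "exfoliome_only" sets correspond to the kinds of gene lists that were uploaded for pathway analysis in Tables 2 and 3.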
To further compare the gene expression profiles between the tissue and exfoliome, a principal component analysis (PCA) plot was constructed (Fig 3A). This revealed a visual clustering of exfoliome samples together, suggesting a similar gene expression profile of the exfoliome from the four normal horses examined in this study. This PCA also revealed that the tissue samples clustered together and were separated from the exfoliomic samples. To evaluate correlation between the exfoliome and tissue samples, data were normalized with the edgeR calcNormFactors function using the trimmed mean of M-values (TMM). Scatter plots of log(2)-transformed normalized count data between the exfoliome and each tissue source are shown (Fig 3B). There was a strong and significant correlation among all tissue sites (Spearman's correlation coefficient; ρ > 0.8 and P < 0.0001) and, although of lesser magnitude, a significant correlation between the exfoliome and each tissue source (Spearman's correlation coefficient; ρ > 0.15 and P < 0.0001) (Table 4).
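As a rough illustration of the correlation step, the sketch below computes Spearman's ρ between two samples after a log(2) transform. It substitutes a simple counts-per-million scaling for TMM normalization, which is a deliberate simplification and not the method used in the study, and the counts are hypothetical. Because Spearman's ρ depends only on ranks, any monotonic per-sample transform leaves it unchanged.

```python
import math
from typing import List, Sequence


def log2_cpm(counts: Sequence[int], pseudocount: float = 1.0) -> List[float]:
    """log2 of counts per million plus a pseudocount (a stand-in for TMM)."""
    total = sum(counts)
    return [math.log2(c / total * 1e6 + pseudocount) for c in counts]


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical counts for the same five genes in two samples.
exfoliome_counts = [5, 0, 120, 30, 8]
ileum_counts = [50, 10, 900, 400, 20]
rho = spearman_rho(log2_cpm(exfoliome_counts), log2_cpm(ileum_counts))
```

The P-values reported in Table 4 would require an additional significance test that is not shown here.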

Cell types and anatomic locations represented in the equine exfoliome
It has been previously demonstrated in mice that the exfoliome gene expression signature arises from multiple anatomic locations and provides a global representation of the GI mucosal transcriptome [15]. In order to determine the source of this signature in horses, we extracted the counts of genes previously identified as expressed predominantly in specific anatomic locations (i.e., stomach, small intestine, and colon). Interestingly, we found that the exfoliome contained reads from all major anatomic locations (Fig 4A). As expected, genes representing the colon and small intestine were heavily represented in the transcriptomes arising from those locations, with some overlap. In addition to anatomic origin, we also assessed the cell types represented in the exfoliome. The intestinal epithelium comprises many cell types, including absorptive cells (enterocytes and colonocytes depending on anatomic location), intestinal stem cells, goblet cells, and Paneth cells (small intestine), among others, as well as a host of infiltrating immune cells depending on depth of the sample (i.e., lamina propria) and disease state of the GI tract (e.g., inflammation vs. homeostasis). In order to determine the cell types present in these data, we reviewed the literature for marker genes expressed either solely by a specific cell type or at least highly enriched in a specific cell type [19][20][21][22][23][24][25][26][27][28][29][30][31][32]. In particular, we extracted the numbers of reads in each sample for the following cell types: intestinal stem cells, absorptive cells, transit amplifying cells, Paneth cells, tuft cells, goblet cells, macrophages, lymphocytes, neutrophils, and smooth muscle cells. Interestingly, all cell types were present in all datasets as identified by the presence of at least 2 marker genes per cell type (Fig 4B).
These data suggest that the equine exfoliome represents gene expression signatures from the diverse array of cell types expected to be found in the intestinal mucosa.
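The "at least 2 marker genes per cell type" criterion can be expressed as a simple lookup. In the sketch below, the marker panel is a small illustrative subset chosen for the example and is not the panel assembled from the literature for this study; the read counts are hypothetical, and a marker is treated as detected when its count is greater than zero.

```python
from typing import Dict, List

# Illustrative marker panel (an assumption, not the study's gene list).
MARKERS: Dict[str, List[str]] = {
    "intestinal stem cell": ["LGR5", "OLFM4"],
    "goblet cell": ["MUC2", "TFF3"],
    "tuft cell": ["DCLK1", "POU2F3"],
}


def cell_types_present(sample_counts: Dict[str, int],
                       markers: Dict[str, List[str]],
                       min_markers: int = 2) -> Dict[str, bool]:
    """Flag a cell type as present when at least min_markers of its marker
    genes have one or more reads in the sample."""
    result = {}
    for cell_type, genes in markers.items():
        n_detected = sum(1 for g in genes if sample_counts.get(g, 0) > 0)
        result[cell_type] = n_detected >= min_markers
    return result


# Hypothetical read counts for one sample.
toy_sample = {"LGR5": 3, "OLFM4": 11, "MUC2": 250, "TFF3": 90, "DCLK1": 4}
presence = cell_types_present(toy_sample, MARKERS)
```

In this toy sample the tuft cell entry is False because only one of its two markers has reads, which shows how the threshold guards against calling a cell type from a single gene.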

Discussion
Despite the presence of degradative host and microbial enzymes in the GI lumen, we were able to, for the first time, extract mRNA from exfoliated intestinal epithelial cells that were voided in equine feces. Further, we have demonstrated that the transcriptome of exfoliated cells in horses 1) represents a similar gene expression profile as the GI tissue transcriptome and 2) represents the multiple anatomic regions of the equine GI tract and all major cell types found in the GI mucosa of healthy horses. Given the limitations of assessing the equine GI tract due to the immense size of horses, this non-invasive approach holds great promise for both research and clinical use. Exfoliomics has been used in both people and mice to study GI health, response to disease, and effects of therapeutics [10,11,14,15,33]. There are major physiologic differences between horses and these other species that may have precluded successful use of the approach. For example, the length of the equine GI tract is over 100 feet and GI transit time is up to 48 h in healthy horses [34]. This is vastly different from humans and mice, where the GI tract is much shorter and transit time is faster [35]. Degradation of RNA likely occurs when the duration of time between cells exfoliating and voiding of feces is increased. Protection of nucleic acids through proper sample handling is also critical to prevent RNA degradation. While RNA isolated from exfoliated cells was indeed of lower quality as compared with tissue data, the quality was acceptable and resulted in excellent sequence mapping as compared to human subjects [14].

PLOS ONE
Table 2 note. The -log(p-value) of the overlap represents the p-value of the overlap between the inputted gene list and the canonical pathway that is represented. https://doi.org/10.1371/journal.pone.0229797.t002
Table 3. Canonical pathways enriched by genes found in the equine exfoliome but not found in tissue samples. The -log(p-value) of the overlap represents the p-value of the overlap between the inputted gene list and the canonical pathway that is represented.
Table 4 note. Values represent ρ statistic. All correlations were positive and significant (P < 0.0001) although the magnitude of correlation was smaller between the exfoliome and tissues than between the various tissue sites. https://doi.org/10.1371/journal.pone.0229797.t004

Our protocol selects for eukaryotic RNA by utilizing oligo(dT) primers that bind to the polyA tail of eukaryotic transcripts. In people and mice, this approach primarily selects for host mRNA. The microbiota of horses, however, contains vast numbers of eukaryotic organisms. Specifically, because horses are hindgut fermenters, fermentation is carried out by a host of microorganisms, including protozoa in the cecum [36][37][38][39]. These protozoa are just one of many eukaryotic organisms found in the equine GI tract. Other types include helminths and fungal organisms, both of which may have been present in high numbers in the horses examined in this study. These large numbers of eukaryotic organisms may explain why only 300,000 to 400,000 reads mapped to the equine genome from a starting number of nearly 50,000,000 reads per sample despite selective isolation of eukaryotic mRNA.
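To put the mapping yield quoted above in proportion, the short calculation below converts the read numbers into a fraction of the starting reads. It treats "nearly 50,000,000" as exactly 50,000,000, which is an approximation, and the helper name is hypothetical.

```python
def mapped_fraction(mapped_reads: int, total_reads: int) -> float:
    """Fraction of sequenced reads that mapped to the reference genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads


# Approximate per-sample figures quoted in the text.
TOTAL_READS = 50_000_000
low = mapped_fraction(300_000, TOTAL_READS)
high = mapped_fraction(400_000, TOTAL_READS)
summary_line = f"{low:.1%} to {high:.1%} of reads mapped to the equine genome"
```

On these approximate figures, fewer than 1 in 100 reads per exfoliome sample mapped to the equine genome.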
Despite the small number of reads that mapped to the equine genome relative to the tissue samples, this initial evaluation of the equine exfoliome holds promise. Over 94% of the genes present across all tissue samples were present in the equine exfoliome. There were 624 genes present in 2 or more tissue samples and not present in the exfoliome. Interestingly, 10 of the 13 networks enriched by these genes (Table 2) were from inflammatory and immune signaling pathways. Tissue samples were derived from intestinal mucosa obtained via biopsy. The immune cell-rich lamina propria lies just beneath the mucosa and these genes may have been expressed in cells inadvertently obtained from the deeper layers of the intestinal wall. These same cells and genes were unlikely to be expressed in cells exfoliated into the lumen of the GI tract in these healthy horses with no evidence or history of GI disease. Only 105 genes were found in the exfoliome and not in the tissues. Most likely, these genes originated from cells entering the GI lumen and passing into voided stool with intact RNA. Such cells could be derived, for example, from the respiratory tract or oral cavity. Only 5 canonical pathways were significantly enriched by these transcripts. Interestingly, the gustation pathway associated with taste was the second most enriched pathway, suggesting that these transcripts may indeed have originated from tongue cells that were exfoliated, swallowed, and passed through the GI tract.
Several important limitations of the study should be considered. First, only four horses were included in the study. In addition, there was a great deal of variation in the horses' age (4 to 25 years) and other factors, which may have contributed to some of the variation observed in the exfoliomic signature. Despite this small sample size, we were able to demonstrate that the exfoliated cell transcriptome reflects the tissue-level transcriptome. Another limitation is that these horses had no overt evidence or known history of GI disease; however, it is possible that subclinical or unknown GI disease existed. Importantly, we do not know how concurrent GI disease could affect this technique. Gastrointestinal disease frequently results in inflammation and increased GI transit time. It is unknown if these factors could affect the quality of RNA isolated from exfoliated cells and/or alter biological interpretation, as gene expression from these cells could be altered during passage through the GI tract. An important future step will be to examine the equine exfoliome in the context of both health and disease in order to determine if this approach can be used to discriminate healthy from diseased animals and if this approach can be used to gain temporal insight into the pathophysiology of equine GI diseases. There were unexpected gene expression signatures observed at various tissue sites (e.g., Paneth cell markers observed in large intestinal biopsies) and the exact reasons for this unexpected finding are unknown. One possible explanation is that we extrapolated from human and murine data by using genes thought to be predominantly expressed by specific cell types and anatomic locations. However, these same genes may not be specific for locations or cell types in horses. Finally, we compared the exfoliome to the tissue transcriptome at only four anatomic sites.
Future work to compare the exfoliome with tissue transcriptome of other sites and especially more proximal sites is important as many diseases specifically affect these locations. Despite these limitations, this is the first work to compare the tissue transcriptome and exfoliome in horses.

Conclusions
In summary, we have demonstrated that the exfoliated cell global transcriptome closely mirrors the transcriptome of the mucosa of the ileum, right dorsal colon, cecum, and rectum of horses. While the use of exfoliated cells has been validated in other species [12][13][14], this is the first description of the equine exfoliome and its correlation to the tissue-level transcriptome. Application of this non-invasive technique in early identification or monitoring of GI disease in the horse holds promise, but requires further investigation prior to clinical implementation.