Rapid Differentiation of Epithelial Cell Types in Forensic Samples Using Intrinsic Fluorescence and Morphological Signatures

Establishing the tissue source of epithelial cells within a biological sample is an important capability for forensic laboratories. In this study we used Imaging Flow Cytometry (IFC) to analyze individual cells recovered from buccal, contact epithelial, and vaginal samples that had been dried between 24 hours and more than eight weeks. Measurements capturing the size, shape, and fluorescent properties of cells were collected in an automated manner and then used to build a multivariate statistical framework for differentiating cells based on tissue type. Results showed that cell populations from the three tissue types could be differentiated using a Discriminant Function plot of IFC measurements. Epidermal cells were distinguished from vaginal and buccal cells with an average classification accuracy of ~94%. Ultimately, cellular measurements such as these, which can be obtained non-destructively, may provide probative information for many types of biological samples and complement results from standard genetic profiling techniques.


Introduction
Characterizing the type of cells present in biological evidence and, therefore, the tissue they originated from within the body, can assist with crime reconstructions and downstream DNA profiling methods. Traditionally, caseworking methods for determining tissue source are based on microchemical and/or enzymatic reactions targeted toward proteins within bodily fluids, which have limited sensitivity and/or specificity. Recently, there has been considerable research into biomolecular markers for tissue identification. These include mRNA transcripts (1), micro-RNAs (2,3), proteomics (4), and DNA methylation patterns (5). Although promising, the specificity of many of these systems is still being investigated and interpretation can require complex bioinformatic workflows.
In contrast, few forensic techniques have utilized morphological or intrinsic biochemical differences to differentiate between cells from different tissues, particularly epithelial cells. This is likely due to the laborious nature of microscopic characterizations or the need for tissuespecific antibody probes which have limited success on dried or compromised samples (6,7).
One potential strategy to address these challenges is the use of Imaging Flow Cytometry (IFC).
In IFC, conventional flow cytometry analysis whereby the optical properties of individual cells are interrogated with lasers at set wavelengths, is combined with fluorescence and bright field imaging of those same cell events. IFC is routinely used in biomedical and clinical research for identification of unusual cell types as well as high resolution surveys of sub-cellular processes (8). The primary advantage of IFC over conventional microscopic analysis is that images of single cells are collected in a high throughput manner (as many as hundreds per second) and at multiple fluorescence channels simultaneously. The resulting multivariate data streams can therefore be used to compare profiles between individual cells or between larger populations. For forensic applications, another potential advantage of IFC is that it is an inherently nondestructive technique, with the possibility of collecting all cells after analysis for DNA profiling or other biological characterizations.
In this study we tested whether IFC could be used to differentiate epithelial cells from three separate tissue sources: buccal, touch epidermal, and vaginal. Identifying the presence of one or more of these cell types in a biological sample when combined with DNA profiling results may be useful when investigating claims of sexual assault, particularly without ejaculation.
Additionally, because DNA yield has been observed to systematically vary between epidermal cells and other types of epithelial tissue (9), determining the presence and relative quantities of each cell type can help direct downstream DNA profiling efforts. We conducted an initial IFC survey of each epithelial cell type by analyzing existing biological specimens from a forensic sample repository consisting of ten donors per cell type. To assess the robustness and consistency of IFC signatures against samples approximating those that would be encountered during forensic casework, each of the three cell types were collected from multiple donors and were "aged" for different amounts of time prior to analysis, ranging from ~24 hours to more than eight weeks.

Sample Collection and Preparation
Buccal samples were obtained pursuant to VCU-IRB approved protocol ID#HM20000454_CR3. Ten volunteers were asked to swab the inside of cheek for 30 seconds.
For dried samples, swabs were left to dry for 1 hour to 6 days. Dried and fresh swabs were processed in the same manner. Contact samples were obtained pursuant to VCU-IRB approved protocol ID# HM20000454_CR3. Ten volunteers (six of whom were buccal cell donors) were asked to hold/rub a conical tube (P/N 229421; Celltreat Scientific) for five minutes to deposit cells. For dried samples, tubes were left out for 24 hours to 5 days to dry before collecting cells.
Cells were collected from the surface with 1 sterile, pre-wetted swab, and 1 sterile, dry swab.
Ten vaginal samples were obtained pursuant to VCU-IRB approved protocol ID#HM20002931_Ame2. Volunteers were asked to swab the inside of the vaginal cavity, and swabs were dried and stored at room temperature until analysis. Storage times ranged from 48 hours to approximately eight weeks. All collection swabs were eluted in 1 mL of 1x Cell Staining Buffer (P/N 420201; Biolegend), and gently vortexed for 10 seconds. Samples were centrifuged at 1500 × g at 4°C for 5 minutes. The supernatant was discarded, and the cell pellets were dissolved in 100 uL of 1x Cell Staining Buffer for imaging flow cytometry.
A list of all donor samples used in this study and their respective drying times are provided in Table S1.
Magnification was set at 40x and autofocus was enabled so that the focus varied with cell size.
Examples of cell images collected across multiple wavelengths are provided in Figure S1. Raw image files (.rif) were then imported into IDEAS® Software (EMD Millipore). Display Width and Display Height were changed to 120x120 pixels for each image. The 'Shape Change Wizard' option in Ideas was used to gate focused cells on a Gradient RMS_M04Ch04 x Normalized Frequency histogram. Once focused cells were selected, single cells were gated on an Area_M04 x Aspect Ratio_M04 scatterplot. Data for individual cell events were collected for 17 different features: area, aspect ratio, aspect ratio intensity, contrast, intensity, mean pixel, median pixel, max pixel, raw max pixel, raw min pixel, length, width, height, brightness detail intensity ('R3' pixel increment), raw centroid X, raw centroid Y, and circularity. These feature measurements were collected across multiple detector channels (i.e., fluorescence and brightfield wavelengths) with the exception of measurements that could only be determined from brightfield images such as centroid X/Y and circularity. This yielded a total of 88 measurements/variables collected for each cell. Feature data for individual cells was then exported into a Microsoft Excel worksheet. Cell yield varied across each of the study samples but did not appear to be correlated IFC measurement values were then imported into SPSS v23 (IBM, Inc. Chicago, IL).
Differences in mean values between the three cell types were tested using a one-way ANOVA analysis with a Tukey HSD post-hoc test performed in SPSS. Next, multivariate differences among the three cell type groups were analyzed using a Discriminant Function Analysis (DFA) based on the within-group covariance matrix. We initially compared results from direct analysis of IFC measurements and those obtained from transforming the data first into principal components (PCs) and then conducting DFA on the PC scores. We found that the latter approach led to less differentiation in the canonical variate plot and poorer classification accuracy and thus used direct analysis of raw measurements. Different combinations of variables were initially tested based on their impact on group separation in the canonical variate plot and classification accuracy. Inclusion of all variables in the analysis resulted in the greatest degree of separation in the canonical variate plot and the highest rate of accurate classifications.

Results and Discussion
We first tested whether IFC could be used to distinguish cells from the three different epithelial tissue sources. During sample processing and image collection, we noted some general qualitative differences between images from each of the three cell types. For example, nuclei were routinely observed in brightfield images of buccal cells and vaginal cells (e.g., Images 1507, 1796, respectively, Figure 1), while they were rarely observed in contact epithelial cell images. Buccal and vaginal cells were generally larger in size, >40 µm compared to contact cells, which were ~20-50 µm although we noted some size overlap between cell sources. This  (Table S2). Of note were differences in means for circularity (7.8 contact, 4.1 buccal, 4.3 vaginal), intensity (e.g., in 430-505nm channel 3x10 5 RFU contact, 6x10 4 RFU buccal, 5x10 4 RFU vaginal), and brightness detail (e.g., in 403-505 nm channel 1x10 4 RFU contact, 9x10 3 buccal, 7x10 3 RFU vaginal). However, the range of values for each cell group showed a high degree of overlap across the three cell types (Boxplots in Figure S2). Similarly, most variables showed large standard deviations for each cell type, with coefficients of variation for individual measurements ranging from ~20% to more than 280%.
In order to determine whether the observed variation in IFC measurements could be used  Table 1. In general, epidermal cells showed the highest classification accuracy (89%) with six of the ten donor cell populations having accuracies over 90%. Only one cell population, P22, was below 80%. Buccal and vaginal cell populations yielded lower classification rates, 77% and 75% respectively (Table 1). Interestingly, classification rates were highly variable across individual cell populations for these two groups, with buccal cells ranging between 55% and 90% and vaginal cells ranging between 29% and 97%.
In an attempt to improve the classification accuracy for each cell type, individual cell populations were also tested with two-group classification schemes where one tissue group was excluded completely from the analysis, i.e.,

buccal cells against epidermal cells; vaginal cells against epidermal cells; and buccal cells against vaginal cells (Tables 2-4 respectively).
Simplified classification schemes could be run subsequent to the original classification to help identify samples assigned to one of the closely related sample groups, i.e., a cell image classified as a buccal cell in the three group DFA could then be run against a two group DFA containing only buccal and vaginal cells. Tiered or successive DFA analyses have been described for other types of forensic samples (10,14). Additionally, two group comparisons could approximate caseworking scenarios in which one of the epithelial cell types could be ruled out a fortiori for an unknown cell population. Results from two-group DFA showed that buccal and epidermal cell populations could be differentiated with the highest accuracy (~94%). The lowest classification rate of individual donor cell populations in this comparison was 83% (P22, Epidermal) with the majority of cell populations exhibiting classification rates of 95% or higher ( Table 1). The vaginal-epidermal cell classifications showed comparable results with an overall classification rate of ~91%. Two individual cell populations in this scheme exhibited markedly lower success rates, e.g., P22 epidermal 67% and 5005 vaginal 55%. However, the remaining cell populations had classification rates >80% with the majority >95% ( Table 2). The poorest differentiation was observed between buccal and vaginal cells with an overall classification accuracy of 79% (Table   3) and multiple donor cell populations below 60% accuracy (e.g., 5034, Buccal; Z32 Buccal; Although no studies to date have explicitly surveyed cellular differences between these three tissue sources using fluorescence signatures, previous work has shown that changes in cellular autofluorescence can be used to differentiate layers of epidermal tissue with different intracellular components (e.g., keratin, tryptophan, FAD) (16,17).
Additionally, the morphological and size differences detected with IFC (e.g., area and circularity measurements) are consistent with histological context of each cell type, i.e., shed epidermal cells hexagonal and ~20-50 µm, while buccal and vaginal cells are typically >40 µm with elongated shapes (18,19).
The overlap between cell sources shown in Figure 2 and misclassifications of individual cells images, could be due to a number of factors. First, some similarities in fluorescence and/or morphological attributes are expected, particularly for buccal and vaginal cells given that both are derived from non-keratinized epithelial tissue. This is consistent with poorer classification rates of buccal/vaginal cells relative to buccal-epidermal and vaginal-epidermal (Table 3 vs.   Tables 1-2 (Tables 1-3), this should be explicitly tested with future studies. Another factor that could be contributing to misclassifications is inter-individual variation. Previous work from our group has shown that autofluorescence signatures in shed epidermal cells can vary between contributors, likely owing to the presence of exogenous materials associated with the cell (20). Cell populations from different contributors of the same tissue type (epidermal or buccal) and drying time (24 or 48 hours, respectively) showed distinct separation in a preliminary DFA (Figure 3). It is also possible that the number of donor cell populations in this study did not adequately capture the full range of morphological and/or fluorescence variation that exists between contributors. Increasing the number of unique donor cell populations in the reference/comparison dataset will likely help to isolate any tissue-specific signatures that are present. Nevertheless, contributor-specific variation in IFC measurements is a potentially promising avenue of future research for this technique, particularly how it might be used for estimating the number of individual cell populations in a biological sample and/or facilitating front-end cell separation in a DNA profiling workflow.

Conclusion
Our goal with this study was to conduct an initial assessment of high-throughput fluorescence and morphological analysis and its potential applications for characterizing epithelial cell types in an unknown biological sample. High-throughput, single cell measurements combined with a multivariate classification framework were used to distinguish epidermal cells from other epithelial cell sources across a range of drying times with an overall high degree of accuracy. Although a range of factors may contribute to morphological or optical properties in any given sample (e.g., individual-specific signatures and degradation time), these results suggest that multivariate approaches may be used to extract tissue-specific signatures from biological samples. Future work should test alternative classification methods such as machine learning algorithms as well as different combinations of cellular measurements to maximize classification accuracy particularly for cell types with similar biochemical and physical properties but different source tissue, and eventually establish cell count thresholds for inferring the presence and/or proportions of one or more cell types in a biological sample. Lastly, we note that the fluorescence and morphological signatures identified here could be detected and analyzed using other microscopy setups and open source software platforms (e.g., (21)) which may facilitate its use in caseworking laboratories.     Epidermal  Z32  20  178  5  88  Epidermal  Y60  5  192  5  95  Epidermal  K36  0  421  12  97  Epidermal  P22  34  222  107  61  Epidermal  S95  3  79  2  84  Epidermal  I66  14  309  13  92  Epidermal  L49  2  348  1  99  Epidermal  R47  51  251  0  83  Epidermal  N08  5  277  21  91  Total  --138  2468  174  89   Vaginal  2368  200  18  160  42  Vaginal  4017  11  13  288  92  Vaginal  1022  46  13  298  84  Vaginal  1028  93  2  37  29  Vaginal  1031  1  9  357  97  Vaginal  5020  40  0  128  76  Vaginal  5021  1  26  110  80  Vaginal  5005  24  36  108  64  Vaginal  4502  14  0  186  93  Vaginal  4504  75  0  227  75  Total  --505  117 1899 75