New technologies, such as multiplex immunofluorescence microscopy (mIF), are being developed and used for the assessment and visualization of the tumor immune microenvironment (TIME). These assays produce not only an estimate of the abundance of immune cells in the TIME, but also their spatial locations. However, there are currently few approaches to analyze the spatial context of the TIME. Therefore, we have developed a framework for the spatial analysis of the TIME using Ripley’s K, coupled with a permutation-based framework to estimate and measure the departure from complete spatial randomness (CSR) as a measure of the interactions between immune cells. This approach was then applied to epithelial ovarian cancer (EOC) using mIF collected on intra-tumoral regions of interest (ROIs) and tissue microarrays (TMAs) from 160 high-grade serous ovarian carcinoma patients in the African American Cancer Epidemiology Study (AACES) (94 subjects on TMAs resulting in 263 tissue cores; 93 subjects with 260 ROIs; 27 subjects with both TMA and ROI data). Cox proportional hazard models were constructed to determine the association of abundance and spatial clustering of tumor-infiltrating lymphocytes (CD3+), cytotoxic T-cells (CD8+CD3+), and regulatory T-cells (CD3+FoxP3+) with overall survival. Analysis was done on TMA and ROIs, treating the TMA data as validation of the findings from the ROIs. We found that EOC patients with high abundance and low spatial clustering of tumor-infiltrating lymphocytes and T-cell subsets in their tumors had the best overall survival. Additionally, patients with EOC tumors displaying high co-occurrence of cytotoxic T-cells and regulatory T-cells had the best overall survival. Grouping women with ovarian cancer based on both cell abundance and spatial contexture showed better discrimination for survival than grouping ovarian cancer cases only by cell abundance. 
These findings underscore the prognostic importance of evaluating not only immune cell abundance but also the spatial contexture of the immune cells in the TIME. In conclusion, the application of this spatial analysis framework to the study of the TIME could lead to the identification of immune content and spatial architecture that could aid in the determination of patients that are likely to respond to immunotherapies.
New technologies, such as multiplex immunofluorescence microscopy, are being developed and used for the assessment and visualization of the tumor immune microenvironment (TIME). These assays produce not only an estimate of the abundance of immune cells in the TIME, but also their spatial locations; however, there are currently few approaches to analyze the spatial context of the TIME. Thus, we have developed a framework for the spatial analysis of the TIME and applied this method to the analysis of T-cells collected from patients with high-grade serous ovarian carcinoma in the African American Cancer Epidemiology Study. We found that patients with high abundance and low spatial clustering of tumor-infiltrating lymphocytes and T-cell subsets in their tumors had the best overall survival. Additionally, best survival was observed for patients with tumors displaying high co-occurrence of cytotoxic T-cells and regulatory T-cells. These findings underscore the prognostic importance of evaluating not only immune cell abundance but also the spatial contexture of the immune cells in the ovarian TIME. The use of our framework for spatial analysis of the TIME and immune cell clustering may be applicable in other cancers and provide a novel approach to identification of biomarkers for predicting patient outcomes.
Citation: Wilson C, Soupir AC, Thapa R, Creed J, Nguyen J, Segura CM, et al. (2022) Tumor immune cell clustering and its association with survival in African American women with ovarian cancer. PLoS Comput Biol 18(3): e1009900. https://doi.org/10.1371/journal.pcbi.1009900
Editor: Jean Fan, Johns Hopkins University - Homewood Campus: Johns Hopkins University, UNITED STATES
Received: April 23, 2021; Accepted: February 7, 2022; Published: March 2, 2022
Copyright: © 2022 Wilson et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The R package spatialTIME was developed to implement the statistical analyses outlined in the paper and can be found on CRAN and at https://github.com/FridleyLab/spatialTIME. For details on the R package, see the publication by Creed et al. (2021) in Bioinformatics. The code for the analyses presented in this manuscript can be found at https://github.com/FridleyLab/Permuted_Ripley_K. Questions about the method and R package can be submitted to Fridley.Lab@Moffitt.org or to Brooke.Fridley@moffitt.org. The data used in the survival analysis can also be found at the aforementioned GitHub page.
Funding: AACES is supported by National Institutes of Health (NIH) awards R01CA237318 (JMS) and R01CA142081 (JMS). This research was also supported by a NIH award K99/R00CA218681 (LCP). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Immunology has been a breakthrough area in the treatment of cancer [1,2]. One of the most important findings is the use of agents that block immune checkpoints to activate anti-tumor immunity. Immune checkpoints are the mechanism by which the immune system maintains self-tolerance. That is, immune checkpoints are regulators of the immune system and prevent it from attacking “good” cells. In the case of cancer, the cancerous cells hijack this mechanism to protect themselves from being attacked by the immune system. The use of checkpoint inhibitors in the treatment of cancer has been a revolutionary approach and has resulted in the development of numerous checkpoint inhibitors, such as CTLA-4 and PD-1/PD-L1 inhibitors.
Tumors with a dense infiltration of lymphocytes, also known as tumor-infiltrating lymphocytes (TILs), are consistently associated with more favorable outcomes among cancer patients [5–7]. However, abundance alone may not explain a patient’s clinical outcome, and consideration of the spatial architecture of the tumor immune microenvironment (TIME) may shed new light on clinical outcomes and response to immunotherapies. Lee et al. (2020) showed that diffuse large B cell lymphoma tumors with similar densities of TILs had heterogeneous spatial patterns of cytotoxic T-cells, and a study in colorectal cancer observed that cell-to-cell distances and spatial heterogeneity were more promising as prognostic biomarkers than cell densities. More recently, a study by Steinhart et al. (2021) found that co-localization of tumor-associated macrophages and B cells or CD4 T cells was significantly associated with better survival from ovarian cancer.
Many technologies have been developed to study the TIME. One approach is multiplex immunofluorescence (mIF) microscopy, which provides both a summary of the number of cells positive for a given immune marker (e.g., abundance or density) and the spatial locations of cells. With the spatial locations of the cells positive for the various immune markers, one can determine spatial clustering and co-occurrence of immune cells. mIF can be applied either to regions of interest (ROIs) selected from a stained tumor slide or to tissue microarrays (TMAs). Many challenges arise with the use of data resulting from TMAs. In particular, the tissue area can become folded or ripped due to the “slicing” done for different experiments, leading to imperfections in the shape and in the ability to measure all cells in the area. These imperfections can lead to TMAs with sections that appear to contain no cells, as depicted in Fig 1A. In contrast, ROIs typically do not exhibit this artifact of the data acquisition process (Fig 1B).
As illustrated in the two example figures, TMAs tend to have more “holes” and uneven cell density as compared to ROIs. Examples of clustering patterns observed in the intra-tumoral ROIs; (C) locations of CD3+ cells in which the pattern deviates from complete spatial randomness (CSR); and (D) location of CD3+ cells in which the pattern is close to CSR.
The most common analysis of data from the TIME involves the use of summary measures representing immune content in the entire sample (i.e., proportion of CD3+ cells, density). However, this type of analysis ignores the spatial architecture of the immune cells within the tumor, which can vary between tumors. As illustrated in Fig 1, some tumors show clustering of TILs (CD3+ cells; Fig 1C), while other tumors show more dispersion of TILs (Fig 1D). While there have been studies that attempt to describe the relationship between spatial clustering of immune cells and patient outcomes using such measures as nearest-neighbor distance (NND), Hypothesized Interaction Distribution (HID) [13,14], Morisita-Horn index, and Mander’s correlation coefficient, many approaches fail to account for issues related to: correlation between spatial and abundance measures of immune cells; edge/border effects; and regions in which no cells were able to be measured [17,18]. Thus, we have developed a framework for the spatial analysis of the TIME using Ripley’s K, coupled with a permutation-based framework to estimate and measure the departure from complete spatial randomness (CSR) as a measure of the relationships and interactions between immune cells. We applied this analysis framework to epithelial ovarian cancer (EOC), the deadliest gynecologic malignancy in the U.S. Specifically, we characterized the TIME and explored links between immune cell abundance of T-cells and their spatial characteristics with overall survival (OS) among EOC patients.
2. Methods and materials
2.1. Ethics statement
All participants provided verbal consent at the time of the baseline telephone interview, and written informed consent was obtained for the procurement of tissue specimens and collection of medical records. The parent protocol for the AACES study was approved by the Duke Health Institutional Review Board (IRB number Pro00022451).
2.2. Study population and immunofluorescence assays
To test the proposed framework for assessing spatial clustering within the TIME using Ripley’s K, we used data from the African American Cancer Epidemiology Study (AACES). AACES is a population-based case-control study of 595 African American (AA) women with EOC residing in 11 geographic locations in the U.S. and 752 controls enrolled between December 2010 and August 2016. Cases were identified through cancer registries and hospitals, and were eligible for the study if they were 1) aged 20–79 years, 2) of self-reported AA race, and 3) residing in one of the 11 geographic locations. Study participants completed a telephone survey at baseline, and for ~90% of the cases, formalin-fixed paraffin-embedded (FFPE) tissue blocks of the primary tumor were procured. For 75% of cases with tissue, twenty-five sections were cut from each tissue block, and TMAs were created for the other 25%. A centralized pathology review was completed to confirm diagnosis and histology. A systematic collection of vital status and follow-up data through linkages with cancer registries and the National Death Index (NDI) has been conducted annually. As the distribution of immune cells in the tumor microenvironment and their association with outcomes differ according to histotype, we focused on the most common and one of the deadliest histotypes, high-grade serous carcinoma (HGSC), to limit contributions of disease heterogeneity.
To determine immune profiles of these tumors, mIF staining was completed using the Opal chemistry and multispectral microscopy Vectra system (Akoya Biosciences). Tumors were stained for one panel including seven fluorophore-labeled markers: CD3, CD8, FOXP3, CD11b, CD15, DAPI and pancytokeratin (PanCK). After staining, slides were scanned, and image capture was performed with the Vectra 3.0 Automated Quantitative Pathology Imaging System (Akoya Biosciences); images were exported from InForm (Akoya Biosciences) and loaded into HALO (Indica Labs, New Mexico) for quantitative image analysis. Coordinates of the cell locations are based on pixels of the image, where the image resolution is 0.4977 microns per pixel (Mpp). For the statistical analyses, we focused on tumor-infiltrating lymphocytes (CD3+) and relevant T-cell subsets (CD3+CD8+ cytotoxic T-cells; CD3+FOXP3+ regulatory T-cells). In addition to TMAs, mIF was performed on whole slide images, where three ROIs from the intratumoral region of each tissue section were selected for analysis (e.g., 100% tumor cells by cellularity and PCK expression). A summary of the study participants is presented in S1 Table, where 94 subjects were on TMAs (263 samples) and 93 subjects had ROIs (260 samples), with 27 subjects having both TMA and ROI data. For the 27 patients with both ROIs and TMAs, there are 72 intratumoral ROIs and 75 TMA cores.
2.3. Ripley’s K and complete spatial randomness
In our proposed framework, the locations of the immune positive cells in the TIME can be thought of as a spatial point process. The arrangement of these cells may not follow the assumption of complete spatial randomness (CSR) for homogeneous spatial processes, where positive cells occur at the same rate λ across the entire region. Attraction (clustering), repulsion, and competition (dispersion) are all examples of interactions that would lead to a violation of CSR. Ripley’s K [19] is a popular measure to quantify these interactions by counting the number of neighboring cells within a radius of each positive immune cell, normalizing by the maximum number of pairwise distances, and correcting for border effects. Fig 1D shows an ROI that exhibits a spatial process that is close to CSR, while Fig 1C shows an ROI that violates the CSR assumption.
Ripley’s K is computed at several rings with varying radii, r, and is given by the following formula:

$$\hat{K}(r) = \frac{A}{n(n-1)} \sum_{i=1}^{n} \sum_{j \neq i} w_{ij}\, \mathbf{1}\{d(x_i, x_j) < r\},$$

where $n$ is the number of cells, $A$ represents the area of the region of interest within the TMA or ROI, $w_{ij}$ corresponds to the edge correction, and $\mathbf{1}\{d(x_i, x_j) < r\}$ indicates whether the jth cell is a neighbor of the ith cell (the Euclidean distance between cells i and j is less than r), and where the expected value of $\hat{K}(r)$ under CSR is $\pi r^2$. The edge correction accounts for undercounting of the number of neighboring cells when a cell is on the periphery of the TMA core or ROI. The difference between the observed and expected values (with E(n) denoting the expected number of cells and λ the intensity of the cells) helps determine the degree of regularity or clustering. A positive difference corresponds to a higher degree of clustering than expected, while a negative difference is evidence of a regular pattern [19,24,25]. Ripley’s K can be calculated in a univariate form as described above, or in a bivariate form where co-occurrence of immune cell types is explored, for example, the spatial co-localization of cytotoxic T-cells and regulatory T-cells.
Ideally, these point processes occur in a rectangular or circular region of space; however, the regions where cells appear are not necessarily perfect rectangles or circles, as in the case of TMAs. In such cases, we can still estimate Ripley’s K over the convex hull of the cells measured in the sample. Another challenge is the “edge effect”, where cells on the periphery of the tissue sample lose neighboring cells that are located outside the sampled region. Two common edge corrections are the isotropic and translation corrections [26,27]. Both methods can accommodate settings in which there are a small number of positive cells in the tissue samples. Additionally, little difference was observed in the estimates of Ripley’s K when these two border correction methods were applied to mIF data, with a correlation of approximately 1.0.
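To make the estimator in this section concrete, the following is a minimal Python sketch of univariate Ripley’s K at a single radius. It is not the spatialTIME implementation: it assumes every edge weight equals 1 (no isotropic or translation correction), takes the window area as a user-supplied number, and uses hypothetical function names and toy coordinates.

```python
import math

def ripley_k(points, r, area):
    """Naive univariate Ripley's K at radius r (assumption: all edge weights are 1)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two cells to estimate K")
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Count ordered pairs (i, j), i != j, closer than r.
            if i != j and math.hypot(xi - xj, yi - yj) < r:
                pairs += 1
    # Normalize by the number of ordered pairs and scale by the window area.
    return area * pairs / (n * (n - 1))

def csr_expectation(r):
    """Theoretical value of K(r) under CSR in the plane."""
    return math.pi * r ** 2

# Hypothetical toy sample: four tightly packed cells in a 100 x 100 window.
clustered = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]
k_obs = ripley_k(clustered, r=5.0, area=100.0 * 100.0)
excess = k_obs - csr_expectation(5.0)  # positive values suggest clustering
```

In this toy sample every cell is a neighbor of every other cell, so the observed K far exceeds the CSR value and the difference is positive, which is the clustering signal described above. A real analysis would add an edge correction, since the uncorrected count is biased downward for cells near the border.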
2.4. Permutation-based measure of CSR
Unlike ROIs selected from whole tissue sections, TMAs have many cores with regions that are folded or torn; in these cases, it appears that there are no cells present at various locations. As a result, the intensity function of the observed point process appears not to be constant across the entire region, even though the true underlying process may have been homogeneous. Hence, use of the theoretical expected value of K under CSR would not be appropriate. To overcome this challenge, a permutation approach is used to obtain a robust estimate of the spatial statistic under CSR.
A class of non-parametric methods are those based on permutation or Monte-Carlo methods, where the sampling distribution of interest under the null hypothesis (i.e., CSR) is estimated by randomly assigning the labels and computing the desired statistic. This process is carried out 100 times and the resulting distribution of the statistic is an empirically derived distribution. In the context of mIF studies, each cell is labeled based on whether a certain marker is present or absent. These labels are then randomly permuted across the cells, maintaining the number of cells positive and negative for the marker, and the measure of spatial clustering is computed. This process is repeated a large number of times, resulting in an empirically derived sampling distribution under the null hypothesis (i.e., CSR) for the computed spatial statistic (i.e., Ripley’s K). Using this empirical distribution, the mean is computed and used as the estimate of the spatial statistic under CSR, and the degree of spatial clustering is defined as the difference between the observed Ripley’s K and the mean of the empirical distribution of Ripley’s K under CSR.
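The label-permutation procedure can be sketched in a few lines of Python. This is an illustrative sketch, not the spatialTIME code: the K statistic below omits the edge correction, and the function names, the fixed seed, and the toy grid of cells are all assumptions made for the example.

```python
import math
import random

def k_stat(points, r, area):
    # Naive Ripley's K (assumption: edge weights fixed at 1).
    n = len(points)
    hits = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) < r
    )
    return area * hits / (n * (n - 1))

def permuted_csr_k(cells, labels, r, area, n_perm=100, seed=0):
    """Mean K over random relabelings that keep the number of positive cells fixed."""
    rng = random.Random(seed)
    n_pos = sum(labels)
    values = []
    for _ in range(n_perm):
        # Permuting labels is equivalent to drawing n_pos cells without replacement.
        chosen = rng.sample(range(len(cells)), n_pos)
        values.append(k_stat([cells[i] for i in chosen], r, area))
    return sum(values) / n_perm

def degree_of_clustering(cells, labels, r, area, n_perm=100, seed=0):
    """Observed K of the positive cells minus the permutation-based CSR mean."""
    observed = k_stat([c for c, pos in zip(cells, labels) if pos], r, area)
    return observed - permuted_csr_k(cells, labels, r, area, n_perm, seed)

# Hypothetical sample: 100 cells on a grid, positives packed into one corner.
cells = [(10.0 * x, 10.0 * y) for x in range(10) for y in range(10)]
labels = [x < 2 and y < 2 for x in range(10) for y in range(10)]
csr_mean = permuted_csr_k(cells, labels, r=15.0, area=100.0 * 100.0)
score = degree_of_clustering(cells, labels, r=15.0, area=100.0 * 100.0)
```

Because the null distribution is built from the cells that were actually measured, regions with no measured cells affect the observed and permuted values alike, which is the motivation given above for preferring it to the theoretical value.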
2.5. Association of immune cell spatial clustering and survival from ovarian cancer
To measure the association of the degree of spatial clustering of immune cells with clinical outcome, while also accounting for the abundance level, Cox proportional hazards models were used. Accounting for the abundance level is critical in assessment of the spatial component, as accurate estimation of the spatial measure requires an adequate number of positive cells (i.e., negative relationship between abundance level and spatial clustering value, S1 Fig). That is, it is not possible to estimate spatial measures when there are no positive cells, and the estimate of the level of spatial clustering is somewhat related to the number of positive cells. In computing Ripley’s K, the radius r was set to 30. In order to provide clinically interpretable results, the continuous measurements of abundance and degree of spatial clustering were categorized (e.g., absent/low level/high level of abundance for an immune marker). Typically, categorization takes place for each variable, where values are assigned to groups depending on their relationship to the median, quartiles or some other threshold.
An alternative to setting a pre-defined threshold is use of the “optimal cut-point”, which is selected to maximize the test statistic of interest [29,30]. From a statistical standpoint, the optimal cut-point is “data-snooping”, since we are allowing the results from the statistical test to inform creation of categories of the variable. On the other hand, for discovery and hypothesis generation purposes, it is clinically useful to determine an optimal cut-point that can be validated in other studies with other technologies. For this study, we have chosen to use the optimal cut-point approach based on 10-fold cross-validation to determine thresholds that produce the largest difference in the survival curves. The median of the optimal cuts for abundance and degree of clustering were then used to determine the cut-points for abundance and spatial clustering for the final set of analyses (i.e., 5 groups: no immune cells; low abundance and low spatial clustering; low abundance and high spatial clustering; high abundance and low spatial clustering; and high abundance and high spatial clustering). A challenge in determining the optimal cut-points is that the number of samples in a group/category can get very small when categorizing across multiple variables. Thus, in the analysis to determine the association of immune cell abundance and spatial clustering, optimal cut-points were determined based on a grid search where the search was constrained to possible cut-points in which each group had at least 10 samples (after removing samples with zero abundance). This constrained approach ensured that each group had an adequate size for model stability.
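The five-group coding and the size constraint on the grid search can be illustrated with a short Python sketch. The function names, the strict “greater than” rule at the cut-point, and the toy data are assumptions for illustration; the step that scores each admissible pair of cut-points by the difference in survival curves is deliberately left out.

```python
from collections import Counter

def assign_group(abundance, clustering, abundance_cut, clustering_cut):
    """Map one sample to None / LL / LH / HL / HH (abundance letter first)."""
    if abundance == 0:
        return "None"  # no positive cells, so clustering is not estimable
    a = "H" if abundance > abundance_cut else "L"
    s = "H" if clustering > clustering_cut else "L"
    return a + s

def valid_cut_pairs(samples, abundance_grid, clustering_grid, min_size=10):
    """Keep the cut-point pairs for which all four non-zero groups have >= min_size samples."""
    nonzero = [(a, c) for a, c in samples if a > 0]
    keep = []
    for a_cut in abundance_grid:
        for c_cut in clustering_grid:
            counts = Counter(assign_group(a, c, a_cut, c_cut) for a, c in nonzero)
            if all(counts[g] >= min_size for g in ("LL", "LH", "HL", "HH")):
                keep.append((a_cut, c_cut))
    return keep

# Hypothetical samples: (abundance in percent, degree of clustering).
samples = [(a, c) for a in (1.0, 3.0) for c in (-1.0, 1.0) for _ in range(10)]
samples += [(0.0, None)] * 5  # samples with no positive cells
admissible = valid_cut_pairs(samples, [0.5, 2.0, 4.0], [0.0])
```

Only the abundance cut of 2.0 splits these toy samples into four groups of ten, so it is the only admissible pair; the other two candidates leave two groups empty and are discarded before any survival model would be fit.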
For the co-occurrence analysis involving two types of immune cells with OS, the bivariate version of Ripley’s K was used. As bivariate Ripley’s K is only estimable when both types of immune cells are present in a sample, the level of spatial clustering was only computed when both immune markers were present. Hence, five categories were defined as AAN, APN, PAN, PPL, and PPH, where the first letter corresponds to the (A)bsence or (P)resence of the first cell type, the second letter represents the (A)bsence or (P)resence of the second cell type, and the last letter describes the level of spatial clustering (N)one/(L)ow/(H)igh. The marker was defined as present if at least one cell was positive for the immune marker. Note that the bivariate version of Ripley’s K is not a symmetric statistic. That is, analysis is first completed using the cytotoxic T-cell as the “anchor” or center for the computation of Ripley’s K, followed by the analysis using the regulatory T-cells as the “anchor”.
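As a small illustration of the five co-occurrence categories, the sketch below maps per-sample cell counts and a bivariate clustering value to the labels AAN, APN, PAN, PPL, and PPH. The function name, the default cut-point of 0, and the strict “greater than” rule are assumptions for the example; only the presence rule (at least one positive cell) comes from the text.

```python
def cooccurrence_category(n_first, n_second, clustering=None, clustering_cut=0.0):
    """Return the three-letter category for one sample.

    n_first / n_second: counts of the first and second cell types.
    clustering: bivariate degree of clustering, needed only when both types are present.
    """
    first = "P" if n_first >= 1 else "A"
    second = "P" if n_second >= 1 else "A"
    if first == "P" and second == "P":
        if clustering is None:
            raise ValueError("clustering is required when both cell types are present")
        level = "H" if clustering > clustering_cut else "L"
    else:
        level = "N"  # bivariate K is not estimable without both cell types
    return first + second + level

categories = [
    cooccurrence_category(0, 0),
    cooccurrence_category(0, 5),
    cooccurrence_category(3, 0),
    cooccurrence_category(3, 5, clustering=-1.0),
    cooccurrence_category(3, 5, clustering=2.0),
]
```

Because the bivariate statistic is not symmetric, the clustering value passed in would be computed once per choice of anchor cell type, giving two sets of categories per sample.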
Using these pre-defined categories for the abundance and degree of spatial clustering, association of each cell type and phenotype with overall survival (OS) following EOC was completed using Cox proportional hazard models for CD3+ (T-cells), CD3+CD8+ (cytotoxic T-cells), and CD3+FOXP3+ (regulatory T-cells). The model included clinical covariates of age at diagnosis and stage (I, II, III, IV) and accounted for the repeated measurements per tumor/subject. Analysis was completed separately for the intratumoral ROIs and TMAs. Analyses performed for TMAs were based only on the immune marker data in the tumoral compartment of each core in order to compare findings between the intratumoral ROIs and the TMAs (i.e., restricted analysis to the cells determined to be tumor based on PanCK marker and removed the cells within the stromal compartment). Likelihood ratio tests were used to compare survival models with and without the spatial information. Permutations were performed with the package spatialTIME, and all statistical analyses were completed using RStudio with R v4.1.1.
3. Results
3.1. Quality control and assessment
Prior to statistical analysis of the mIF data, quality assessment of the data was completed. In calling a cell as positive or negative for a marker, a machine learning algorithm within the HALO (Indica Labs, New Mexico) software was used, where a threshold was determined based on the intensity measurements. As the sensitivity and specificity of this method are not 100%, there may be cases where a cell is assigned a phenotype combination that is not possible, such as a cell being classified as both a cytotoxic T-cell and a regulatory T-cell (S2 Fig). For this analysis, between 1 and 87 cells on 134 of the 260 ROI samples, and between 1 and 73 cells on 67 of the 263 TMA cores, were identified as being both a cytotoxic T-cell and a regulatory T-cell. The locations of these cells were retained, but their phenotype was considered to be CD3+ only.
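The relabeling rule for impossible phenotype calls can be written as a small Python function. This is a sketch of the rule as described, with hypothetical names; how cells negative for CD3 are labeled (“other” below) is an assumption, since the text only addresses CD3+ cells.

```python
def resolve_phenotype(cd3, cd8, foxp3):
    """Return the phenotype label used downstream for one cell."""
    if not cd3:
        return "other"  # assumption: non-CD3+ cells are outside the T-cell analysis
    if cd8 and foxp3:
        # Called as both cytotoxic and regulatory: keep the cell, label it CD3+ only.
        return "CD3+"
    if cd8:
        return "CD3+CD8+"
    if foxp3:
        return "CD3+FOXP3+"
    return "CD3+"

# Hypothetical cells: (x, y, CD3, CD8, FOXP3) marker calls.
cells = [
    (12.0, 40.5, True, True, False),
    (13.5, 41.0, True, True, True),   # impossible double call
    (80.2, 15.3, True, False, True),
    (55.0, 60.0, False, False, False),
]
labeled = [(x, y, resolve_phenotype(cd3, cd8, foxp3)) for x, y, cd3, cd8, foxp3 in cells]
n_conflicts = sum(1 for _, _, cd3, cd8, foxp3 in cells if cd3 and cd8 and foxp3)
```

The coordinates are carried through unchanged, so a conflicting cell still contributes to the CD3+ point pattern while being excluded from both T-cell subsets.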
3.2. Estimate of degree of spatial clustering and CSR in TMAs and ROIs
Following quality control, the levels of spatial clustering were estimated using the permutation-based value for CSR, where the degree of spatial clustering was computed as the difference in the observed Ripley’s K and the mean of the empirical distribution of Ripley’s K under CSR. Fig 2 shows three TMA cores and the empirical distribution of Ripley’s K under CSR for marker CD3. The first row (A) corresponds to a TMA core that does not have large areas where cells are absent, the second row (B) displays a TMA core with a moderate number of missing cells, and the third row (C) shows a TMA core with an extensive level of missing cells. As the level of missing cells increases, the difference between the theoretical and permutation-based estimates for Ripley’s K under CSR increases. Similarly, the histogram of the distance between the permutation-based estimate of CSR and theoretical estimates of CSR is presented in S3 Fig for the ROIs and TMAs. As expected for the ROIs there is not a systematic bias as there is a low level of “missing cell” regions. However, in the case of TMAs there was a systematic bias in which the mean of the difference (Theoretical estimate of CSR–Permuted estimate of CSR) is negative, indicating underestimation of Ripley’s K under CSR based on the theoretical estimate. This bias would result in a biased estimate of the degree of spatial clustering and incorrect association results with the phenotype of interest. In contrast, the empirical estimate of Ripley’s K under CSR would provide a more accurate measure of spatial clustering which accounts for the unevenness in the cell distribution measured in the sample. Lastly, the permutation approach allows for the assessment of spatial clustering within the tumor and/or stroma compartments separately.
The observed value of Ripley’s K is represented by the blue dotted line, the mean of the permutation-based CSR distribution is represented by the vertical black solid line, and the theoretical CSR value computed assuming no “holes” in the tissue sample image is represented by the vertical red line. As the level of missing cells increases (i.e., more “holes”), there is a larger difference between the estimate of CSR based on the theoretical computation using area and the estimate of CSR based on the mean of the empirically derived distribution.
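The direction of this bias can be reproduced with a small simulation. The Python sketch below is illustrative only: it assumes a unit-square window whose right half is an unmeasured “hole”, uses a K statistic without edge correction, and picks the number of cells, radius, and seed arbitrarily.

```python
import math
import random

def k_stat(points, r, area):
    # Naive Ripley's K with all edge weights set to 1 (an assumption of this sketch).
    n = len(points)
    hits = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) < r
    )
    return area * hits / (n * (n - 1))

rng = random.Random(1)
# Hypothetical core: all 200 measured cells fall in the left half of the window.
cells = [(rng.uniform(0.0, 0.5), rng.uniform(0.0, 1.0)) for _ in range(200)]
r, area, n_pos = 0.05, 1.0, 40

# Theoretical CSR value: assumes cells could occur anywhere in the full window.
theoretical = math.pi * r ** 2
# Permutation-based CSR value: mean K over random sets of n_pos measured cells.
permuted = sum(k_stat(rng.sample(cells, n_pos), r, area) for _ in range(100)) / 100
bias = theoretical - permuted  # negative when the theoretical value is too small
```

Since the measured cells occupy only half of the nominal area, randomly labeled cells are closer together than the theoretical value assumes, so the theoretical estimate falls below the permutation-based one, matching the negative mean difference reported for the TMAs.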
3.3. Analysis of spatial clustering using univariate Ripley’s K and ovarian cancer survival
Cox proportional hazard models were fit to assess the association of the abundance and spatial clustering of cells positive for CD3+ (T-cells), CD3+CD8+ (cytotoxic T-cells), and CD3+FOXP3+ (regulatory T-cells), adjusting for age at diagnosis and stage, where the degree of spatial clustering was measured by the difference of the observed estimate of Ripley’s K from the permutation-based estimate of Ripley’s K under CSR. The estimates of abundance and univariate spatial clustering were collapsed into five categorical groups as described in Section 2.5. Table 1 presents the hazard ratio (HR) estimates for the different groups, with the “None” group representing the reference group for the ROIs and the TMAs. Table 1 also presents the p-values (4 df) for the association of the level of spatial clustering and immune cell abundance collectively with survival. Predicted survival curves for the five groups and the three cell types are presented in Fig 3A for ROIs and Fig 3B for TMAs. S4 Fig presents the corresponding cumulative event curves, highlighting the survival benefit observed in patients with high abundance and low spatial clustering of T cell markers. The optimal cut-point for determining high and low abundance (based on 10-fold cross-validation) was around 1–4% for the various cell types. Using these optimal cut-points for abundance (high vs low) and degree of spatial clustering (high vs low), there was significant evidence that EOC patients with high abundance and low spatial clustering (HL group) of CD3+, CD3+CD8+, and CD3+FOXP3+ cells in their tumors had the best OS in both the ROIs and the TMAs; that is, the “HL” group had the smallest hazard ratios (HRs), and these HRs were statistically different from the “None” (reference) group.
Predicted survival curves from Cox proportional hazard models for the CD3+, CD3+CD8+, and CD3+FOXP3+ cells, where the degree of spatial clustering was based on the permutation-based estimate of Ripley’s K under CSR (i.e., observed Ripley’s K–the mean of the empirical distribution of Ripley’s K under CSR); (A) results from intra-tumoral ROIs (93 subjects, 260 samples); (B) results from tumor compartment of TMAs (94 subjects, 263 samples). Models adjusted for age at diagnosis and stage within a repeated measures analysis framework.
When restricting the analysis to those samples with high abundance based on the optimal cut-point, a significant difference was observed in the survival curves between patients with low and high spatial clustering of CD3+ (ROIs: HR = 8.00, 95% CI (1.74, 36.77); TMAs: HR = 5.18, 95% CI (1.88, 14.25)), CD3+CD8+ (ROIs: HR = 19.84, 95% CI (1.80, 219.00); TMAs: HR = 4.50, 95% CI (1.59, 12.77)), and CD3+FOXP3+ (ROIs: HR = 2.61, 95% CI (1.18, 5.81); TMAs: HR = 5.27, 95% CI (1.97, 14.16)) immune cells, with EOC patients with low clustering having the best survival (Table 2, S5 Fig). Lastly, we found that the models including spatial information were significantly better than models with only the abundance level information for CD3+ and CD3+FOXP3+ cells in both the ROI and TMA analyses (ROIs: CD3+ p = 1.1E-4, CD3+FOXP3+ p = 0.003; TMAs: CD3+ p = 3.8E-5, CD3+FOXP3+ p = 9.1E-4), while for CD3+CD8+ cells the improvement was statistically significant only in the ROI samples (ROIs: CD3+CD8+ p = 7.3E-6; TMAs: CD3+CD8+ p = 0.14). In summary, the results observed in the analysis of the ROIs were replicated in the TMA analysis, whereby subjects with high abundance and low spatial clustering of CD3+ and T cell subtypes (CD3+FoxP3+ and CD3+CD8+ cells) had significantly better overall survival.
Degree of spatial clustering based on permutation-based estimate of CSR. Models were adjusted for age at diagnosis and stage. Group None is the reference group. The overall p-value is for the joint association of the five-level categorical variable, based on immune abundance and spatial clustering, with overall survival.
3.4. Analysis of co-occurrence of CD3+CD8+ and CD3+FOXP3+ and ovarian cancer survival
Bivariate analysis involving Ripley’s K was completed to assess the association of co-occurrence of cytotoxic T cells (CD3+CD8+) and regulatory T cells (CD3+FOXP3+) with survival following EOC, treating the CD3+FOXP3+ cell as the reference or “anchor” cell type. Results for the association of the measure of co-location with OS are presented in Fig 4 and Table 3. The results using CD3+CD8+ as the reference cell were similar. Among both the ROIs and TMAs, patients with high co-occurrence of cytotoxic T-cells (CD3+CD8+) and regulatory T-cells (CD3+FOXP3+) had the best survival (ROIs: HR = 0.49, 95% CI (0.28, 0.86); TMAs: HR = 0.42, 95% CI (0.25, 0.71)). In contrast, patients with no cytotoxic or regulatory T-cells had the worst survival. Lastly, we found that the model including spatial information was significantly better than a model with only the abundance level included (ROIs: p = 0.014; TMAs: p = 0.002).
Predicted survival curves from Cox proportional hazard models for the degree of co-occurrence of CD3+FOXP3+ and CD3+CD8+ cells, where the degree of co-occurrence was based on the permutation-based estimate of Ripley’s K under CSR (i.e., observed Ripley’s K–the mean of the empirical distribution of Ripley’s K under CSR); (A) results from intra-tumoral ROIs (93 subjects, 260 samples); (B) results from tumor compartment of TMAs (94 subjects, 263 samples). Models adjusted for age at diagnosis and stage within a repeated measures analysis framework.
Degree of spatial clustering based on permutation-based estimate of CSR. Models were adjusted for age at diagnosis and stage. The no cytotoxic T-cells and no regulatory T-cells (none) group/category is the reference group. The overall p-value is for the joint association of the five-level categorical variable, based on immune abundance and spatial clustering, with overall survival.
4. Discussion and conclusion
In this research, we present a novel permutation-based analysis framework using Ripley’s K (univariate and bivariate) to explore the relationship between the degree of spatial clustering of immune cells and clinical outcome. Applying this framework to the TIME of tumors from African American EOC patients revealed that not only the abundance of CD3+ and CD3+CD8+ immune cells but also the degree of spatial clustering of immune cells within the TIME was associated with overall survival. EOC patients with high abundance and low spatial clustering of tumor-infiltrating lymphocytes (TILs) and T-cell subsets had significantly better overall survival. We also observed that patients with a high degree of co-localization of cytotoxic and regulatory T cells had better overall survival. This statistical framework can also be applied to mIF studies of other cancer types to understand the role of the spatial contexture of immune cells in clinical outcome. Together, these findings underscore the prognostic importance of evaluating not only immune cell abundance but also the spatial contexture of the immune cells in the TIME.
Comparing the theoretically derived value of Ripley’s K under complete spatial randomness (CSR) with the permutation-based estimate showed that the theoretical value can be a biased estimate of the true value of Ripley’s K under CSR, with the bias becoming more pronounced as the level of missing cells or “holes” in the TMA increased. This bias would subsequently affect the downstream association analyses (i.e., yield incorrect hazard ratios, confidence intervals, and p-values). Many of the methods proposed for the spatial analysis of digital pathology data, particularly in the setting of TMAs, such as nearest neighbor distances, do not correct for this “missingness” in cell measurements and are thus prone to incorrect estimation of the degree of clustering/co-localization. An additional strength of the proposed statistical framework is that the degree of clustering can be estimated for an entire TMA or ROI, or restricted to just the tumor or stromal compartment.
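The gap between the theoretical and permutation-based CSR values can be illustrated with a small simulation. The sketch below is a simplified stand-in rather than the analysis code: it uses a naive Ripley’s K estimator without edge correction, assumes that the permutation reassigns the marker-positive label at random among all measured cell positions, and simulates a hypothetical core in which no cells could be measured inside a central region. The function names, the window size, the hole, the number of positive cells, and the radius of 5 units are all illustrative assumptions.

```python
import math
import random


def ripley_k(points, r, area):
    """Naive univariate Ripley's K at radius r (no edge correction)."""
    n = len(points)
    if n < 2:
        return float("nan")  # undefined with fewer than two positive cells
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.hypot(points[i][0] - points[j][0],
                          points[i][1] - points[j][1]) <= r:
                close += 2  # ordered pairs (i, j) and (j, i)
    return area * close / (n * (n - 1))


def permuted_csr_k(all_cells, n_positive, r, area, n_perm=200, seed=0):
    """Mean K when the positive label is reassigned at random among cells."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_perm):
        total += ripley_k(rng.sample(all_cells, n_positive), r, area)
    return total / n_perm


# Toy core (hypothetical): cells scattered uniformly over a 100 x 100 window,
# except that nothing could be measured inside a central 60 x 60 "hole".
rng = random.Random(1)
window_area = 100.0 * 100.0
all_cells = []
while len(all_cells) < 1500:
    x, y = rng.uniform(0, 100), rng.uniform(0, 100)
    if not (20 <= x <= 80 and 20 <= y <= 80):
        all_cells.append((x, y))

r = 5.0
k_theoretical = math.pi * r ** 2  # CSR value that ignores the hole
k_permuted = permuted_csr_k(all_cells, n_positive=50, r=r, area=window_area)
print(round(k_theoretical, 1), round(k_permuted, 1))
```

In this toy setup the permutation-based value comes out clearly larger than πr², because the measured cells occupy only part of the nominal window. A degree of clustering computed as observed K minus πr² would therefore be inflated for such a sample, whereas subtracting the permutation mean accounts for where cells were actually measured.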
However, there are many challenges in completing the spatial analysis of the TIME, with many areas requiring further research. One challenge in using Ripley’s K is the selection of the proximity parameter (i.e., r or radius). Often, the selection of this value is based on prior knowledge or on practical considerations. For the present analysis of ovarian cancer tumors, we chose to use Ripley’s K at r = 30 to measure the level of clustering of immune cells in a small area (or radius). An alternative is to compute Ripley’s K at several values of r and select the value with the greatest difference from CSR [15,33]. However, this would likely lead to a different proximity parameter being used for each image, making the spatial measure not comparable across samples. Another approach would be to treat the estimates of Ripley’s K at the various r values as a function or trajectory and apply functional data analysis (FDA), which allows the entire spatial trajectory to be associated with a phenotype [34–36].
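The comparability problem with a per-image choice of r can be seen in a small example. The sketch below, which is only an illustration under simplifying assumptions, evaluates the degree of clustering (observed K minus the permutation mean) over a grid of radii and picks, for each image, the radius with the largest absolute departure. The estimator has no edge correction, and the function names, the radius grid, and the two toy images are hypothetical.

```python
import math
import random


def k_at_radii(points, radii, area):
    """Naive Ripley's K (no edge correction) evaluated at each radius."""
    n = len(points)
    if n < 2:
        return [float("nan")] * len(radii)
    counts = [0] * len(radii)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            for k, r in enumerate(radii):
                if d <= r:
                    counts[k] += 2  # ordered pairs
    return [area * c / (n * (n - 1)) for c in counts]


def clustering_trajectory(all_cells, positive, radii, area,
                          n_perm=100, seed=0):
    """Observed K minus mean permuted K, one value per radius."""
    rng = random.Random(seed)
    observed = k_at_radii(positive, radii, area)
    perm_mean = [0.0] * len(radii)
    for _ in range(n_perm):
        sample = rng.sample(all_cells, len(positive))
        for k, value in enumerate(k_at_radii(sample, radii, area)):
            perm_mean[k] += value / n_perm
    return [o - m for o, m in zip(observed, perm_mean)]


def radius_of_max_departure(radii, trajectory):
    """Radius at which |observed - permuted| is largest."""
    return max(zip(radii, trajectory), key=lambda pair: abs(pair[1]))[0]


radii = [10.0, 20.0, 30.0]
area = 100.0 * 100.0
# All measured cells: a 20 x 20 grid, 5 units apart (hypothetical).
cells = [(5.0 * col, 5.0 * row) for row in range(20) for col in range(20)]

# Image A: nine positive cells packed into a tight 3 x 3 block.
tight = [(5.0 * col, 5.0 * row) for row in range(3) for col in range(3)]
# Image B: nine positive cells in a looser 3 x 3 pattern, 20 units apart.
loose = [(20.0 * col, 20.0 * row) for row in range(3) for col in range(3)]

traj_a = clustering_trajectory(cells, tight, radii, area)
traj_b = clustering_trajectory(cells, loose, radii, area)
print(radius_of_max_departure(radii, traj_a),
      radius_of_max_departure(radii, traj_b))
```

The two toy images end up with different selected radii, so their clustering values would be measured on different spatial scales. The full lists `traj_a` and `traj_b` are the kind of per-sample trajectory that a functional data analysis would take as input in place of a single value.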
Another challenge that arises when applying Ripley’s K is that many samples may have zero cells that express the marker of interest (i.e., “immune cold tumors”). For these cases, spatial clustering is undefined. To accommodate these cases in the survival analysis, a separate category was defined for samples with zero abundance and therefore no measure of spatial clustering. This challenge was amplified in the bivariate analysis, in which both cell types had to be present in the sample for estimation of spatial co-localization. Additionally, selecting an optimal cut-point is a popular method for categorizing a continuous variable (e.g., percent abundance, density); however, such methods have been shown to inflate the type I error rate [37–40], with the optimal cut-point varying between studies. Thus, we implemented a 10-fold cross-validation approach to determine the cut-points used to categorize the immune cell abundance and the degree of spatial clustering.
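The handling of samples without marker-positive cells can be sketched as a simple grouping rule. The example below is an assumed reading of the five-category variable, not the authors’ code: it places zero-abundance samples in a “None” group and splits the remaining samples into the four high/low combinations of abundance and clustering. The function name, the category labels, and the cut-point values are hypothetical; in the analysis the cut-points were chosen by cross-validation, which is not reproduced here.

```python
def abundance_clustering_group(abundance, clustering,
                               abundance_cut, clustering_cut):
    """Assign a sample to one of five groups (assumed category definitions).

    abundance  : percent of cells positive for the marker in the sample
    clustering : observed K minus permutation mean, or None if undefined
    """
    if abundance == 0:
        # No positive cells: clustering is undefined, so use the "None" group.
        return "None"
    if clustering is None:
        # Not addressed here: positive cells present but clustering missing.
        raise ValueError("clustering is required when abundance > 0")
    a_level = "High" if abundance > abundance_cut else "Low"
    c_level = "High" if clustering > clustering_cut else "Low"
    return f"{a_level} abundance / {c_level} clustering"


# Hypothetical cut-points and samples, for illustration only.
ABUNDANCE_CUT = 2.0    # percent positive cells
CLUSTERING_CUT = 50.0  # observed K minus permuted CSR estimate

samples = [(0.0, None), (1.2, 10.0), (1.2, 80.0), (6.5, 10.0), (6.5, 80.0)]
groups = [
    abundance_clustering_group(a, c, ABUNDANCE_CUT, CLUSTERING_CUT)
    for a, c in samples
]
for (a, c), g in zip(samples, groups):
    print(a, c, "->", g)
```

Coding the groups as a single categorical variable keeps immune cold samples in the survival model, with the “None” group available as the reference level.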
In conclusion, this paper illustrates a permutation-based approach for estimating the degree of spatial clustering when studying the TIME, with application to tumors from African American women with ovarian cancer. This approach addresses the unique challenges in the use of TMAs for studying the TIME, such as regions where cells cannot be measured due to limitations of the sample preparation. The application of this approach also showed that, in African American patients with EOC, not only the abundance but also the level of spatial clustering of T-cell subtypes in the tumor is predictive of survival: patients whose tumors had a low level of clustering had better survival than patients whose tumors had a high level of spatial clustering. We also found that co-occurrence of cytotoxic T cells with regulatory T cells conferred the best overall survival following EOC. The application of this spatial analysis framework to the study of the TIME could lead to the identification of immune content and spatial architecture that could aid in the determination of patients that are likely to respond to immunotherapies. Future research is needed to validate the findings observed in African American women with ovarian cancer in other racial/ethnic groups, along with replication of these findings in other cancer types.
The percent of CD3+ cells and the degree of spatial clustering have an exponentially decaying relationship (A). Plots (B) and (D) (colored red in plot A), and (C) and (E) (colored green in plot A), correspond to two ROIs that have approximately the same percent of CD3+ cells but different levels of spatial clustering.
Scatter plots showing the cytoplasm intensity, which is used to classify cell positivity, for FOXP3 (Opal 540) and CD8 (Opal 570). These four plots show varying degrees of phenotype misclassification and illustrate the challenge of setting univariate or bivariate intensity thresholds for classifying cells in higher-dimensional spaces.
Histograms of the difference between the theoretical estimate and the permuted estimate of CSR for the ROIs (A) and TMA core samples (B) for CD3+, CD3+CD8+ (cytotoxic T-cell), and CD3+FOXP3+ (regulatory T-cell or Treg). The dashed black line represents zero and the solid black line represents the mean of the distribution. To better visualize the distribution, the plots were scaled towards x = 0.
Cumulative event curves from Cox proportional hazard models for the CD3+, CD3+CD8+, and CD3+FOXP3+ cells, where the degree of spatial clustering was based on the permutation-based estimate of Ripley’s K under CSR (i.e., observed Ripley’s K minus the mean of the empirical distribution of Ripley’s K under CSR); (A) results from intra-tumoral ROIs (93 subjects, 260 samples); (B) results from tumor compartment of TMAs (94 subjects, 263 samples). Models adjusted for age at diagnosis and stage within a repeated measures analysis framework.
Predicted survival curves for patients with high abundance, stratified by level of spatial clustering, from Cox proportional hazard models for the CD3+, CD3+CD8+, and CD3+FOXP3+ cells, where the degree of spatial clustering was based on the permutation-based estimate of Ripley’s K under CSR (i.e., observed Ripley’s K minus the mean of the empirical distribution of Ripley’s K under CSR); (A) results from intra-tumoral ROIs; (B) results from tumor compartment of TMAs. Models adjusted for age at diagnosis and stage within a repeated measures analysis framework.
We would like to thank the AACES investigators, Drs. Anthony Alberg, Elisa Bandera, Jill Barnholtz-Sloan, Melissa Bondy, Michele Cote, Ellen Funkhouser, Patricia Moorman, Edward Peters, Ann Schwartz, and Paul Terry, for their contributions to the AACES. We would also like to acknowledge the support of the Advanced Analytical and Digital Pathology Core under the Pathology Department at Moffitt Cancer Center.
- 1. Ribas A, Wolchok JD. Cancer immunotherapy using checkpoint blockade. Science. 2018;359(6382):1350–5. pmid:29567705
- 2. Couzin-Frankel J. Breakthrough of the year 2013. Cancer immunotherapy. Science. 2013;342(6165):1432–3. pmid:24357284
- 3. Pardoll DM. The blockade of immune checkpoints in cancer immunotherapy. Nat Rev Cancer. 2012;12(4):252–64. pmid:22437870
- 4. Havel JJ, Chowell D, Chan TA. The evolving landscape of biomarkers for checkpoint inhibitor immunotherapy. Nat Rev Cancer. 2019;19(3):133–50. pmid:30755690
- 5. Fridman WH, Zitvogel L, Sautes-Fridman C, Kroemer G. The immune contexture in cancer prognosis and treatment. Nat Rev Clin Oncol. 2017. pmid:28741618
- 6. Fridman WH, Pages F, Sautes-Fridman C, Galon J. The immune contexture in human tumours: impact on clinical outcome. Nat Rev Cancer. 2012;12(4):298–306. pmid:22419253
- 7. Gooden MJ, de Bock GH, Leffers N, Daemen T, Nijman HW. The prognostic influence of tumour-infiltrating lymphocytes in cancer: a systematic review with meta-analysis. Br J Cancer. 2011;105(1):93–103. pmid:21629244
- 8. Lee CW, Ren YJ, Marella M, Wang M, Hartke J, Couto SS. Multiplex immunofluorescence staining and image analysis assay for diffuse large B cell lymphoma. J Immunol Methods. 2020;478:112714. pmid:31783023
- 9. Schwen LO, Andersson E, Korski K, Weiss N, Haase S, Gaire F, et al. Data-Driven Discovery of Immune Contexture Biomarkers. Frontiers in oncology. 2018;8:627. pmid:30619761
- 10. Steinhart B, Jordan KR, Bapat J, Post MD, Brubaker LW, Bitler BG, et al. The Spatial Context of Tumor-Infiltrating Immune Cells Associates with Improved Ovarian Cancer Survival. Molecular cancer research: MCR. 2021. pmid:34615692
- 11. Jawhar NMT. Tissue Microarray: A rapidly evolving diagnostic and research tool. Ann Saudi Med. 2009;29(2):123–7. pmid:19318744
- 12. Vayrynen SA, Zhang J, Yuan C, Vayrynen JP, Dias Costa A, Williams H, et al. Composition, Spatial Characteristics, and Prognostic Significance of Myeloid Cell Infiltration in Pancreatic Cancer. Clin Cancer Res. 2021;27(4):1069–81. pmid:33262135
- 13. Rose CJ, Naidoo K, Clay V, Linton K, Radford JA, Byers RJ. A statistical framework for analyzing hypothesized interactions between cells imaged using multispectral microscopy and multiple immunohistochemical markers. J Pathol Inform. 2013;4(Suppl):S4. pmid:23766940
- 14. Tsakiroglou AM, Fergie M, Oguejiofor K, Linton K, Thomson D, Stern PL, et al. Spatial proximity between T and PD-L1 expressing cells as a prognostic biomarker for oropharyngeal squamous cell carcinoma. Br J Cancer. 2020;122(4):539–44. pmid:31806878
- 15. Magurran AE. Biological diversity. Curr Biol. 2005;15(4):R116–8. pmid:15723777
- 16. Dunn KW, Kamocka MM, McDonald JH. A practical guide to evaluating colocalization in biological microscopy. Am J Physiol Cell Physiol. 2011;300(4):C723–42. pmid:21209361
- 17. Kather JN, Suarez-Carmona M, Charoentong P, Weis C-A, Hirsch D, Bankhead P, et al. Topography of cancer-associated immune cells in human solid tumors. Elife. 2018;7:e36967. pmid:30179157
- 18. Schwen LO, Andersson E, Korski K, Weiss N, Haase S, Gaire F, et al. Data-Driven Discovery of Immune Contexture Biomarkers. Front Oncol. 2018;8:627. pmid:30619761
- 19. Ripley BD. Modelling Spatial Patterns. Journal of the Royal Statistical Society Series B (Methodological). 1977;39(2):172–212.
- 20. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA Cancer J Clin. 2020;70(1):7–30. pmid:31912902
- 21. Schildkraut JM, Alberg AJ, Bandera EV, Barnholtz-Sloan J, Bondy M, Cote ML, et al. A multi-center population-based case-control study of ovarian cancer in African-American women: the African American Cancer Epidemiology Study (AACES). BMC cancer. 2014;14:688. pmid:25242549
- 22. Ovarian Tumor Tissue Analysis Consortium, Goode EL, Block MS, Kalli KR, Vierkant RA, Chen W, et al. Dose-Response Association of CD8+ Tumor-Infiltrating Lymphocytes and Survival Time in High-Grade Serous Ovarian Cancer. JAMA Oncol. 2017;3(12):e173290. pmid:29049607
- 23. Peres LC, Cushing-Haugen KL, Kobel M, Harris HR, Berchuck A, Rossing MA, et al. Invasive Epithelial Ovarian Cancer Survival by Histotype and Disease Stage. Journal of the National Cancer Institute. 2019;111(1):60–8. pmid:29718305
- 24. Baddeley A, Rubak E, Turner R. Spatial point patterns: methodology and applications with R. Boca Raton; London; New York: CRC Press, Taylor & Francis Group; 2016. xvii, 810 p.
- 25. Gabriel E. A. Baddeley E. Rubak R. Turner: Spatial Point Patterns: Methodology and Applications with R. Mathematical Geosciences. 2017;49(6):815–7.
- 26. Moore M. Spatial Statistics: Methodological Aspects and Applications. New York, NY: Springer-Verlag; 2001. 282 p.
- 27. Cressie N, Ver Hoef JM. Spatial statistical analysis of environmental and ecological data.404–9.
- 28. Good P. Springer Series in Statistics. 1994.
- 29. Bouwmeester W, Zuithoff NP, Mallett S, Geerlings MI, Vergouwe Y, Steyerberg EW, et al. Reporting and methods in clinical prediction research: a systematic review. PLoS medicine. 2012;9(5):1–12. pmid:22629234
- 30. Mabikwa OV, Greenwood DC, Baxter PD, Fleming SJ. Assessing the reporting of categorised quantitative variables in observational epidemiological studies. BMC health services research. 2017;17(1):201. pmid:28288628
- 31. Altman DG, Lausen B, Sauerbrei W, Schumacher M. Dangers of using "optimal" cutpoints in the evaluation of prognostic factors. Journal of the National Cancer Institute. 1994;86(11):829–35. pmid:8182763
- 32. Creed JH, Wilson CM, Soupir AC, Colin-Leitzinger CM, Kimmel GJ, Ospina OE, et al. spatialTIME and iTIME: R package and Shiny application for visualization and analysis of immunofluorescence data. Bioinformatics. 2021. pmid:34734969
- 33. Kiskowski MA, Hancock JF, Kenworthy AK. On the use of Ripley’s K-function and its derivatives to analyze domain size. Biophysical journal. 2009;97(4):1095–103. pmid:19686657
- 34. Cardot H, Ferraty F, Sarda P. Functional Linear Model. Statistics & Probability Letters. 1999;45(1):11–22.
- 35. Ramsay JO, Dalzell CJ. Some Tools for Functional Data Analysis. Journal of the Royal Statistical Society: Series B (Methodological). 1991;53(3):539–61.
- 36. Ramsay JO, Silverman BW. Applied functional data analysis: methods and case studies. New York: Springer; 2002.
- 37. Hilsenbeck SG, Clark GM. Practical p-value adjustment for optimally selected cutpoints. Statistics in medicine. 1996;15(1):103–12. pmid:8614741
- 38. Lausen B, Schumacher M. Maximally Selected Rank Statistics. Biometrics. 1992;48(1):73–85.
- 39. Lausen B, Schumacher M. Evaluating the effect of optimized cutoff values in the assessment of prognostic factors. Computational Statistics & Data Analysis. 1996;21(3):307–26.
- 40. Miller R, Siegmund D. Maximally Selected Chi Square Statistics. Biometrics. 1982;38(4):1011–6.