Unsupervised Deconvolution of Dynamic Imaging Reveals Intratumor Vascular Heterogeneity and Repopulation Dynamics

With the existence of biologically distinctive malignant cells originated within the same tumor, intratumor functional heterogeneity is present in many cancers and is often manifested by the intermingled vascular compartments with distinct pharmacokinetics. However, intratumor vascular heterogeneity cannot be resolved directly by most in vivo dynamic imaging. We developed multi-tissue compartment modeling (MTCM), a completely unsupervised method of deconvoluting dynamic imaging series from heterogeneous tumors that can improve vascular characterization in many biological contexts. Applying MTCM to dynamic contrast-enhanced magnetic resonance imaging of breast cancers revealed characteristic intratumor vascular heterogeneity and therapeutic responses that were otherwise undetectable. MTCM is readily applicable to other dynamic imaging modalities for studying intratumor functional and phenotypic heterogeneity, together with a variety of foreseeable applications in the clinic.


Introduction
Intratumor genetic or epigenetic heterogeneity has been found in many cancers as evidenced by deep sequencing selectively applied to different parts of the same tumor [1,2]. Consequently, cancer cells display remarkable phenotypic variability, including ability to induce angiogenesis, seed metastases, and survive therapy [3][4][5]. Advanced solid tumors often contain vascular compartments with distinct pharmacokinetics, comprising hypoxic regions and spatially intermingled irregular vasculature that is leaky and inefficient [6][7][8]. The complexity of heterogeneity has clinical implications. A more heterogeneous tumor is more likely to fail therapy due to increased drug-resistant variants [3,5], and characteristics of the dominant cell type will not necessarily predict the behaviors of interest rooted in specific cells [4].
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) provides a noninvasive in vivo method to evaluate tumor vasculature architectures based on contrast accumulation and washout [7,9]. While DCE-MRI can potentially depict the intratumor heterogeneity of vascular permeability [10], the quantitative application of DCE-MRI has been hindered by its inability to accurately resolve vascular compartments with distinct pharmacokinetics due to limited imaging resolution [7,11]. We emphasize that identification of spatially mixed multiple vascular cytotypes is principally different from imaging an inhomogeneously distributed single vascular cytotype, and it is the former scenario that presents significant technological challenges to portraying tumor cytotypes. This indistinction among the contributions of different compartments to the mixed tracer signals can confound compartment modeling and deep phenotyping for association studies [4,12,13]. The goal of the present work was to discern vascular heterogeneity and its changes in tumors using DCE-MRI and novel mathematical models, for personalized cancer diagnosis and treatment.
We developed a computational method (multi-tissue compartment modeling -MTCM) for deconvolving intratumor vascular heterogeneity and identifying pharmacokinetics changes in many biological contexts [5,14,15]. MTCM works by applying a convex analysis of mixtures that enables geometrically-principled delineation of distinct vascular structures from DCE-MRI data (Fig. 1ac). A formal mathematical description of the method and its detailed implementation is available in Methods.

Overview of MTCM
Tumors to be analyzed by MTCM contain unknown numbers of distinct vascular compartments. The pixel-wise tracer concentration in a particular vascular compartment is modeled as being proportional to the local volume transfer constant of the vascular compartment (Method). Because there are often significant numbers of partial-volume pixels, MTCM instead estimates pharmacokinetic parameters (flux rate constants) via the timecourses of pure-volume pixels (pixels whose signal is highly enriched in a particular vascular compartment). Convex analysis of mixtures identifies those pure-volume pixels present at the vertices of the clustered pixel time series scatter simplex, without any knowledge of compartment distribution (Method). When the number of underlying vascular compartments is detected using the minimum description length (MDL) criterion, MTCM provides a completely unsupervised approach to characterize intratumor heterogeneity (Methods and Appendix S1 in File S1).
Modeling the pharmacokinetics of each vascular compartment using pure-volume pixel time-courses allowed us to estimate individual compartment flux rate constants (Fig. 1d-e). Nonnegative least-square estimation yielded pixel-wise local volume transfer constants (Methods and Fig. 1f). Using synthetic and mouse DCE-MRI experiments, we showed that MTCM can be used to estimate pharmacokinetic parameters in several vascular compartments simultaneously and to quantitatively reconstruct tissue-specific local volume transfer constants (Data S1-S2, Figs. 2, S1-S2 and Tables S1-S2). Furthermore, MTCM enabled quantitation of differences in tissue-specific vascular permeability across time (for example, therapeutic responses in longitudinal studies; Methods). Thus, the change in values of flux rate constants in a given vascular compartment could be determined, despite an expected difference in that vascular compartment's relative abundance.
We also analyzed the same realistic synthetic dataset using a ''traditional'' way of principal component analysis -PCA. By a comparison of the tracer concentration extracted by PCA (Fig. 2e) to that estimated by MTCM (Fig. 2d), we can see that tracer concentration curves estimated by PCA are highly fluctuant and significant deviated from the ground truth. In fact, similar unsatisfactory results produced by PCA or classic factor analysis have been observed in the earlier studies by us and others (Cinotti, Bazin et al. 1991, Zhou, Huang et al. 1997, Hillman and Moore 2007, Hillman, Amoozegar et al. 2011. We should clarify that MTCM consists of two major analytic parts: convex analysis of mixtures (CAM) and compartment modeling (CM), where the CAM is a critical step that automatically identifies the pure tissue pixels and their time activity curves, followed by the CM that estimates the pharmacokinetics parameters without being contaminated by the partial-volume effect. In contrast, since PCA does not enforce the nonnegative constraint for tracer concentration estimation, a subsequent compartment modeling cannot be performed to estimate pharmacokinetic parameters.

Intratumor vascular heterogeneity in breast cancer revealed by MTCM
In keeping with our goal to use MTCM to better uncover vascular heterogeneity in human tumors, we applied MTCM to DCE-MRI sequence data obtained from a case of advanced breast cancer (Fig. 1a). In this breast tumor [7], vascular heterogeneity is characterized by active angiogenesis in the peripheral ''rim'' and concurrent inner-core hypoxia. Upon preliminary analysis using MDL, we found that a two-tissue compartment model of the fast and slow tracer clearance rates was sufficient to account for the variable permeability at the majority of pixels (Methods). Thus, we used pure-volume pixels associated with these two vascular pools to estimate tissue-specific flux rate constants and to reconstruct local volume transfer constant maps (Methods). MTCM reveals two vascular compartments with distinct flux rate constants (Fig. 1e). Accordingly, we detected distinct spatial patterns of specific local volume transfer constant in the two vascular compartments (Fig. 1f) with a significant fraction of partial volume pixels.
Intratumor vascular heterogeneity identified by MTCM is consistent with the knowledge obtained from ex vivo microscopic and molecular studies [7,13]. Defective endothelial barrier function is one of the better documented abnormalities of tumor vessels, resulting in functional heterogeneity in vascular permeability to macromolecules [7,11]. As a tumor rapidly outgrows its blood supply, it requires neovessel maturation, often leaving an inner core of the tumor with regions where blood flow and oxygen concentration are significantly lower than in normal tissues [6]. MTCM reconstructed local volume transfer constant maps correlate well with the differential gene expression known to regulate angiogenesis [7,13].

Changes in intratumor vascular heterogeneity in longitudinal studies
We also detected changes in pharmacokinetic patterns among longitudinal DCE-MRI data from breast cancer acquired before, during, and after treatment (Fig. 3a), quantified as different flux rate constants over time (Methods and Table S3). For example, the two vascular compartment time-activity curves revealed by MTCM in the baseline data are highly distinct (Fig. 3b). We detected significantly higher permeability in a fast-flow pool and slightly lower permeability in a slow-flow pool when compared with the normal state. In contrast, the interim response ( Fig. 3c) exhibits vascular compartment time-activity curves that are distinct but much closer to each other, whereas the closing response (Fig. 3d) shows a significant decrease in permeability of the fast-flow pool. We also detected different local volume transfer constant maps ( Fig. 3b-d) and changes in the fractions of partial-volume pixels (Table S4).

Comparative studies using standard compartment modeling
We compared tissue-specific pharmacokinetics detected with MTCM to the results of a standard compartment analysis of (total) vascular pool within the region of interest. Total time-activity curves were indistinct across time (Fig. S2) owing to therapeutic effects in some parts of the tumor but not in others and large fractions of partial-volume pixels. In this longitudinal study, we deconvolved total time-activity curves into two phased therapeutic effects using MTCM: a transient ''normalization'' of abnormal yet surviving tumor vasculature together with the significant and consistent drop in the relative volume transfer constants [6,16]. In contrast, standard analysis may not return informative results when both the flux rate constant and volume transfer constant change heterogeneously in response to therapy. These examples illustrate the ability of MTCM to discover intratumor vascular heterogeneity and to detect changes in each vascular compartment over time. Finally, we tested the applicability of MTCM to dynamic fluorescence imaging data acquired on a mouse after bolus injection of indocyanine green dye by deconvolving biodistribution dynamics of the major organs [17] (Fig. S3). The dissected tissue compartments constitute anatomical structures of the mouse that agree well with a digital anatomical mouse atlas.

Discussion
Several previous studies have discussed the problem of intratumor vascular heterogeneity in compartment modelling [7,11,16,18], a major outstanding issue for the characterization of complex phenotypes and therapeutic responses. Some methods have addressed the estimation of multi-compartment pharmacokinetics in the presence of varying partial-volume effects, relying on known regions of pure-volume pixels and number of compartments [10,13,16,17]. The significant advantage of our strategy is its ability to detect and quantify intratumor vascular heterogeneity without any type of external information. The benefits of such a method include its wide applicability, sensitive detection of heterogeneity dynamics, and reliance on longitudinal data from one single subject (Appendix S1 in File S1).
We have identified differential and heterogeneous changes in tissue-specific vascular pharmacokinetics in tumors during treatment that were undetected using standard analysis, including tumor islands of persistent enhancement that have escaped the effects of therapy [18]. These results are particularly intriguing when considered together with recent imaging studies describing foci of resistant and more aggressive clones within a tumor [5,13]. While it is not yet possible to assign causality, these in vivo results allowed us to propose new hypotheses regarding the complex relationships between intratumor heterogeneity, clonal repopulation, cancer stem-cell, and therapeutic efficacy [1,3,5,10,19].
In metastatic disease, recent studies have revealed the emergence of treatment-resistant subclones that were present at a minor frequency in the primary tumour [20]. Thus, modeling cancer diagnosis and treatment in the future should involve characterization of subpopulations within the primary tumour, monitoring of clonal dynamics during treatment and eradication of treatmentemergent clones [21]. To prospectively assess intratumor heterogeneity, profiling of multiregional tumour samples would be required. However, it is impractical and potentially risky to take multiple 'random' biopsies in every patient, owing to both sampling bias and the inability to resolve intermingled heterogeneity [22]. MTCM would not only make longitudinal in vivo surveillance possible but also enable imaging-informed selective biopsies.
The future challenges of applying MTCM lie in the gap between research experiments and clinical practice. Unlike highquality data in well-designed research studies, clinical data are usually with limited spatial and/or temporal resolution, accompanied by higher noise level (Fig. S4). Lower spatial resolution results in less pure-volume pixels and thus reduces the accuracy of MTCM; while limited temporal resolution prevents accurate differentiation and estimation of pharmacokinetic parameters associated with distinctive vascular compartments.
So far we have tested MTCM method on DCE-MRI data [7], dynamic contrast-enhanced optical imaging data [17,23], and dynamic PET imaging data [24], acquired from both human tissue/organ and whole-body mouse model (e.g., Fig. S4). Theoretically, the MTCM method can produce confident estimation on any 'dynamic contrast-enhanced' imaging data with sufficient quality (e.g., spatial and temporal resolution) [25,26]. However, we should emphasize that there are a few fundamental assumptions behind the MTCM methodology, as specified in the newly proved theorems (e.g., linear convex combination, existence of pure-tissue pixels). As in most medical imaging analysis, object motion constitutes a major source of error and can significantly confound the modeling results. Currently, MTCM is limited to 'parallel' compartment models, while the CAM part of the MTCM algorithm is applicable to resolving partial-volume contamination problem independent of the compartment models being used for subsequent parameter estimation.

Multi-tissue compartment modeling of DCE-MRI series
Let us consider J-tissue compartment model of DCE-MRI series (the Jth tissue compartment corresponds to tracer plasma input, indexed by p), whose tracer concentration kinetics are governed by a set of first-order differential equations (Fig. 1c) [27,28] dC 1 (t) dt~K where C j (t) is the tracer concentrations in the interstitial space weighted by the fractional interstitial volume in tissue-type j at time t for j = 1,…, J, where J is the total number of vascular compartments;C p (t) is the tracer concentration in plasma (tracer input function); C measured (t) is the measured tracer concentration; K trans j is the unidirectional volume transfer constant (/min)from plasma to tissue-type j; k ep,j is the flux rate constants (/min) in tissue-type j; and K p is the plasma volume [28].
where C measured (i,t l ) is the tracer concentration at time t l at pixel i, L is the number of sampled time points, K trans 1 (i),:::,K trans J{1 (i) are the local volume transfer constants of the tissue-types 1 to (J-1), at pixel i, respectively; and K p (i) is the local plasma volume at pixel i. Parallelism between multi-tissue compartment modeling and the theory of convex sets Apply a sum-based normalization to pixel time-course C measured (i,t l ) and using vector-matrix notation, we can re-express (2) as where K trans j (i) is accordingly normalized over j, C measured (i) and C j are the vector notations (over time) of pixel time course C measured (i,t l ) and compartment time course C j (t l ), respectively. Since K trans j (i) is always non-negative, as a non-negative linear combination of C J~Cj È É , the set of pixel time-course C measured~Cmeasured (i) f gforms a subset of the convex set readily defined by the set of C j È É MTCM exploits the strong parallelism between the multicompartment model (3) and the theory of convex set. The fundamental principle is that latent compartments are pure vasculatures in a tumor whose pharmacokinetics C j È É reside at the extremities of the scatter simplex occupied by the pixel timecourses, and accordingly, the interior of the simplex is occupied by the partial-volume pixels (linear non-negative mixtures of compartments) (Fig. 1b). Estimates of compartment pharmacokinetics may then be derived from the vertices of the multifaceted simplex that most tightly encloses the pixel time-courses and has the same number of compartments as vertices (Fig. 1d) [30]. MTCM algorithm is supported theoretically by a well-grounded mathematical frameworkas summarised below (see formal proofs in Appendix S2 in File S1).

Theorem 1 (Convexity of pixel time-course)
Suppose that the J compartment pharmacokinetics C j È É are linearly independent, and C measured (i)~P J j~1 K trans j (i)C j where local volume transfer constants K trans j (i) n o are non-negative and have at least one pixel whose signal is highly and exclusively enriched in a particular vascular compartment, then, C measured uniquely specifies a convex set n o , whose vertices are the J compartment pharmacokinetics C j È É .

Theorem 2 (Optimum source dominance)
Suppose that the local volume transfer constants are the J compartment pharmacokinetics C j È É when there is at least one pixel whose signal is highly and exclusively enriched in a particular vascular compartment (Fig. 1b). This means that, in principle, under a noise-free scenario, we can directly estimate C j È É by locating the vertices of H C measured f g (Fig. 1d).

Data preprocessing
First, the tumor area is extracted by masking out the non-tumor tissues surrounding the tumor site [31] (Fig. 1a). Second, the first few image frames, such as the four initial images of DCE-MRI sequences in our experiments, are removed because they correspond to the time prior to sufficient onsite tracer uptake. Third, pixels whose temporal average signal intensity is lower than 5% of the maximum value, or whose temporal dynamic variation is lower than 5% of the maximum value, are eliminated, because these non-informative pixels could have a negative impact on subsequent analyses. Fourth, the pixel time series is normalized over time using a sum-based normalization scheme, focusing the analysis on the ''shape'' of pharmacokinetics rather than on absolute tracer concentration.
It is true that accurate extraction of tumor region is critical to any image-based analysis that is focused on tumor characterization, where non-tumor tissue would constitute a confounding factor. Theoretically, MTCM method can handle well such situation since it is a completely unsupervised approach, relying on the MDL-based model selection. Specifically, since MTCM is specifically designed to work on multiple tissue compartment modeling, when a significant portion of the surrounding healthy tissue is included in the processed 'tumor' area, the healthy tissue will be considered as an additional/individual compartment in Eq.
(1) and Fig. 1c. The MDL-based model selection procedure will statistically determine the number of underlying tissue compartments in the processed area, e.g., whether the contribution of surrounding healthy tissues is significant to be considered as an independent compartment. Though MTCM methodology can accept the processed area extracted by any image segmentation methods, the tumor region in our study can be outlined by a collaborative effort by computer scientists and clinicians (Wang, Adali et al. 1998, Xuan, Adali et al. 2000, Li, Wang et al. 2001). In the case of heavy noise and fuzzy boundary, a consensus approach may be adopted that surveys the results of multiple methods.

Clustering of pixel time-course
To reduce the impact of noise/outlier data points, improve the efficiency of subsequent convex analysis of mixtures, and permit an automated determination of the number of underlying vascular compartments using the minimum description length (MDL) criterion, we aggregated pixel time-courses into representative clusters using a combined affinity propagation and expectationmaximization clustering [32] (Fig. 1b, Appendix S2 in File S1).As an initialization-free and near-global-optimum clustering method, affinity propagation clustering (APC) simultaneously considers all data points as potential exemplars and recursively exchange realvalued messages between data points until a high-quality set of exemplars and corresponding clusters gradually emerges. Let the ''similarity'' s i,m ð Þ indicate how well the mth data point is suited to be the exemplar for ith data point; the ''responsibility'' r i,m ð Þreflects the accumulated evidence for how well-suited the mth data point is to serve as the exemplar for the ith data point; the ''availability'' a i,m ð Þ reflects the accumulated evidence for how appropriate the ith data point chooses mth data point as its exemplar. Then, supposing that there are N data points (e.g., pixels) in total, the responsibilities r i,m ð Þ are computed based on which readily reveals the intratumor vascular heterogeneity.

Synthetic DCE-MRI datasets
We first validated MTCM-generated estimates of tissue-specific vascular pharmacokinetics parameters using a set of realistic synthetic DCE-MRI data with known parameter values. We set J = 3, indexing two tissue compartments and one plasma input. We generated a large number of synthetic DCE-MRI time series by multiplying the customized local volume transfer constant maps Synthetic data were comprised of 50 replicated datasets generated for each of the 12 parameter settings (Data S1). We performed MTCM on all the datasets and compared the estimates of tissue-specific kinetic parameters produced by MTCM with the ground truth, in terms of both biases (accuracy) and variance (reproducibility) of the estimates, measured over 50 replicated datasets. For comparison purposes, we also evaluated the three most relevant methods (Table S1). To determine whether the proposed MDL criterion detects the correct number of underlying tissue compartments, we calculated the MDL values for J~2, 3, :::, 6 and identified the most probable value of J when MDL achieves its minimum value(s) (Fig. 2).
In the second application, we analyzed the real DCE-MRI data of an advanced breast tumor using both MTCM and the classic method. The T1-weighted gadolinium-enhanced (Gd-DTPA) DCE-MRI data set was acquired by three-dimensional scans performed every 30 seconds for a total of 11 minutes after the injection, on a 1.5 Tesla magnet using three-dimensional spoiled gradient-echo sequences (TR,7 msec, TE,1.5 msec, flip angle = 30u, matrix = 1926256, 0.5 averages). Typically, 12-15 slices are obtained and 15-18 time frames are acquired for each case. We visually examined the convexity of projected pixel time-course via the top two convexity-preserved projections where the margin between the ''exterior'' data points and the convex hull is minimized. We observed that two-tissue compartments (a threevertex convex set) were sufficient to describe the observed pixel time-course scatter simplex. While additional compartments can be used to account for outlier vertices, these compartments become difficult to interpret. We analyzed the dataset by setting J~4,5,6 and observed noise-like and biologically implausible pharmacokinetics patterns. The minimum value of MDL confirmed J = 3. The number of clusters M takes values between 12 and 18, determined automatically by the APC algorithm.
Characterizing longitudinal changes of differential vascular pharmacokinetics in treating angiogenic-active breast cancer case Vascular pharmacokinetics parameter values estimated by MTCM reveal longitudinal changes that may serve as the evidence of differential and heterogeneous responses to therapy. We analyzed the data sets arising from a longitudinal study of tumor response to anti-angiogenic therapy using similar imaging protocols (Data S3). Three sets of DCE-MRI data were acquired during standard treatment, each three months apart, serving as the potential endpoints in assessing the response to therapy. To detect various yet potentially hidden changes accounting for intratumor heterogeneity, we applied the same MTCM and MDL (as well as the classic method) to the three data sets separately.
Open source multiplatform standalone MTCM Java-R software Java GUI supported MTCM was implemented in both R and MATLAB, and runs on both Microsoft Windows and Linux platforms (http://mloss.org/software/view/437/). MTCM takes input the.mat data files that record the pixel time-course of DCE-MRI data in matrices. Each row corresponds to a time frame and each column corresponds to a pixel. Results of MTCM are provided to the users via a Java-based GUI (Fig. S5).  Figure S5 MTCM software package in R and Java is developed to implement MTCM algorithm, as well as the other algorithms widely used in blind source separation. The user-friendly Java GUI (a) can generate the tissue-specific local volume transfer constants and pharmacokinetic parameters on the right. Two pop-up windows (b) will show the projection of clustered pixels on the simplex, and (c) will display the estimated tissue-specific compartmental time-activity curves. (TIFF)   File S1 Supplementary discussion (appendix S1) and supplementary method (appendix S2). (DOCX)