Plasticity in the Macromolecular-Scale Causal Networks of Cell Migration

Heterogeneous and dynamic single cell migration behaviours arise from a complex multi-scale signalling network comprising both molecular components and macromolecular modules, among which cell-matrix adhesions and F-actin directly mediate migration. To date, the global wiring architecture characterizing this network remains poorly defined. It is also unclear whether such a wiring pattern may be stable and generalizable to different conditions, or plastic and context dependent. Here, synchronous imaging-based quantification of migration system organization, represented by 87 morphological and dynamic macromolecular module features, and migration system behaviour, i.e., migration speed, facilitated Granger causality analysis. We thereby leveraged natural cellular heterogeneity to begin mapping the directionally specific causal wiring between organizational and behavioural features of the cell migration system. This represents an important advance on commonly used correlative analyses that do not resolve causal directionality. We identified organizational features such as adhesion stability and adhesion F-actin content that, as anticipated, causally influenced cell migration speed. Strikingly, we also found that cell speed can exert causal influence over organizational features, including cell shape and adhesion complex location, thus revealing causality in directions contradictory to previous expectations. Importantly, by comparing unperturbed and signalling-modulated cells, we provide proof-of-principle that causal interaction patterns are in fact plastic and context dependent, rather than stable and generalizable.


Introduction
A key challenge in biology is to understand how information is coordinated globally within cells to generate and control complex cellular processes, such as cell migration. Succinctly, what is the wiring pattern of regulation that governs a particular cell behavior? Importantly, this raises a second fundamental question that we seek to address herein: is the wiring pattern for a particular process stable and generalizable, or, plastic and contextually dependent? The answer to this second question has important implications for our understanding of both complex biological processes and the design of the experimental strategies to address them.
Cell migration is a process of vital importance in numerous physiological and pathological processes including cancer cell metastasis [1]. Cell migration is indeed a highly complex cellular process, arising from a large, self-organizing molecular network to produce behaviors that are dynamic, heterogeneous and adaptable [2]. The dynamism of these behaviors suggests that underlying plasticity in the wiring of the cell migration system might be both more likely and more readily detectable than in relatively constrained cellular phenomena. Thus cell migration provides an appropriate framework within which to assess both the structure and potential plasticity of cellular wiring patterns.
Cell migration is the product of interactions and interdependencies operating across molecular, macromolecular, and cellular scales (see Figure 1). As noted above, a huge diversity of components comprise the network underlying migration at the molecular scale (Figure 1 A) [3][4][5][6][7][8]. Such large-scale molecular networks tend to be arranged into hierarchically nested assemblages or modules [9,10]. These macromolecular modules often represent functional units with distinct roles whose interactions ultimately produce single cell migration behaviors (Figure 1 B and C).
To understand how cell migration behaviors derive from molecular-, macromolecular-and cellular-scale organization, it is desirable to characterize all scales simultaneously and with sufficient spatiotemporal resolution to delineate functional relationships between features at any scale. Live cell fluorescence imaging provides the spatiotemporal resolution to track individual migrating cells while concurrently monitoring features of their molecular-and macromolecular-scale organization. Unfortunately, such imaging does not permit us to directly observe the Figure 1. Rationale for a coarse-grained analysis of causal influence in the cell migration system. Cell migration emerges from biological properties encompassing multiple scales (A-C). At the molecular scale, thousands of distinct components and their interactions produce a complex and modular molecular network comprising the cell migration system (A). At the macromolecular scale, this network gives rise to a variety of functional macromolecular modules (B), which collectively produce single cell migration at the cellular scale (C). Unfortunately, it is not yet possible to synchronously record the state of the molecular network underlying cell migration with spatiotemporal resolution (D). Instead, we coarse-grain (orange arrow from A to B) this molecular complexity by focusing our analysis at the scale of macromolecular modules. Specifically, we focus on CMACs and the F-actin cytoskeleton (green ovals in B) because these core modules directly mediate the process of cell migration. Their observable features exemplify the state of both: i) their own molecular components (green box in A), and; ii) extrinsic sources of regulation distributed throughout the broader molecular network from which they integrate information (black arrows in A). This information is functionalized through adaptive changes in the organization of these core macromolecular modules and at the cellular-scale, leading to associated changes in migratory behaviour. Through imaging and quantitative analysis of individual migrating cells expressing EGFP-Paxillin and RubyRed-LifeAct (markers for CMAC and F-actin modules, respectively), we extract 88 quantitative features defining organizational (E) and behavioural features (F) of the cell migration system (see Supporting Tables S1 and S2 for feature descriptions). Briefly, organizational features include those describing core macromolecular module status (lime-green boxes in E, e.g. CMAC area, CMAC lifetime, RubyRed-LifeAct intensity within CMACs, etc) and cellular-scale morphological features (pink boxes in E, e.g. cell perimeter, cell compactness (roundness), number of CMACs per cell), while behavioral features addressed in this study relate exclusively to cell migration speed (F). Finally, Granger causality analysis enables causal influence mapping to define the nature and direction of causal information flow between pairs of these coarse-grained features of the cell migration system (G). doi:10.1371/journal.pone.0090593.g001 complete state of molecular networks underlying cell migration (Figure 1 D), and as a consequence we cannot synchronously and globally observe how molecular signaling pathways are integrated. The canonical means to overcome such limitations on direct observation have been to assemble, piecewise, the functional contributions and relations of individual molecular components through perturbation-based epistasis analyses. Yet, despite facilitating great progress, reductionist perturbation-based approaches alone may be insufficient to provide a systems-level understanding, particularly of dynamic, heterogeneous processes with potentially emergent properties [11][12][13][14]. In particular, there are substantial risks of misattribution associated with the inference of molecular function based on targeted component perturbation [15]. Thus, there remain significant limitations on our ability to spatiotemporally resolve, either through direct observation or perturbationbased inference, how global information processing at the molecular scale gives rise to migratory behaviors.
Given the modularity in molecular network structure noted above, an alternative strategy is to analyze the state and interactions of functional macromolecular modules within migrating cells. This provides a means to coarse-grain the overwhelming molecular-scale complexity to a level that is tractable with imaging-based approaches, as recently demonstrated [16,17] ( Figure 1 A and B). However, even given such coarse-graining, it remains necessary to focus on a subset of selected modules that are central to the process of interest. In the case of (mesenchymal) cell migration, integrin-mediated cell-matrix adhesion complexes (CMACs -including focal adhesions, focal complexes and nascent adhesions) and the F-actin microfilament system are core macromolecular modules that directly mediate the migratory process (Figure 1 B) [8,18,19]. CMACs and F-actin serve as both mechanochemical-sensors and -transmitters, acting at the interface between cells and their environment to collect and functionalize information derived from extrinsic regulators in the distributed molecular network (Figure 1 A) [7,20]. As such, their status is innately linked to, and therefore partially representative of, the aggregate informational state of the cell migration system as a whole [8].
Using live cell fluorescence imaging and quantitative image analyses, organizational features (morphological, positional, dynamical and compositional) of the cell migration system can be extracted to create a quantitative, multivariate characterization of CMAC and F-actin status on a per cell basis, as well as of cellular scale morphology (Figure 1 E). Simultaneously, behavioral features can be recorded describing the migration of the same individual cells (Figure 1 F). This facilitates two of the critical enabling capabilities of the research framework described herein, namely the: i) direct integration, and ii) temporal resolution of organizational and behavioral data on a per cell basis [21].
Firstly, direct data integration on a per cell basis allows the covariance that arises between any two features as a result of natural cellular heterogeneity to define inter-feature correlations, making perturbations unnecessary to achieve this result. Crucially, this includes defining the relationships between features of cell organization and behavior. Secondly, the temporal resolution of variations in feature values allows us to move beyond correlative analyses to define the direction of information flow, in the form of causal influence, between pairs of system features using the Granger causality concept. Again, unlike many other systems-level causal analyses using, for example, network perturbation approaches [17,[22][23][24][25], directed causal influences can be defined using Granger causality without the need for perturbations.
The Granger causality concept originates from the field of econometrics [26,27], wherein the study of causality cannot routinely be facilitated by perturbation-based analysis. Similarly, Granger causality has become a key tool in the perturbationindependent mapping of neural connectivity and information processing [28]. Herein we demonstrate one of the first implementations of this approach to the study of fundamental cell biology [15]. Essentially, the Granger conception of causality stipulates causation if the combination of past information from both a background variable, X, and a response variable, Y, improves the prediction of future values of Y, when compared to prediction of Y based on only its own past values. Such improved prediction implies the presence of unique, pre-emptive information in variable X and therefore its causal influence over variable Y.
We now apply a statistical implementation of the Granger causality concept to map directionally specific causal influences between pairs of recorded features (Figure 1 G). Thus we begin to define a wiring pattern for the unperturbed cell migration system. Subsequently, we employ well-characterized perturbations to reveal plasticity in this wiring pattern. This is a fundamentally important finding because it demonstrates that the perturbations frequently used to illuminate wiring patterns may, in some cases, partially distort our view of such information processing structures. We therefore propose that, for the purpose of mapping information flow in complex cellular systems, perturbationindependent strategies such as that described herein may provide a valuable complement to existing perturbation-dependent approaches.

Imaging and Quantitative Analysis of Individual Migrating Cells
Live randomly migrating H1299 (human non-small cell lung carcinoma) cells stably expressing EGFP-Paxillin (CMAC marker) and RubyRed-LifeAct (F-actin marker) (H1299-P/L cells) were imaged by confocal microscopy. More than 4200 images were analyzed to segment and track individual cells, as well as their cohort of CMACs (totaling .70 000 CMACs) ( Figure 2, Supporting Movie S1). We then extracted 88 quantitative features (Supporting Table S1) characterizing CMACs, their associated Factin, cell morphology and cell migration on a per cell basis to produce a coarse-grained, synchronous characterization of cell migration system organization and behavior based on morphological, positional, dynamical and compositional features. Features encapsulate differing states of data aggregation in accordance with spatial and temporal data hierarchies (Supporting Figure S1). Experimental and analytic standardization ensured the consistency of quantitative data between, as well as within the time-course of, individual experimental repeats (Supporting Figure S2).

Confirmation of a General Quantitative Link between Organizational and Behavioral Features
A critical premise of this study is that the quantitatively recorded CMAC, F-actin and cell morphological features should act as informative representatives of the organizational state of the broader cell migration system, and should therefore correlate on a per cell basis with the behavioral output of that system. We tested this premise to verify the basic relevance and informational value of our extracted organizational features by determining whether difference hierarchies, summarizing relative statistical distances between cell subpopulations, were equivalent for measures of organization and behavior (Supporting Figure S3). Importantly, we used two complementary methodologies, stratifying control cell subpopulations based on either their organizational (Supporting In both cases, we observed an ordinal equivalence between the difference hierarchies defined in organizational and behavioral data. These findings confirm the existence of a quantitative link between our organizational and behavioral measures, and further hint at the possibility of their functional correspondence.

Selection of Organizational Features Associated with Cell Migration Speed
We next used two independent feature selection methods to identify specific subsets of organizational features that were associated with variations in our main behavioral measure, Instantaneous Cell Speed (ICS). Firstly, we used canonical vector analysis (CVA) to cluster the slowest (S1, 1-20%) and fastest (S5, 81-100%) moving quintiles of control cells using all organizational variables (Figure 3 A). CVA effectively separated these subpop-ulations along the 1 st canonical vector. Accordingly, ranking of variables based on their absolute loading coefficient in the 1 st canonical vector highlighted organizational features associated with the behavioral divergence between fast and slow migrating cells (Figure 3 B).
As an alternative feature selection method, we performed multiple regression via the elastic net (EN-regression) to estimate cell migration behavior (ICS, response variable) based on iteratively optimized combinations of 81 normalized organizational features (background variables, non-normalizable features were excluded to satisfy assumptions of ordinary linear regression). EN-regression was specifically selected for this task due to the enhanced stability of models produced in the presence of multicollinear background variables. The resulting EN-regression model achieved a maximal adjusted R 2 value of 0.43, thus explaining 43% of the observed variability in ICS in a linear regression model (  ). EGFP-Paxillin (C) and RubyRed-LifeAct (D) images were processed by median filtering and background correction. RubyRed-LifeAct signal was used for automated cell segmentation (dark blue outline in E-H). CMACs were segmented based on EGFP-Paxillin signals (red outlines in E, Supporting Movie S1 [center], and enlarged in F from white box in E). The RubyRed-LifeAct channel is shown overlayed by the EGFP-Paxillin-segmentation profile from E (G and enlarged in H from white box in G). Cells and CMACs were tracked via nearest neighbor analysis. CMAC tracking clearly differentiates stationary adhesions (I, enlarged in J from cell front (white box) in I) from sliding adhesions (I, enlarged in K from cell rear (yellow box) in I) (CMAC trajectories color coded for time, #10 time points shown, Supporting Movie S1 [right]). (L) Cell displacement over several hours (images from three time points, 0 h:20 min, 3 h:10 min, 5 h:50 min overlayed as red, green, blue, respectively). Quantitative variables describing cell and CMAC features/dynamics were automatically extracted (88 Single Cell Scale variables, Supporting Table S1, 29 variables defining individual CMACs, Supporting Table S2). Scale bars: A-K = 10 mm; L = 20 mm. doi:10.1371/journal.pone.0090593.g002 Organizational features included within the optimal regression model were ranked according to their absolute coefficient value, which indicates their individual contribution to the estimation of ICS (Figure 3 D).
Notably, comparison of the organizational features highlighted by both CVA-and EN-regression-based feature selection methods revealed more than 50% overlap among the high ranked predictions (Figures 3 B and D, respectively), leading to greater confidence in their potential importance. Conversely, disparities in selected features may be either methodologically driven or reflective of differences in input data, i.e. fast and slow control cell subpopulations (40% of cells, CVA method) versus all control cells (EN-regression method).

Identification of Directed Causal Interactions in Cell Migration
To extend upon the directionally unspecific correlative indications derived via CVA, EN-regression and Spearman correlation analyses, we next engaged the Granger causality concept [26,27]. This permitted detection of the directional flow of information, as revealed by causal influence, between individual features defining cell migration system organization and behavior. Specifically, we performed Granger causality modeling via auto-regression to determine how the values of background (organizational) variables at preceding time-points (temporal lags) contributed to the prediction of future values of the behavioral response variable, ICS. These analyses focused initially on those variables previously highlighted via correlative analyses (Figures 3 and 4).
As shown in Figure 5 A, auto-regression modeling indicated that progressively adding temporal lags of ICS alone increased ICS prediction accuracy to a maximal adjusted R 2 of ,0.4. Thus, current ICS is dependent on past ICS. Notably, as implied by the preceding correlative analyses, temporal lags of the organizational features Sum [CMAC Total RubyRed-LifeAct Intensity] per Cell and Mean [CMAC Lifetime] per Cell each alone enabled (partial) estimation of future ICS (adjusted R 2 max 0.12 and 0.28, respectively, Figure 5 A and B). Such results indicate either correlation and/or causation between these organizational features and ICS. Crucially, explicit evidence of Granger causation was achieved by combining lags of ICS with lags of these background variables, resulting in clear improvements in adjusted R 2 . Thus, both Sum [CMAC Total RubyRed-LifeAct Intensity] per Cell and Mean [CMAC Lifetime] per Cell contributed additional information to the prediction of ICS, implying association with up-stream Granger causal mechanisms.
We next assessed whether the improvements in prediction provided by background variables were statistically significant over a population of .2200 control cell observations. Impressively, the addition of information from either Sum [CMAC Total RubyRed-LifeAct Intensity] per Cell ( Because causal interactions may be bi-directional, we also tested for Granger causation in the reverse direction. As shown in Significantly, consistent with our correlative analyses, Cell Area data did not improve the estimation of future ICS, and nor did past ICS improve estimation of future Cell Area (

Identification of a Chain of Causation in Cell Migration
We next explored the causal wiring pattern of cell migration more comprehensively by examining an extended set of interfeature relationships. This revealed interconnected patterns of Granger causation extending beyond single pairs of variables. As an example, we present a short chain of Granger causal interactions extending both up-and down-stream of ICS (

Differential Perturbation Reveals the Plasticity of the Defined Causal Wiring Pattern
To assess the question of whether causal wiring patterns are stable and generalizable, or plastic and contextually dependent, we assessed the robustness of the Granger causality chain described in Figure 6 A using data derived from cells perturbed via either the inhibition of ROCK or the activation of Rho (Supporting Movies S2 and S3). These represent treatments with strong, wellcharacterized impacts on cell migration. The effects of these perturbations were extensively assessed (Supporting Figure S4), revealing excellent coincidence with previous observations. Remarkably, despite dramatically altering both cell migration system organization and behavior, inhibition of ROCK did not significantly disrupt the chain of Granger causality first detected under control conditions (Figure 6 B), thereby providing additional support for the significance of these findings. In contrast, Rho activation disrupted the first two causal relationships such that no evidence of Granger causation could be detected (Figure 6 C). Strikingly, the presence of the third causal relationship was conserved following Rho-activation, but the impact of the relationship was reversed such that increasing ICS reduced, rather than increased, Cell Compactness. These results, particularly the positive evidence indicating inversion of effects in the third interaction, reveal the differential sensitivity of causal interactions to perturbation and, more generally, their contextual dependence. Fundamentally, this exposes plasticity in the causal wiring pattern underlying the cell migration process.

Discussion
A variety of studies have previously described correlations between organizational and behavioral features of cell migration, yet the orientation of cause and effect in these relationships has remained undefined [16,[29][30][31][32][33]. We now extend on such correlative analyses by describing a novel strategy for directionally specific mapping of information flow in the form of causal influence. A key differentiating aspect of our approach is the focus on charting causal influences based on sensitivity to the natural heterogeneity (specifically, time-resolved inter-feature co-variance) that arises within individual experimental conditions, as opposed to relying on perturbation-induced differences between conditions. This permits: a) the delineation of causal influence patterns on a per condition basis, which in turn facilitates; b) the comparison of these wiring patterns between alternatively perturbed conditions. Through these capabilities we now provide proof-of-principle for plasticity and contextual-dependence in the regulatory wiring of the cell migration system, while also delineating specific, functionally significant relationships between key features within this system.
The plasticity of causal wiring in the cell migration system is illuminated by differences observed in pair-wise causal influences under control (DMSO-treated) and Rho-activated conditions. Specifically, this plasticity is evidenced by the disruption (or weakening to undetectable levels) of two causal interactions within the causality chain defined in control cells, and is positively exemplified by alterations to the causal interaction detected between cell speed and cell compactness. The nature and implications of these specific alterations are discussed further below. In general, however, these perturbation-induced differences in inter-feature connectivity provide proof-of-principle as to the flexibility and adaptability of the causal wiring pattern governing cell migration. Concomitantly, the equivalence between the causality chains revealed in control and ROCK-inhibited cells strongly support the validity and relevance of the defined causal interactions. Thus, in combination, these unique findings demonstrate both robustness and contextually dependent plasticity in the functional relationships between coarse-grained features at the core of the cell migration system. These findings hold the important implication that although some causal regulatory influences may be conserved even under strongly altered conditions, a singular characterization of inter-feature relations is unlikely to be generalizable. This notion seems likely to also be important for the molecular-scale networks and components that underpin the coarse-grained macromolecular features assessed herein.
It should be noted that the concept of plasticity is already well established in the context of cell migration behavior [2]. However, such behavioral plasticity may arise mechanistically from two distinct types of variation within the underlying network: 1) variation in the properties of features (nodes) within the network, and/or; 2) variations in the connectivity (edges) between those features. To date, numerous studies have described how specific perturbations induce changes in the values of features-of-interest within the cell migration system. In notable examples, extensive arrays of F-actin, CMAC and cell behavioral properties were monitored and compared across a spectrum of conditions wherein ECM-density, intracellular contractility and/or growth factor stimulus were modulated [29,30]. Although important correlative links were implied by these studies, direct quantitative correlations between many of the recorded features were inaccessible because data were derived from distinct experimental sources. This highlights the advantages of defining inter-feature connectivity based on multivariate data derived from within individual experimental sources and, indeed, from within individual conditions (rather than by comparison between conditions). Such strategies rely on sensitivity to natural heterogeneity within conditions, thereby allowing the detection of variations in the inter-feature connectivity between conditions. A highly relevant study where this was achieved includes an analysis of how correlative connections between CMAC morphology-features  Causation is tested reciprocally within variable pairs to discern evidence for the causal influence of organizational variables over ICS (left panels, C-G) and vice versa (right panels, C-G). We infer causal influence only where significance-testing patterns are robust and ordered (as in C were altered by individual siRNAs within an RNAi screen [32]. Indeed, this study is particularly noteworthy because it documents plasticity in the correlative relationships between morphological CMAC features. Our work now builds on elements of this important foundation in two ways: 1) assessment of the relationships between features of cell migration system organization and behavior, as facilitated by synchronous acquisition of both data types on a per cell basis, and; 2) definition of the direction of causal influences between features enabled by time-resolved data and Granger causality analysis.
The extension of correlative analyses to delineate causal influence marks a significant advance. As noted in a recent commentary, the majority of analyses in cell biology are correlative, with the analysis of causation remaining infrequent [15]. This is particularly true at a systemic level where complex patterns of inter-feature causation are considered. However, one prominent strategy for empirically defining causal wiring patterns is network perturbation [17,[22][23][24][25]. Generally this approach involves the characterization of an array of features in both the presence and absence of perturbation(s). The variation in feature values enforced by perturbations is then used to detect connectivity between feature-pairs and, specifically, inter-feature causation. Although extremely powerful, some limitations to the network perturbation strategy may bear further consideration here. Firstly, as with more locally focused epistasis strategies, inter-feature connectivity is defined by comparison of feature values between control and perturbed conditions. Because the combination of conditions is used to define a single causal wiring pattern, network plasticity cannot easily be addressed through the comparison of wiring patterns. A noteworthy exception to this is the timeresolved network perturbation analysis performed by Ku et al, which demonstrated the temporal evolution of causal influence patterns between coarse-grained features of microtubule, F-actin and myosin machineries during neutrophil polarization [17]. A second limitation associated with the network perturbation strategy is that the required perturbations are assumed to alter feature values (to illuminate inter-feature connectivity) without distorting the ''native'' network architecture, or at least to do so only locally at the expected point of perturbation [34]. Crucially, our detection of plasticity in causal wiring following Rho activation provides proof-of-principle that this assumption may not always hold. Thirdly, in a critique common to many perturbationdependent strategies, the specificity of targeted perturbations is often fundamental to the interpretation of causal directionality within inferred wiring patterns. Of similar importance is the relative timing of direct and indirect effects that propagate downstream of the initial perturbation, including possible feedback responses. Where these are not clearly characterized, appropriate inference of causal directionality involves assumptions that, although reasonable, may not always be satisfied. Thus, overall, it seems important to recognize that limitations exist in relation to the use of perturbations for the definition of causality, either locally (as in epistasis) or at a network level. Though these perturbationbased approaches are clearly valuable and indispensable, their limitations in some cases coincide with relative strengths of the novel strategy described here, wherein perturbations are not required for the coarse-grained definition of ''native'' causal influences. Thus, perturbation-independent strategies such as this one may provide a useful complement to existing approaches.
The detection of causal wiring pattern-plasticity is a key finding, but the individual inter-feature relationships defined herein each also hold their own specific relevance. For example, the association between F-actin and CMACs is thought to be a critical determinant of migratory behavior [20]. Our finding that increasing RubyRed-LifeAct signal (representing F-actin) associated with CMACs causes decreased cell speed resolves the direction and effect of this previously correlative indication. Similarly, correlations have also been shown between CMAC stability (represented by lifetime) and cell speed [29][30][31]. We now confirm existing assumptions regarding the inhibitory causal influence of high CMAC stability over cell speed. Likewise, our finding that increasing CMAC size inhibits cell speed is consistent with the correlative indications of a recent study that used aggregate inter-feature co-variance enforced by parallel perturbations to define correlations between CMAC features and cell migration [16]. Interestingly, we reveal that a variety of CMAC features have causal influence over cell speed, while Kim et al found that only CMAC area (of several assessed CMAC features) robustly correlated to cell speed. One explanation for this difference may be that causal wiring plasticity induced by each perturbation (as demonstrated herein) may result in an unintentional stringency within the Kim et al study, such that only those correlations conserved under all conditions would be detected in the final analysis. This emphasizes the value of a perturbationindependent strategy for wiring pattern-definition.
While the three causal relationships noted above corresponded to existing expectations based on previous correlative analyses, the findings for cell area, CMAC localization and cell shape were less expected. Firstly, no evidence was detected that cell area has any causal relationship with cell speed. This was unexpected both because cell area globally modulates intracellular signaling networks [35,36] and because numerous perturbations impact coincidently on cell migration and cell area, leading intuitively to inference of a causal dependency. While the absence of evidence for a causal interaction is not proof that no such relationship exists, it might reasonably be interpreted as indicating either the absence or relative weakness of an interaction. This is interesting in the light of our feature selection results wherein cell area was consistently indicated to be an important parameter. This dichotomy may be explained by recognition of the fact that our correlative and Granger analyses were pairwise, while feature selection methods were multivariate. Whereas such pairwise analyses have sensitivity for strong biological relationships that are relatively direct and linear, multivariate analyses may provide sensitivity towards complex and perhaps contingent relationships between regulatory features. Thus, it is conceivable that cell area may modulate the causal impacts of several other features upstream of cell speed, in such a way that its own causal influence is distributed and not directly discernable. It will therefore be crucial in the future to develop multivariate statistical implementations of Granger causality analysis for this setting, including improving sensitivity to non-linear causal interactions [37].
Even more surprising was that both CMAC localization and cell shape features were regulated by cell speed. In general, these findings highlight ongoing questions regarding the possible orientations of causation relative to complexity in such complex biological systems [38], and indeed about the relative positions of such features in a hierarchy of complexity [9,10]. Specifically, we were surprised in the first instance to find that faster cell migration caused increased distances between CMACs and the cell edge, reducing the peripheral skew in CMAC localization [39]. This was unexpected because our empirical findings refute previous perturbation-derived inferences that CMAC centralization may cause the inhibition of cell migration [40,41]. Indeed, when combined with evidence that CMAC lifetime causally regulates cell speed, this finding allows the disentanglement of previously convoluted correlations between CMAC centralization, CMAC stabilization and cell migration. Thus, we find that CMAC stability strongly influences cell speed, which in turn modulates CMAC localization.
Equally unexpected was the finding that cell speed lies upstream of cell shape determination, with faster migration in control cells promoting a less round cell shape (with higher compactness values). While the existence of a correlative relationship between cell shape and cell speed has been abundantly clear [42,43], the direction of causation has not. However, evidences that symmetry breaking and associated cell deformations necessarily precede cell migration [44] promote the assumption that cell shape would act upstream of cell speed, rather than downstream as found here under all three experimental conditions. A further investigation of the relative magnitude and frequency of shape changes during symmetry breaking and at different migration speeds may more fully explain this surprising finding. In general, this observation reveals a possible limitation in the sensitivity of our causal analysis strategy, since each experimental conditions is described by a single causal wiring pattern. This means that potential variations in causal wiring arising within conditions may be lost. It is conceivable, for example, that wiring patterns may differ during alternate phases of migration (symmetry breaking, protrusion, retraction) and/or between different migratory modes (mesenchymal vs amoeboid). Indeed, the latter is hinted at by wiring changes induced following Rho-activation, as discussed further below. Methodologically, the ability to disaggregate data from within experimental conditions to identify recurring migratory phases/ modes may help to contextualize our understanding of causal network structures during evolving migratory behaviors, as previously exemplified during cell polarization [17].
As noted above, the causal influence of cell speed over cell shape revealed in control cells was recapitulated in both ROCKinhibited and Rho-activated cells. This is remarkable given the dramatic alterations to organizational and behavioral features induced by each of these perturbations. However, the inversion of the causal effect of this interaction in Rho-activated cells is particularly noteworthy, and not only as proof-of-principle for causal wiring pattern-plasticity. Indeed, the finding in Rho-activated cells that higher cell speed causes cells to become rounder, rather than more protrusive, is reminiscent of a switch in the organizational state of migratory machinery from a mesenchymal phenotype towards an amoeboid phenotype. This is a known effect of Rho activation [2]. Such a switch in migratory mode is further supported by a comparison of the quantitative changes recorded in our Rhoactivated cells and those observed following Rho-activation in MDA-MB231 cells in 2D culture by means of Smurf1-depletion [45] -a perturbation which drove a pronounced mesenchymalamoeboid transition in 3D culture. Thus, our analysis of causality following Rho-activation reveals not only plasticity in causal wiring, but that this plasticity may correspond to a recognized switch between mesenchymal and amoeboid modes of migratory behavior.
The observation that higher variation in the rate of net EGFP-Paxillin incorporation/release from CMACs is predictive of increased intra-CMAC colocalization between EGFP-Paxillin and RubyRed-LifeAct is also noteworthy. It is consistent with demonstrations that accelerated paxillin dynamics correlate with high force application to CMACs [46,47], which in turn correspond to CMAC maturation and reinforced associations with F-actin [48,49], an effect observable via colocalization analysis [33]. Remarkably, it is specifically the variance in EGFP-Paxillin dynamics that is causal, since the median rate of EGFP-Paxillin incorporation/release at CMACs shows no evidence of causation. This highlights the potential for population variance to have causal influence independent of mean/median values -an information processing feature that may be associated with relative rather than absolute signal processing. Finally, the finding that increasing colocalization between EGFP-Paxillin and RubyRed-LifeAct within CMACs causes inhibition of cell migration is in correspondence with previous correlative findings showing that increased CMAC-component-to-F-actin colocalization coincides with force-driven CMAC maturation and reduced cell migration [29,30].
In summary, we have presented a new imaging-based strategy designed to detect causal influence transmission between coarsegrained features synchronously characterizing the core machinery and behavioral outputs of the cell migration system. This approach employs the Granger causality concept to leverage natural heterogeneity within experimental conditions, facilitating perturbation-independent causal analyses. This is, to the best of our knowledge, the first time the Granger causality concept has been employed within such a cell biology setting. Using this approach, we have mapped several pair-wise causal interactions between features of cell organization and behavior, thereby alternately confirming or refuting existing assumptions of causation derived from previous correlative analyses. Most importantly, we have provided empirical evidence of plasticity in the causal wiring pattern underlying cell migration. This finding indicates that causal analyses should be performed and interpreted in a contextually dependent manner, since global causal wiring patterns are unlikely to be entirely generalizable.
Ultimately, this work provides new insights into the architecture of causal interactions that underlie cell migration, but more generally also presents a highly extensible framework for further exploration of this process. Future developments in experimental design, data acquisition (including an emphasis on molecular-scale information), and Granger causality analysis methods are likely to deliver important insights into cell migration, as well as the  Compactness (A, right panel), indicating that fast moving cells become less round. The causal links between these four variables are summarized schematically (grey boxes), with positive and negative relationships indicated by arrows and capped lines, respectively. Analyses of Granger causality predictions for equivalent inter-feature relationships in ROCK-inhibited cells (B) reveal comparable causal relationships while Rho-activated cells did not (C). Notably, although the final causal relationship (C, right panel) was still detected, the causal effect was reversed such that increasing ICS caused decreasing Cell Compactness, i.e. increased cell speed caused cells to become more round. All variables are defined in Supporting Table S1. doi:10.1371/journal.pone.0090593.g006 potential to explore a variety of complex, dynamic and heterogeneous cellular processes.

Cell Culture and Experimental Conditions
H1299 (human non-small cell lung carcinoma, ATCC) cells were transfected with both EGFP-Paxillin and RubyRed-LifeAct constructs using Lipofectamine 2000 (Gibco) according to the manufacturers instructions. Double-expressing stable clones (H1299-P/L) were selected and maintained in RPMI 1640 medium (Gibco) containing 1 mg/ml Geneticin (G-418 sulfate, Gibco), supplemented with 1 mM Glutamine and 10% fetal bovine serum (Gibco) at 37uC, 5% CO 2 . Fibronectin was purified from human plasma as described previously [50]. Imaging substrates were coated with purified Fibronectin (10 mg/ml) at 37uC for 2 h followed by blocking with 1% heat denatured bovine serum albumin (Sigma-Aldrich) at 37uC for 1 h. H1299-P/L cells were serum starved for 24 h prior to imaging. 4000 cells were plated per well in fibronectin-coated 96-well glass-bottomed plates (0.17 mm optical glass, Matrical Bioscience). Cells were treated immediately with DMSO (Di-methyl sulfoxide, Sigma-Aldrich) as control, or alternatively with either a Rho-associated protein kinase (ROCK)-inhibitor (Y-27632; 25 mM, Sigma-Aldrich), or a Rho activator (Calpeptin, 25 mg/ml, Cytoskeleton). In each experiment, control (DMSO-treated, 9 experimental repeats) cells were paired with one perturbed condition (ROCK-inhibition, 5 experimental repeats or Rho-activation, 4 experimental repeats) to allow subsequent comparison of results between these three conditions. Live cell imaging commenced after 2 h with treatments ongoing.

Confocal Live Cell Fluorescence Imaging
Live cell imaging was performed on a Nikon A1R confocal microscope with a PlanApo VC 60X/1.4 NA oil-immersion objective. Images were acquired at 5 min intervals for 8 h with a pixel resolution of 0.21 mm. Cells were maintained during imaging in normal culture medium absent fetal bovine serum, at 37uC and 5% CO 2 .

Image Processing and Image Analysis
Acquired images were processed in ImageJ (W.S. Rasband, NIH) using the Hybrid 3D median filter followed by local background subtraction. Processed images were then analyzed using PAD software (v6.3) (Digital Cell Imaging Laboratories, Keerbergen, Belgium). Cell boundaries were detected based on Ruby-LifeAct fluorescence, with cells excluded when contacting image borders. CMACs with area .0.05 mm 2 were then segmented based on EGFP-Paxillin signal. Segmented Cells and CMACs were tracked over time based on nearest neighbor analysis. Cell and CMAC segmentation and object tracking were verified by direct visual inspection of all images and image sequences. Quantitative features defining cells, per Cell CMAC populations (Supporting Table S1) and individual CMACs (Supporting Table S2) were then automatically extracted. Intensity values within CMACs were corrected in each channel for local background within a 1 mm radius around the boundary of each CMAC, excluding signal from any other segmented CMAC.

Data Parsing
Automated data parsing was performed prior to extensive statistical analyses. This included the removal of CMACs observed in only a single time point, thus dramatically limiting misclassification of noise as CMACs, especially given imposed limits on the maximal inter-time point displacement of CMACs (2 mm). Similarly, CMACs with minor axis values of less than a single pixel (210 nm) were removed to exclude aberrant form-factor calculations. CMACs present in the first and final images of an image sequence were excluded from aggregate calculations of CMAC Lifetime due to the biasing effect associated with the inclusion of incomplete CMAC lifetimes.

Data Standardization
CMAC fluorescence intensity data was standardized by calculating the median of all individual CMAC mean intensities, per recorded channel, per experimental repeat in the control (DMSO-treated) condition, of CMACs between 0.15 and 0.2 mm 2 . This size range incorporates numerous CMACs and is above both the optical and digital resolution of the acquisition system, limiting ambiguity about actual CMAC area. All CMAC fluorescence intensity values per experimental repeat were standardized to this value.

Data Transformations
Cell trajectories were smoothed through the application of smoothing splines. The parametrization of the smoothing was based on a trade off between maintaining the properties of the original trajectories (essentially no difference can be observed between original and smoothed trajectories), and enhancing the continuity of the instantaneous cell speed distribution. This diversification of ICS values enhanced the efficacy of subsequent statistical analyses (e.g. regression techniques).
Given our usage of ordinary least squares estimations in autoregression and elastic net-regression, and the associated assumptions of residual normality, Box-Cox transformations [51] with individualized tunings were applied to normalize data distributions for each cell migration system feature, so as to fulfill this necessary assumption.

Subpopulation Discovery via the EM-algorithm
Principal component analysis (PCA) using singular value decomposition of the normalized data matrix was first applied to control cell observations based on all 87 organizational variables. Subsequently, the first 25 principal components (containing 87.5% of total variation) were used in an expectation maximization (EM)algorithm [52] for assignment of control cell observations to quantitatively defined subpopulations. EM algorithm-based assignments were repeated 10 times each for designation of between 2 and 8 subpopulations. The Akaike information criterion (based on the log-likelihood of the mixture model penalized by the number of existing groups) [53] was used to identify the optimal number of subpopulations. This number was found to be 4 ( Figure 2). The final assignment of control cell observations between four subpopulations was then determined according to the lowest Akaike information criterion achieved after 100 iterations of the EM-algorithm (using the same data as above). Importantly, the EM-algorithm was initiated using random seeding so as to relieve the locking effect of initiating the algorithm with clustering techniques such as k-means clustering.

Comparison of Defined Cell Subpopulations
Canonical vectors analysis (CVA) [54] was performed using decompositions of the between-group and within-group covariance matrices based on eigenvalue decomposition. Mahalanobis distance measurements were applied to determine the pairwise between-group distances using their structures of covariance.
Agglomerative hierarchical clustering of the 87 dimensional organizational data was performed on the mean vectors of subpopulations, subject to the shortest distance using the Euclidean distance as a metric. Notably, mean vectors were based on principal components of the multivariate data matrix. Outputs were displayed as standard dendograms.
Multivariate analysis of variance (MANOVA) was applied to the 87 dimensional data matrix to assess whether the between-group and within-group covariance matrices differ significantly given specific subpopulation assignments. Significance was determined via Wilk's test.
Pairwise univariate distribution tests were performed using the two-sample Kolmogorov-Smirnov (KS)-test to determine whether two empirical probability density functions were likely to have arisen from the same underlying probability function. Significance was determined based on comparison of between-distribution KSscores to the KS-test's critical values at the 5% significance level. Given KS-scores beyond the significance level, individual variable distributions were visualized as probability distribution functions. These were estimated using kernel density estimation with Gaussian kernels and fixed bandwidths.

Feature Selection
CVA was applied to find the organizational features that contribute to the separation of the fastest (5 th quintile of ICS) and slowest (1 st quintile of ICS) cells (DMSO-treated). Features were selected based on their absolute contribution to the first canonical vector which contained .90% of the between-group variation needed to maximally separate these subpopulations.
Regression via the elastic net (EN-regression) [55] was employed using all control cell observations, with ICS as the response variable (representing behavior) and organizational features as background variables. As noted previously, in order to meet the assumptions of the regression model, all variables were normalized via individually tuned Box-Cox transformations. After supervision, 6 variables were discarded (as non-normalizable) and the regression model was built using the remaining 81 organizational features as background variables. For feature selection, features were ranked based on their absolute contribution (coefficient) at the model iteration yielding the optimal goodness-of-fit along the shrinkage path of the EN-regression (as adjudged by the adjusted coefficient of determination (adjusted R 2 )). The decision to use EN-regression (as opposed to related techniques such as Ridge regression or the LASSO) was based on its grouping effect wherein the coefficients assigned to multicollinear (related) variables are not shrunk independently. The use of the adjusted coefficient of determination was motivated by our interest in the descriptive and not the predictive accuracy of the model.

Analyses of Granger Causality
The concept of Granger causality [26] can be expressed in terms of auto-regression where a variable X is stated to Grangercause another variable Y if lags of X, in the presence of lags of Y, improve the auto-regression model's explanatory power of 'present' values of Y. More specifically, let us have two models A and B, where A has Y as the response variable and lags of Y as the background variables, and B has Y as the response variable and lags of both Y and X as the background variables. Then, given a significant reduction in the residual sum of squares of model B compared to model A, X is said to Granger-cause Y. Our analyses of Granger causality were performed on this basis using linear auto-regression. It is important to observe that as ''older'' lags are included into each regression model, ''younger'' lags are also retained, such that if a lag from 15 minutes prior to the ''present'' is included, the 5 and 10 minute lags are also included in the model. As noted above, all variables used in these analyses were normalized via individually tuned Box-Cox power-transformations. Furthermore, to confirm the stationarity of our data, we applied the augmented Dickey-Fuller test. This indicated that our time-series data did not contain significant trends such as fluctuating means or variances. Notably, data stationarity is necessary to satisfy the assumptions of ordinary least squares-based estimation and therefore the auto-regressive modeling applied in our assessment of Granger-causality. Finally, to test for significant improvements in model fitting (corresponding to the addition of the candidate causal variable) we utilized the Granger-Sargent test. In combination with descriptions of Individual Cell Properties, these two spatial scales of data, collectively termed Single Cell Scale, are used almost exclusively in this study (see green outline, A). Features describing these scales are described in Supporting  Table S1. The properties of Single CMACs are also independently recorded, providing access to Individual CMAC Properties as the maximal level of spatial data disaggregation (data from this spatial scale is presented in Supporting Figure S2, C and D, based on features defined in Supporting Table S2). (B) Image data is acquired over 8 h and therefore the combination of data from all time-points represents the maximal level of data aggregation in the Temporal Data Hierarchy. Varying degrees of data disaggregation via Arbitrary Time Sampling, as employed in Supporting Figure  S2, can provide indications regarding the stability of experimental conditions over time. Because migrating cells can enter and exit imaging fields at any time, per-cell data may be aggregated to reflect individual Cell Observation Periods. Within each cell, data defining the properties of each CMAC may be aggregated over the lifetime of each CMAC. Alternatively, as with Single Cell and Cell Population data, CMAC data may be disaggregated to reflect Instantaneous Dynamics defined at the maximal image sampling frequency (5 min). Instantaneous Dynamics data is used in most cases throughout this study. An expectation maximization (EM)-algorithm for Gaussian mixture models using principal components 1-20 (including .99% of total variance) was employed to assign control cell data into two subpopulations. This was repeated ten times to achieve an optimized assignment as determined by assessment of relative inter-and intra-group variability using the Akaike information criterion (AIC). This procedure was replicated for the assignment of control cell data into between 2 and 8 subpopulations, revealing that control cell organizational feature data is optimally represented as four subpopulations. Multivariate analysis of variance (MANOVA) allowed rejection of hypotheses that 1, 2 or 3 sub-populations exist (P values = 0), indicating support for the existence of 4 or more subpopulations. Accordingly, control cell observations were assigned to one of four subpopulations (G1-G4), with assignments finalized based on the lowest achieved AIC value following 100 randomly-seeded EM-algorithm iterations. Given these assignments, canonical vectors analysis (CVA) was used to visualize the multivariate distributions of subpopulations G1-G4 based on the original 87 organizational features. Subpopulations G1 (blue) and G3 (orange) partially overlap, with G4 (red) relatively proximal and G2 (green) relatively distal. Both standardized Mahalanobis distance measurement (B, blue -near; red -far) and independent hierarchical clustering (C) confirm the structure of the organizational feature-based difference hierarchy for these control cell subpopulations. (D) To compare this organizational difference hierarchy with the corresponding behavioral difference hierarchy, we visualized the probability distribution function (P.D.F.) for Instantaneous Cell Speed (ICS, behavioral feature) associated with each control cell subpopulation defined in A. In correspondence to results in A-C, G1 and G3 subpopulation behavior was highly analogous (KS-test p-value G1 vs G3 = 0.56) with a bias towards fast moving cells, G4 had a bias towards moderately motile cells (KS-test p-value G1 vs G4 = 0), and G2 was biased towards slow moving cells (KS-test p-value G1 vs G2 = 0). Thus, organizational and behavioral difference hierarchies were ordinally equivalent when control cell subpopulations were defined based on organizational state. (E) To complement this methodology, behavioral stratification of control cells was performed according to quintiles (S1 [1-20%, dark blue], S2  Table S1) between control and either Rho-activated (Rho, left column) or ROCK-inhibited (ROCK, right column) conditions reveal a broad array of perturbation effects (white circles denote KS-test P-values ,0.05, colors indicate aggregate inter-distribution distances (blue = proximal, red = distal)). (C-H) Selected probability distribution functions (DMSO (control) = blue, ROCK (ROCK-inhibited) = red, Rho (Rho-activated) = yellow) are highlighted here to provide comparison to previously published analyses of Rho and ROCK manipulation, thereby confirming the expected impacts of these perturbations. Specifically, Rho activation significantly reduced cell migration speeds compared to control [56][57][58], while ROCK inhibition increased the frequency and maximal speeds of fast moving cells (C) [59][60][61]. Rho activation promoted rounder, more compact cell morphologies indicative of increased contractility (D) [62,63], while ROCK inhibition lead to a more protrusive phenotype [64,65]. Both Rho activation and ROCK inhibition promoted an increase in CMAC number per cell, though ROCK inhibition had a markedly stronger effect (E) [66,67]. ROCK inhibition dramatically reduced the median area of CMACs (F) [60,63,68], while Rho activation had inverse effects [67,69,70]. Correspondingly, ROCK inhibition accelerated the turnover of CMACs, while Rho lead to a significant increase in mean CMAC lifetime per cell (G) [61,[71][72][73]. Analysis of intra-CMAC colocalization between EGFP-Paxillin and RubyRed-LifeAct showed that co-recruitment of these components within CMACs was enhanced by Rho activation and inhibited by ROCK inhibition (H). (EPS)  Figure S3). The corresponding units and a concise description are also included. Additional terms are defined above relevant columns. (XLSX)