A combined experimental-computational approach uncovers a role for the Golgi matrix protein Giantin in breast cancer progression

Our understanding of how speed and persistence of cell migration affects the growth rate and size of tumors remains incomplete. To address this, we developed a mathematical model wherein cells migrate in two-dimensional space, divide, die or intravasate into the vasculature. Exploring a wide range of speed and persistence combinations, we find that tumor growth positively correlates with increasing speed and higher persistence. As a biologically relevant example, we focused on Golgi fragmentation, a phenomenon often linked to alterations of cell migration. Golgi fragmentation was induced by depletion of Giantin, a Golgi matrix protein, the downregulation of which correlates with poor patient survival. Applying the experimentally obtained migration and invasion traits of Giantin depleted breast cancer cells to our mathematical model, we predict that loss of Giantin increases the number of intravasating cells. This prediction was validated, by showing that circulating tumor cells express significantly less Giantin than primary tumor cells. Altogether, our computational model identifies cell migration traits that regulate tumor progression and uncovers a role of Giantin in breast cancer progression.


Introduction
Breast cancer is the most commonly occurring cancer in women and the fifth most common cause of cancer-related death [1,2]. The formation of metastases due to the spread of tumor cells to distant organs, is the major factor that determines mortality in breast cancer patients [3,4]. Metastatic dissemination of breast cancer cells is a multi-step process requiring the acquisition of a motile phenotype, the ability to invade the surrounding tissue, break into vessels and eventually extravasate and colonize distant organs [5]. Major research efforts in the past decades led to a firm understanding of the role of cell division, apoptosis and invasion in tumor progression and metastatic dissemination. However, there is surprisingly little knowledge of whether and how tumorigenesis is affected by different migration traits such as speed and directional persistence, i.e. the tendency to keep or change the direction of motion. Furthermore, the differential contribution of cell migration and invasiveness to the ability of cancer cells to escape from the primary tumor remains elusive.
The Golgi apparatus (hereafter referred to as the Golgi) is a major cellular organelle that is involved in the regulation of membrane trafficking and post-translational modifications. In addition, the Golgi plays an important role as a regulator of directional cell migration [6][7][8]. Under normal conditions, the Golgi is an intact single-copy organelle comprised of stacks of flattened cisternal membranes that are laterally linked to form the Golgi ribbon. This structure is maintained by Golgi-matrix proteins. Giantin is a such a Golgi-matrix protein that was shown to be important for maintaining the structural integrity of the Golgi [9][10][11][12]. Several alterations in cultured cancer cells were reported to result in fragmentation of the Golgi (reviewed in [13]). Moreover, RNAi screening approaches have identified a wide range of conditions that affect the morphology of the Golgi [14][15][16][17]. Golgi fragmentation has been reported to cause defects in cell polarization and therefore a delay in cell migration in wound closure assays [6,8,18]. Our previous work indicates that loss of a Golgi matrix protein GM130 in breast cancer cell lines is associated with altered migration and invasion [18]. However, this analysis was so far only performed in cell lines and a confirmation of the pathophysiological significance of Golgi fragmentation in patient samples remains to be shown. Moreover, it does not answer the fundamental question of how cell migration affects tumor progression, a relationship that remains incompletely understood. For instance, increasing cell turnover was (grant 302452), by the Anders Jahre Foundation, and by the Rakel og Otto Kristian Bruun's Legat and by a grant from the Austrian Science Foundation FWF (P 358320). A.K.L is supported by the center for research-based-innovation BigInsight funded by the Research Council of Norway (grant number 237718), by the Research Council of Norway (grant 311188) and by the European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement No. 847912. A.K.L and S.G were supported by the UiO:Life Science convergent environment PerCaThe. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: I have read the journal's policy and the authors of this manuscript have the following competing interests: D.F. is an employee of Xilio Therapeutics and a former employee and a shareholder of Eli Lilly and Company. The other authors declare that they have no competing interests.
found to augment the effect of cell migration of tumor evolution [19]. The presence of cells moving in opposite directions (so called oncostreams) correlates with tumor growth [20,21]. However, how the combination of speed and persistence alters tumor growth is not well characterized.
In the present study, we used in silico simulations to analyze the impact of a wide range of speed-persistence combinations of the shape and size of tumors and on intravasation of cancer cells. To this end, we constructed a mathematical model that accounts for migration, proliferation, invasion and death of cancer cells in a two-dimensional space. This model predicted that speed and persistence are positively correlated with the growth of the primary tumor. We then fed experimental measurements of cell migration and invasion into our model to study the impact of Giantin depletion on tumor progression. The model predicted that Giantin-depleted tumors, while smaller, are nevertheless capable of seeding more intravasating cells. This conclusion was then substantiated by showing that circulating tumor cells have lower Giantin levels than primary tumor cells. Our combination of modeling and experimental validation provides a possible explanation for the poor survival of breast cancer patients with low Giantin levels.

Cell culture and transfection
MDA-MB-231 and BT549 were cultured in RPMI supplemented with 10% FCS and 1% penicillin/streptomycin (GIBCO). Cells were transduced with NucLight green lentivirus (Essen BioScience, 4624). Infections were carried out using a multiplicity of infection (MOI) of three transducing units per cell. The growth medium was changed after 24h of transduction and cells were split after 72h. The transduced cells were sorted via fluorescence-activated cell sorting (FACS) using a BD FACSAria cell sorter. Sorted cells were tested for mycoplasma. For Giantin (GOLGB1) knockdown experiments, BT549 cells in optimal growth phase expressing GFP localizes to the nucleus were semireverse-transfected with siGENOME SMARTpool GOLGB1-siRNA (Dharmacon) with a final concentration of 15nM using HiPerfect (Qiagen, 301704) according to the manufacturer's instructions. The SMARTpool of GOLGB1-siRNA contains 4 individual siRNAs with the following sequences; GAACUAGAGUCUCGGUAUA, UAAGAAUUGCAACCUAA, GUACACAGGUUAAGUGCUU and GAAGGUCUGUGAU ACUCUA. The level of Giantin expression was determined by qRT-PCR. Total RNA was extracted using a GenElute Mammalian Total RNA Miniprep Kit (Merck, RTN70), and cDNA was reverse-transcribed using a High Capacity cDNA Reverse Transcription Kit (Thermo Fisher Scientific, Cat # 4368814). The expression of Giantin was determined using SensiFAST SYBR Hi-ROX (NordicBio, BIO-92002) and normalized to ACTIN using commercially available primers (Qiagen; for Giantin, QT00038087 and TBP, QT00000721).

Confocal immunofluorescence microscopy
GFP-BT549 cells were grown on glass cover-slips placed in a 6-well plate, and then fixed with 3% paraformaldehyde for 20 min at RT. Afterward, cells were washed in PBS with 20 mM glycine followed by incubation for 5 min at RT in permeabilization buffer containing PBS with 0.2% triton X100. Then, the fixed cells were incubated with mouse monoclonal anti-GM130 antibody (BD-Biosciences/Puls Medical, 610823) dilution 1:1000. After incubation with primary Abs at 37˚C for 1 h followed by washing with PBS three times, cells were incubated for another 1h in goat anti-mouse secondary antibody, Alexa Fluro 488 (Thermo Fisher Scientific, A-11001) diluted in 3% BSA in PBS. Finally, cells were mounted in polyvinyl alcohol with DABCO antifade and imaged. Imaging was performed on Zeiss LSM700 laser scanning confocal microscope. All images were acquired randomly using a Plan-Aphochromat 63x Oil Ph3 M27 objective (NA 1.4).

Two-dimensional dense random migration assay
For optimal tracking efficiency we generated 3 populations; BT549 + GFP-BT549, BT549 + Giantin-silenced GFP-BT549, and MDA_MB_231 + GFP-MDA_MB_231. These cells were co-cultured at a ratio of 3:1 respectively in 96 well image-lock plates (EssenBio, 4739, Lot#17040501) for 24 hours at 37˚C and 5% CO 2 . Then cells were scanned at ten-minute intervals over 24h in Essen BioScience's IncuCyte S3 with objective lenses 10x (NA; 0.95; Image Resolution: 1.24um/pixel) using both the phase channel (HD Phase imaging in gray values) and green channel (Emission Wavelength: 524nm; Excitation Wavelength: 460nm; Exposure Time: 200ms). Images were collected using a Basler Ace 1920-155um camera with CMOS sensor. Cellular viability was assessed throughout the course of the scanning by comparing the phase cellular morphology between mother cells and their corresponding GFP cells. ImageJ was used to linearly correct the brightness and contrast of the images and merging them as multi-TIFF stack. All the images collected from the same experiment were modified the same way to leave the information unaltered. GFP cells were tracked and subjected to trajectory analysis using the in-house developed cellmigRation package as described in the next section [22].

The cellmigRation package
The cellmigRation is an open-source R package that we developed to analyze cell movements using TIFF images captured over time. Its latest stable version is hosted on the peer-reviewed R Bioconductor repository for easy installation and usage (DOI: 10.18129/B9.bioc. cellmigRation). The software includes two modules aimed at i) data import and pre-processing; and ii) advanced analytics and visualization. Image pre-processing (Module 1) is performed similar to what described by DuChez B, 2018 [23]. Briefly, we ported the FastTracks software (https://www.mathworks.com/matlabcentral/fileexchange/60349-fasttracks) from MATLAB to R and added functionality to automate some of the initial optimization steps (background correction, signal detection). Similar to FastTracks, our R package can import TIFF files, perform background removal, detect signal peaks corresponding to fluorescent particles in each stack of the image and record the corresponding centroid coordinates. Next, cell trajectories are computed by connecting the closest centroids from consecutive stacks provided that they are located within a user-defined maximum displacement radius. Cell tracking data can be used as input for the advanced analytic and deep trajectory analysis functions (Module 2). Our software can compute several metrics and statistics. Specifically, cellmigRation includes functions dedicated to the analysis of i) cell persistence and speed; ii) directionality; iii) mean squared displacement; iv) direction auto-correlation; and v) velocity autocorrelation. Module 2 also includes functions aimed at exporting data, automatically building cell-based and population-based visualizations and generating 3D interactive plots. Cell trajectory Principal Component Analysis (PCA) and Clustering are supported as well.

Gelatin degradation assay
Sterile cover-slips (12 mm, #1 Menzel Microscope Coverslips) were coated with a thin layer of 0.2 mg/ml Oregon Green 488-gelatin (Life Technologies, G13186) for 20 min, fixed in 0.5% glutaraldehyde for 40 min and washed three times with sterile PBS. Cover-slips were then transferred to a 24-well plate and incubated in complete growth medium for 30 min at 37˚C. BT549 cells transfected with either Ctrl-siRNA or GOLGB1-siRNA (Dharmacon) for 48h were seeded on top of the cover-slip at a density of 7.5x10 4 cells/well and incubated overnight. Cells were then processed for anti-GM130 immunofluorescence staining. Images were randomly acquired on Andor Dragonfly spinning disk using a Nikon Ti2 inverted optical microscope equipped with a 60 × TIRF objective (Plan-APOCHROMAT 60 × /1.49 Oil). Fluorescence was collected using an EMCCD camera (iXon Ultra 888, Andor). To quantify the gelatin degraded areas per image, appropriate functions of ImageJ were used. The measured area was divided by the number of cells in the corresponding image. Kruskal-Wallis rank sum test was used to compare the gelatin degraded areas between the two conditions.

Patient-derived breast cancer xenografts
Patient-derived xenografts (PDX) were established at Institute for Cancer Research, Oslo University Hospital (MAS98.12 and MAS98.06) or the Institute Curie, France (HBCx-34 and HBCx-39) as previously described [24,25]. Briefly, primary mammary tumor specimens were implanted into immunodeficient mice. After initial establishment, the tumor tissue was serially transplanted into mammary fat pads of nude athymic mice. Based on histopatological and molecular characterization, the tumors were classified as triple-negative basal like (MAS98.12 and HBCx-39) and hormone receptors positive luminal B breast cancer (MAS98.06 and HBCx-34) [24,26,27]. our μm thick paraffin sections were rehydrated in a serial dilation of ethanol and boiled in TRIS-EDTA (pH 9.0) for antigen retrieval. The tissue sections were permeabilized with triton X-100 and then incubated at RT for 2 hours with anti-Giantin antibody (Nordic BioSite, bs-13356) dilution 1:1000. Then the tissue sections were washed with PBS three times, cells were incubated for another 1h in goat anti-mouse secondary antibody, Alexa Fluro 488 (Thermo Fisher Scientific, A-11001). The sections were incubated with Hoechst 33342 dye (Thermo Fisher Scientific, H3570) to stain the nuclei. The sections were rinsed with Milli-Q water and embedded in polyvinyl alcohol mounting medium (Sigma Aldrich, 10981). Fluorescence images were randomly acquired on on Andor Dragonfly spinning disk using a Nikon Ti2 inverted optical microscope equipped with a 60×TIRF objective (Plan-APOCHRO-MAT 60× /1.49 Oil). Fluorescence was collected using an EMCCD camera (iXon Ultra 888, Andor). To quantify the Golgi state, Golgi structure was assessed by visual inspection and categorized as intact, fragmented or undetermined.

Breast tissue array
Invasive ductal carcinoma and adjacent normal breast tissue samples were obtained from US Biomax, BC08118a. The slide was baked at 60˚C for 2h followed by incubation in Neo-clear (Sigma 109843) for 5min. The slide was re-hydrated in serial dilutions of Ethanol and then boiled in Tris/EDTA (pH 9,0) for 15min. After washing with PBS for 5min the slide was permeabilized using triton X-100 (0,25%) for 8min and then blocked in 5% BSA (in PBST) for 1h at RT. To stain the Golgi, the slide was incubated with anti-Giantin antibody (Nordic BioSite, bs-13356) in 1% BSA (PBST) for 2h at RT. After three times washing with PBST for 5min the slide was incubated for another 1h in goat anti-mouse secondary antibody, Alexa Fluro 488 (Thermo Fisher Scientific, A-11001). After several washing steps, the Hoechst 33342 dye (Thermo Fisher Scientific, H3570) was used to label the nuclei. Finally, the slide was rinsed with Milli-Q water and embedded in polyvinyl alcohol mounting medium (Sigma Aldrich, 10981). The fluorescence images were acquired and analyzed as previously described (section 2.6).

Circulating tumor cells/clusters (CTCs) datasets
To profile the expression of Giantin across normal and breast cancer cells in addition to circulating breast tumor cells, we used two single cell RNA-seq (scRNA-seq) datasets. The first dataset consists of normal and breast cancer cells. Data are available in the GEO database with accession numbers GSE114727 [28]. The other dataset consists of single circulating tumor cells collected from the blood of breast cancer patients and obtained from several studies [29][30][31][32][33][34][35][36] compiled in our previous work [37]. Data are available in the GEO database with accession numbers GSE51827, GSE55807, GSE67939, GSE75367, GSE109761, GSE111065, GSE86978 and PRJNA471754. After downloading raw scRNA-seq read count data we followed a standard pre-processing pipeline to filter the cells and genes as described in our previous work [37][38][39]. After filtering the cells we performed the median by ratio normalization method followed by log transformation [39]. For comparison purposes among circulating tumor cells, normal and breast cancer cells, cells with Giantin expression as zero were excluded from further downstream analysis.

Computational model of tumor progression
To investigate the impact of speed and directional persistence of breast cancer cells on tumor growth, we use a 2D stochastic cell-based model inspired by the persistence random walk model [40]. Our newly developed model accounts for cell migration, proliferation, intravasation and death.
The model is based on 8 main assumptions: i) Cells are individual active agents that, for simplicity, live in a 2D section. ii) Cell movement is characterized by a given speed and directionality. iii) The 2D tissue section receives nourishment from functional blood vessels that are assumed to be perpendicular to the tissue section. iv) Cells in the nourished zone adjacent to a functional blood vessel are more likely to proliferate faster and move within this zone than leave it. v) Cell division depends on several factors, including proximity to blood vessels, local cell density and the time since the division of the corresponding mother cell. vi) Cells that spend the duration of a G1 phase in the nourished zones can have a shorter cell cycle length. vii) Cells become necrotic and die if they are not able to divide due to lack of space after several cell cycles. We also consider cells that spend prolonged periods of time without dividing, but are able to re-enter the cell cycle. We refer to those cells as "quiescent". In an in vivo situation, such cells could be quiescent, dormant or senescent. viii) Cells moving towards a blood vessel can intravasate when they get in contact with it.
The aforementioned assumptions were translated into a stochastic mathematical and computational model that is fully described and parameterized in the Table A in S1 Text. The model uses 2D continuous space and discrete time steps. Cells are described as circular agents of a given diameter and their position is determined by the coordinates of the center of the circle in a 2D computational domain. At any given time point, two cells cannot overlap, but to move from one location to another in each time step, cells can cross through locations that are occupied by other cells. Cell movement is simulated according to speed profiles computed following an empirical distribution. The directionality of a cell is determined in terms of persistence ratios, which range between 0 and 1. A persistence ratio of 1 refers to a very persistent cell, which maintained a preferred overall direction of movements within bilateral 90 degrees. Cell division is a stochastic event. The probability of a cell to divide is determined based on a proliferation score PS = T + D + V, that depends on three additive factors: the time since the division of the original agent (T), the local cell density (D), and proximity to blood vessels (V). At each time point, these three factors are monitored for every cell in the simulated field. The local cell density (D) is a discrete index reflecting the number of occupied spots around the target simulated cell in the Moore neighborhood (Fig G in S1 Text). The higher the number (n) of occupied spots in the Moore neighborhood, the smaller is D. However, if the Moore neighborhood of a cell is fully occupied by other cells (n = 8) we assume proliferation is completely inhibited. The proximity to a blood vessel (V) describes a proliferation advantage a cell can get when it is located in well-nourished zones around blood vessels. We assumed that the proliferation rate is 30% higher in well-nourished zones. This assumption was estimated empirically and is based on literature [41][42][43]. The proliferation probability is high when PS is equal or higher than a predetermined cell cycle length. Instead, we assume a very low division probability when PS is lower than the cell cycle length. Additionally, we assume that after cell division, the two cells start the first phase of the cell cycle known as G1 phase. If a cell spends the needed time to complete the G1 phase (t G1 ) in a well-nourished zone, the probability for division is switched to high, regardless of the value of PS. Based on the lifespan length of a cell, it can be classified as undergoing proliferation, quiescent or necrotic. If a cell is not undergoing cell division within a particular period of cell-cycles (n n ) then it is considered necrotic and removed from the field. Cells that did not divide within a given period of cell-cycles (n s with n s < n n ) are considered as quiescent cells. Otherwise, cells are considered to undergo proliferation. Finally, cells are removed from the field to simulate i) cell death if a cell is not undergoing cell division after a number of cell-cycles because of lack of available space. ii) intravasation if a cell succeeded in penetrating into the lumen with a given probability.
In our model, the ability of a cell to move, divide, intravasate and leave a well-nourished zone is a stochastic event. Such stochasticity is achieved by generating a uniform random number between 0 and 1 per cell per process per iteration. A process is triggered when the random number is equal to or less than the corresponding parameter value, representing the probability of the event to take place. The algorithm used for each discrete time step and for each cell is described in detail in the S1 Text. Periodic boundary conditions were imposed to simplify the implementation and to increase the computational efficiency. While the choice of periodic boundary conditions avoids losing cells out of the simulation field, it may affect the spatial distribution of cells when the tumor is large enough to get in touch with the boundaries.
In each simulation, which is characterized by certain speed and directional persistence, we accounted for the following outcomes: the number and distribution of cells in the 2D domain, the number of intravasating cells, the proliferation outcome and the mean time spent in the nourished zones. In this way we were able to estimate the optimal speed-persistence combination that leads to bigger tumors and more intravasation. We also used the model to study the pathological relevance of Golgi fragmentation in breast cancer. Computer simulations were performed mainly in HPC solutions provided by the Oslo Centre for Epidemiology and Biostatistics, at the University of Oslo and also on the Abel computer cluster, owned by the University of Oslo, and operated by the Department for Research Computing at USIT, the University of Oslo IT-department. The source code to simulate the model in addition to the simulation data are available at https://github.com/ocbe-uio/Cancer_simulator.

The effect of cell speed and persistence on tumor progression
Two of the most intuitive parameters to characterize cell movement are the speed and directional persistence. We first investigated the impact of these two parameters on tumor growth and intravasation using computational modeling. Our model accounted for cell migration, proliferation, intravasation, and cell death (see the Methods section and Table A in S1 Text for a full description of the model and its parameterization). Briefly, cells were freely moving in a two-dimensional space, were assigned a certain rate of division, and were assumed to die if they had not performed a division for longer than three doubling times. This allowed us to simulate cell death and thereby tumor necrosis in poorly nourished areas. To account for growth advantage and attraction of the well-nourished areas close to blood vessels, cells were assigned a lower likelihood to leave this area and were given a 30% growth advantage (faster doubling). We implemented extensive numerical simulations of 130 speed-persistence combinations. We used live epifluorescence imaging of cells (BT549 and MDA-MB-231) migrating freely in 2D followed by automated tracking to infer a realistic range of parameter space for persistence and speed (see Figs I and J in S1 Text for the experimental outcome). For each simulation of a 90 days period, we computed the changes in the number and distribution of cancer cells. We characterized the spatial cell distribution using the uniformity index (UI) [44][45][46]. High UI indicates a homogeneous distribution all over the simulation field, whereas low UI is an indication for cluster formation. Due to the model stochasticity, simulations were repeated twenty five times for each speed-persistence combination (see S1 Video).
Simulations showed that cell speed and persistence have a profound effect on the distribution of cancer cells generating tumors with different uniformity indices (UIs) (Fig 1A). At very low speed, the simulation produced tumors that were very compact and in the extreme case exhibited signs of tumor necrosis in the center of very compact tumors (see insets in Figs 1A and A in S1 Text, panels A and B). A key prediction of our in silico simulations is that the size of the tumor increases with both speed and persistence of cell migration (Fig A in S1 Text, panel C). The impact of speed was stronger than that of persistence. When a fixed persistence value was used, increasing the speed had a major impact on the number of cells in the tumor (Fig 1B). The impact of persistence on the size of the tumor was barely noticeable at low speeds ( Fig A in S1 Text, panel D).
Cells moving towards and contacting a blood vessel were assigned a certain probability of leaving the tumor by intravasation. We assigned a probability of 1:10000 (0.01%) of a cancer cell to leave the primary tumor. This number is a pure estimate based on the assumption that a tumor with a mass of 1 gram would contain around 10 8 cells [47] and the number of circulating tumor cells is between 1-3 CTCs per ml of blood assuming an average human to have 5L of blood. Simulations showed that the number of intravasating cells increased with speed only (Fig A in S1 Text, panel E). Under the same persistence condition, speed had a strong impact on the number of intravasating cells, but no effect on persistence was observed when the speed was kept constant (Fig 1C and 1D). This is most likely because speed has a stronger impact on the total number of cells in the tumor and it is intuitive to assume that larger tumors will produce more intravasating cells.

Golgi fragmentation in breast cancer tissue
Next, we sought to explore the biological significance of our computational model. As stated above, alterations of Golgi structure are well known to affect cell migration [6,15]. Thus, we decided to test the pathophysiological relevance of our model predictions by investigating cell migration traits associated with a condition that induces Golgi fragmentation. Although the role of the Golgi in tumor progression has been discussed in the literature [6,13,48,49], there is surprisingly little experimental evidence for structural alterations of the Golgi in patientderived tumor tissue. Golgi fragmentation has been reported in colorectal cancer [50] and prostate cancer [51], but to the best of our knowledge no analysis of a Golgi structure in breast cancer tissue is available. Thus, before directly investigating a specific condition that induces Golgi fragmentation, we wanted to assess Golgi structure in patient-derived xenograft (PDX) samples from triple-negative basal-like, or hormone receptors positive luminal B tumor tissue (Fig 2A and 2B). In the triple-negative tumors, Golgi fragmentation was observed in 50% of the examined cells whereas only 41% of the hormone receptors positive tumors contained fragmented Golgi. The prevalence of this phenotype tempts to speculate about a relevance of Golgi fragmentation in breast cancer pathology. This notion is further supported by the observation of the higher prevalence of Golgi fragmentation in samples from invasive ductal carcinoma compared to adjacent normal breast tissue (Fig B in S1 Text, panels A and B).

Giantin is a Golgi protein that might be involved in breast cancer progression
Previous work has identified several genes whose depletion may result in Golgi fragmentation [14,16]. However, because these data are from siRNA screens, it remains largely unclear whether these gene depletions induce Golgi fragmentation in a direct manner. Thus, we decided to focus on Golgi matrix proteins and tested whether their levels are linked to survival of breast cancer patients using the kmplot database [52,53]. We focused on six matrix proteins that have a relatively well understood role in maintaining Golgi morphology (GOLGA1, GOLGA2, GOLGA3, GRASP55, GOLGB1 and GORASP1). With the exception of GRASP55 and GOLGA3, there was a significant trend indicating that low expression correlates with poor survival (Fig C in S1 Text). This trend was very strong for Giantin (also known as GOLGB1, Fig 2C) and therefore we decided to focus on this Golgi matrix protein for further studies. Silencing Giantin expression in BT549 cells and MDA-MB231 cells worked reproducibly and induced a fragmentation of the Golgi (Figs 2D and 2F and D in S1 Text, panel A). To investigate the migratory pattern of BT549 cells upon Giantin knockdown, we analyzed the trajectories of cells that were freely migrating on a 2D surface. Cells were scaled and projected based on their root mean square speed and persistence ratio. Scaling was done by normalizing them to the mean values, which were assigned a value of 1. Using this approach, we defined four groups with distinct migration and persistence values: group 1 (low speed, low persistence), group 2 (low speed, high persistence). group 3 (high speed, low persistence) and group 4 (high speed, high persistence) (Fig 3A). In comparison to control cells, BT549 cells lacking Giantin were preferentially found in the low-speed groups 1&2 (Figs 3B-3D and D in S1 Text, panel B). On the other hand, cell persistence was not appreciably affected by the depletion of Giantin (Fig D in S1 Text, panels C and D). Seeing these experimental results through the lens of our computational model, the migration pattern of Giantin depleted cells would result in a smaller tumor, which we expect to seed fewer intravasating cells. This is difficult to reconcile with the fact that patients with lower Giantin exhibit lower disease-free survival rates (Fig 2C). Our simulations were all carried out with a fixed rate of intravasation and therefore, it appears intuitive that smaller tumors would produce less intravasating cells. Thus, we asked to what extent must the rate of intravasation of a smaller tumor increase to produce more intravasating cells than a bigger tumor? We therefore re-run our simulations with adjustments of the rates of intravasation. To better link these new simulations to our experimental data, we took into account the number of cells with Golgi fragmentation in conditions of Giantin knockdown, where we observed that 65% of cells display a fragmented Golgi. We assumed that the cells with fragmented Golgi are the ones with the highest level of Giantin knockdown where the biological effects (i.e. slower cell migration) is also expected to be strongest. In the simulations, tumors   with low Giantin, were considered to have 65% of cells with slower cell migration. Because control conditions exhibited 14% of cells with Golgi fragmentation, we used this value for the simulation of tumors with high levels of Giantin. The higher the number of cells with Golgi fragmentation, the smaller the tumor and the less intravasated cells it produced. This is to be expected, because Golgi fragmentation (induced by Giantin depletion) slows cells migration, which in our model reduces the size of the tumor, thereby having less cells than can intravasate. When we assigned the Giantin-depleted tumor a 2-fold higher intravasation rate, we observed that it produced a similar amount of intravasated cells compared to control conditions with only 14% fragmented Golgi (Fig 3E). Of note, the tumor with 65% Golgi fragmentation was still smaller (Fig 3F) indicating that the increase in intravasation rate can overcome the effect of the size of the primary tumor. Increasing the intravasation rate by 4-fold allowed the tumors with high degree of Golgi fragmentation to seed more intravasated cells (Fig 3E), which again could not be explained by the tumor size (Fig 3F). Based on this, we hypothesized that increasing the invasion rate by at least 4-fold would allow smaller tumors to seed more circulating tumors cells, which might give rise to more metastasis and hence explain poor patient survival.
To experimentally test whether Giantin depleted cells exhibited higher invasion rates, we tested their ability to degrade extracellular matrix. To this end, cells were seeded on glass cover slips coated with fluorescent gelatin. By analyzing the pattern and size and the area of the digested gelatin we found that silencing Giantin resulted in a marked increase of the median ability of cells to degrade gelatin. This increase ranged from a few percent to more than 30-fold (see violin plot and representative images in Fig 3G and 3H).
To predict the pathological relevance of Giantin loss in breast cancer we expanded our in silico model to account for a higher rate of invasion which we observed experimentally. The revised model had three additional assumptions: 1) each cell in the tumor has a probability of developing the low Giantin phenotype; 2) cells with low Giantin have higher invasiveness, which yields to higher intravasating capability; 3) cells follow the migratory phenotype of one of the four aforementioned groups based on the percent abundance of the groups. The in silico outcome of tumor progression in two independent populations (S2 Video) that diverge in the percent abundance of cells with low Giantin (14% and 65%) shows significant difference in terms of uniformity index, number of generated cells and number of intravasating cells ( Fig  4A). Furthermore, we tested the impact of increasing fraction of low-Giantin cells in the population (0%, 25%, 50%, 75% and 100%) on tumor progression (S3 Video). The results show that the higher the fraction of cells with low Giantin, the lower the uniformity index and the smaller the tumor but with higher intravasation capability (Fig 4B and 4D). The number of intravasating cells in a population of a thousand cancer cells was found to be statistically significant among all the different groups with the highest number (30 ± 4) in a pure population of cells with low Giantin expression (Fig E in S1 Text). Thus, the combination of our mathematical and cell biological experiments provides a possible explanation for the poor disease-free survival of patients with low Giantin levels.

Giantin expression in breast cancer patients
Because our simulation results suggest that loss of Giantin increases the invasive capability of tumor cells to intravasate into the circulation, we would expect circulating tumor cells to have lower Giantin expression than the primary tumor. To test this hypothesis, first we investigated the expression of Giantin in circulating breast tumor cells (n = 1448) compared to a housekeeping gene (ACTB) (Fig 5A). Then, we profiled the expression of Giantin in individual cancer cells (n = 168) and adjacent normal breast cells (n = 1500) collected from patients with  (Fig 5B). On the other hand, we observed a marginal difference in the expression of a housekeeping gene (ACTB) (Fig 5C).

Discussion
In addition to state-of-the art experiments, mathematical and computational models provide insights into complex cancer processes including tumor growth [54,55], metastasis [56][57][58], treatment response [56,[59][60][61][62][63][64] and cell migration and invasion in 2D and 3D matrices [65][66][67][68]. In particular, cell-based models have been recently applied to investigate the roles of proliferation, migration, and necrosis in tumor progression of different cancers (reviewed in [69]). The present study represents a new step in understanding the impact of different traits of cell migration on the progression of breast cancer through an integrated experimentalcomputational approach. Thereby, we provide a possible explanation for the poor survival of breast cancer patients with low expression of Giantin. Loss of this Golgi matrix protein results in fragmentation of the Golgi apparatus, a phenomenon that we found to be very prevalent in patient-derived breast cancer specimens. Our computational approach shows that a simple model accounting for cellular activities and migratory characteristics can be used to study tumor size and the intravasation rate over a given time. Importantly, our model suggests that cells with lower speed lead to a smaller and more compact tumor. Because silencing Giantin led to a reduction in migration speed, our model predicts that tumors with low Giantin expression would be smaller in size. Thus, our computational model indicated the need for further experimental results to explain the clinical survival data. We then noted that silencing Giantin enhanced invasiveness of breast cancer cells and through further computational simulations, we found that this enhanced invasiveness could overcome the effect of the smaller size such that Giantin-low tumors would produce more circulating tumor cells. This notion is the observation that CTCs express lower Giantin levels than the primary tumor.
Several questions arise from our work. One question is whether the compactness of tumors with lower Giantin levels can be validated clinically. Based on this, it will be interesting to evaluate the chemosensitivity of tumors with different levels of compactness. It is possible that chemotherapeutics might have less access to compact tumors, thus making these tumors more likely to relapse after an initial round of therapy. Our mathematical model focused mainly on the cancer cells and did not consider the role of cancer associated fibroblasts or the tumor stroma. It is possible that Giantin-low tumors might be smaller, but also less compact due to the higher invasiveness. Indeed, less compact tumors were found to be associated with lymphovascular space invasion and the formation of lymph node metastases [70]. Our model and the experimental follow-up do not allow to determine how the Giantin-low trait is being selected for. While the acquisition of a more invasive phenotype is an advantage for intravasation, it appears counterintuitive to select for a trait that would cause a growth disadvantage. However, it is possible that this switch between a pro-invasive trait and a pro-growth trait is a reflection of this phenotypic plasticity that is well known in tumors. It is also possible that the invasive phenotype might alter the interaction of cancer cells with their microenvironment and thereby alter tumorigenesis in a way that can only be investigated using in vivo imaging techniques and is therefore beyond the scope of the current study.
The exact explanation of the elevated invasion capability of cells with low Giantin is not yet clear but a few possible scenarios can be hypothesized. One possible explanation is that Giantin depleted cells exhibit a reduction in their speed, thus having more time for efficient degradation of the collagen while moving slowly on top of it. Another scenario is that silencing Giantin induces the expression of matrix metalloproteinases (MMPs) to degrade many components of the extracellular matrix. Finally, there is evidence that fragmentation of the Golgi enhances the rate of intra-Golgi trafficking [71]. Although this has so far only been shown in the context of neurodegeneration, it is possible that a fragmented Golgi would accelerate the secretory transport of MMPs and thereby enhance extracellular matrix degradation.
Taken together, our work used a combination of mathematical modeling and wet-lab experiments and proposed that the speed of cancer cell migration has a major impact on the shape and size of a tumor and this may explain how loss of Giantin could contribute to the process of tumorigenesis.
Supporting information S1 Text.

Impact of Golgi fragmentation on cell migration. (A) Percentage of cells denoted with fragmented Golgi in BT549 cells. (B) Bar plot showing the percentage of cells based on speed, low speed includes group 1&2 whereas high speed includes group 3&4. (C) Bar plot showing the percentage of cells based on persistence, low persistence includes group 1&3 whereas high persistence includes group 2&4. (D) Bar plot showing the averaged persistence ratio per group. Fig E. Number of intravasating cells in a population of a thousand cancer cells.
Statistical significance was determined using the Kruskal-Wallis rank sum test, followed by pairwise comparisons using the Wilcoxon rank sum test with Bonferroni Correction (* p < 0.05, *** p < 0.001). Fig F. Cell' a cell that moved from a to b. Fig G. Simulation field and