The Clinical Significance and Molecular Features of the Spatial Tumor Shapes in Breast Cancers

Each breast cancer has its unique spatial shape, but the clinical importance and the underlying mechanism for the three-dimensional tumor shapes are mostly unknown. We collected the data on the three-dimensional tumor size and tumor volume data of invasive breast cancers from 2,250 patients who underwent surgery between Jan 2000 and Jul 2007. The degree of tumor eccentricity was estimated by using the difference between the spheroid tumor volume and ellipsoid tumor volume (spheroid-ellipsoid discrepancy, SED). In 41 patients, transcriptome and exome sequencing data obtained. Estimation of more accurate tumor burden by calculating ellipsoid tumor volumes did not improve the outcome prediction when compared to the traditional longest diameter measurement. However, the spatial tumor eccentricity, which was measured by SED, showed significant variation between the molecular subtypes of breast cancer. Additionally, the degree of tumor eccentricity was associated with well-known prognostic factors of breast cancer such as tumor size and lymph node metastasis. Transcriptome data from 41 patients showed significant association between MMP13 and spatial tumor shapes. Network analysis and analysis of TCGA gene expression data suggest that MMP13 is regulated by ERBB2 and S100A7A. The present study validates the usefulness of the current tumor size method in determining tumor stages. Furthermore, we show that the tumors with high eccentricity are more likely to have aggressive tumor characteristics. Genes involved in the extracellular matrix remodeling can be candidate regulators of the spatial tumor shapes in breast cancer.


Introduction
The amount of cancer cell accumulation that is reflected by the tumor's spatial shape is the result of constant interaction between the proliferating cancer cells and their microenvironment. Along their spatial growth, the cells of solid cancers initiate the process of invasion and metastasis that can ultimately lead to fatal distant diseases. The degree of tumor cell accumulation in the primary organ is often measured by the longest diameter, an integral component of the widely used TNM staging system [1,2]. The largest tumor diameter is regarded to represent the risk of cancer metastasis and the probability of distant recurrences. However, the degree of cancer cell accumulation can be poorly determined when the tumor size is solely assessed by the uni-dimensional diameter since each human tumor has a unique three-dimensional shape. Accordingly, researchers have proposed better prognostic models based on the tumor volume measurement rather than using single diameter for various types of cancers [3,4]. The optimal method of measuring tumor burden in the primary organ remains to be tested.
Another issue that should be addressed with regard to the spatial tumor growth is the clinical implications and the underlying mechanisms for the inter-tumoral variations of the spatial tumor shapes. It is largely unknown how each tumor shape its spatial contour and what are the underlying differences in molecular characteristics. Recent studies are now beginning to elucidate the molecular characteristics of this spatial tumor growth. For example, colorectal tumors that show laterally spreading patterns show unique gene expression features including β-catenin, type IV collagen, and aPKC [5]. Mathematical modeling of the spatial tumor growth has been often used to explain the process of longitudinal tumor growth [6][7][8]. Although the modeling approach can reveal many novel aspects of tumor growth, the approach is limited by the difficulties in incorporating other clinical characteristics of tumors.
In this study, we aimed to explore the usefulness of the tumor volume measurement in predicting outcomes of the breast cancer patients. Additionally, we investigated the inter-tumor variations of eccentricity in three-dimensional tumor shapes and the association of this eccentricity with known important prognostic factors in breast cancer. Finally, in a small cohort of breast cancer patients, we explored the relationship between the spatial tumor shape and molecular characteristics of tumors.

Patients and database
The use of the clinical and pathologic data from breast cancer patients for this study was approved by the institutional IRB of Seoul National University Hospital. The written informed consents were obtained prior to the tissue collection for breast cancer tissue repository (IRB No 1405-088-580). For the retrospective analysis, the patients record and identity were anonymized and de-identified prior to analysis by approved researchers (IRB No 1504-057-664). All procedures were done in accordance with the Declaration of Helsinki.
The demographic, clinical, and pathologic information of the studied patients were obtained from the Seoul National University Hospital Breast Care Center Database. The detailed information of the database has been described previously [9]. We retrieved data of all breast cancer patients who underwent breast cancer surgery between Jan 2000 and Jul 2007. Exclusion criteria were patients with multifocal or multi centric tumors; patients who received preoperative systemic treatment, patients who underwent excisional biopsy for the diagnosis of cancer, patients with tumors larger than 10cm, patients with no available three-dimensional tumor size measurement, and patients without immunohistochemistry subtype information. Three-dimensional tumor diameters were measured by the pathologists at the time of pathologic diagnosis.
SED was defined as the proportional volume discrepancy between the STV and ETV of each tumor (SED = (STV-ETV)/STV). SED value increases as the tumors have more ellipsoid spatial shapes. For example, If a tumor has identical three-dimensional diameters, the SED of the tumor would be zero.

Transcriptome and exome profiles associated with SED
The results of transcriptome and exome sequencing of 120 breast cancer tissues, which was approved by the Seoul National University Hospital IRB (IRB No 1109-007-376), has previously been reported [11]. In this study, the data of 41 breast tumors with available threedimensional tumor diameters were analyzed. Breast cancer tissues were collected at the curative surgery for patients who gave informed consents.
Briefly, total RNA was obtained from archived tumor tissues and cDNA library was constructed with the TruSeq RNA kit. The kit protocol included polyA-selected RNA extraction, RNA fragmentation, random-hexamer-primed reverse transcription, and 101 nucleotide paired-end sequencing, which was performed on an Illumina HiSeq2000.
We had checked quality of reads by FastQC v0.11.3 and removed 5 bp at both fragment ends by NGSQCToolkit v2.3.3. We had used in-house custom script to determine fragment sizes and standard deviations after all the paired-end reads were mapped onto NCBI RefSeq transcriptome by BWA v0.7.10. Average fragment size is 195 bp and average standard deviation is 68. Tuxedo protocol (bowtie v2.2.4 and tophat v2.0.13) was used to map the reads onto hg19 human reference genome with refGene transcriptome downloaded from UCSC genome browser. We used HTSeq v0.6.1 to extract reads counts for each gene and in-house R scripts to assess the correlation and linear regression for gene expression and SED values. We have used normalized FPKM values from all RNA-Seq data in order to find positively and negatively correlated genes with SED values. Pearson correlation and p-Value were calculation by cor.test module in R package. We have chosen final candidate genes by selecting p-Value < 0.01, means > 10, and CV > 1. For pathway analysis, we have generated connection network diagrams by using Pathway Studio Web [12].
For whole exome sequencing, genomic DNA was extracted from tumor tissues and blood samples using a QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA). DNA integrity was verified by electrophoresis on 0.8% agarose gels. The quality and quantity of the DNA were measured using a NanoDrop Spectrophotometer and Quant-iT PicoGreen dsDNA Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA), respectively. An amplicon library was generated using SureSelectXT whole exome v4.0 (Agilent Technologies, Santa Clara, CA, USA) and the whole-exome sequencing (WES) was performed with an Illumina HiSeq 2500 system (Illumina, Inc., San Diego, CA, USA). Among, the 41 patients, two showed poor quality in exome sequencing data, and the final exome analysis was done in 39 patients. Absolute copy numbers based on copy-number variations of samples are inferred by absCN-seq [13], and clonality analysis is performed by sciClone [14]. Somatic mutations are calculated by VarScan v2.3.6 and variants are annotated by in-house annotation pipeline.

Quantitative real time-PCR (qRT-PCR)
Quantitative real time-PCR was performed in 96 well PCR plate (Thermo scientific) containing the SYBR green Master Mix (Applied Biosystems), distilled water, 10 ng of cDNA templates from breast cancer tissues and 200 nM of MMP13 forward and reverse primers. Specific primer sequences are 5'-CTTGACCACTCCAAGGACCC-3' (forward), 5'-CCTCGGAGACTGGTAA TGGC-3' (reverse). qRT-PCR analysis was performed with the 7300 Real Time PCR System (Applied Biosystems) using the following conditions: 95°C for 10 min, followed by 40 cycles of 95°C for 15 sec and 60°C for one min. The results were normalized to the housekeeping gene, β-actin, and the cycle threshold (Ct) values were analyzed. These experiments were repeated three times.

Statistical analysis
Patients were classified into four groups according to their expression status of hormonal receptor (HR) status and HER2 amplification status. HR status was defined as positive when the tumor showed positive expression of either ER or PR. The ER and PR positivity were defined by the cut-off of 10% and HER2-positive tumors were either strong positive on immunochemical staining or gene amplification on FISH. The characteristics were compared by using the χ2 test and Student's t-test. Univariate survival analysis was carried out using the Kaplan-Meier method, and log-rank tests were employed for comparison of survival curves. Multivariate analyses were conducted using Cox's proportional hazard regression model. The relationship between MMP13 gene expression and molecular subtypes of breast cancer was assessed by using RNA Seq data downloaded from cbioportal (http://www.cbioportal.org) [15].

Various types of tumor volume (TV) measurements and survival prediction
We calculated various types of TV measurement in 2,250 breast cancer patients who underwent surgery between Jan 2000 and Jul 2007 based on their three-dimensional diameters measured during the pathologic examinations. As shown in Fig 1a, there was a significant difference between spheroid, oblate, prolate, and ellipsoid TV measurements. The median TV was 5.57 (±39.9) cm3 for spheroid TV measurement and 2.54 (±11.5) cm3 for ellipsoid TV measurement, respectively. In some tumors, the types of TV measurement resulted in change in the ranks of tumor size as shown in the Fig 1b. Ellipsoid TV can reflect the true tumor volume more accurately compared to spheroid TV since it takes account of all three-dimensional diameters. We analyzed whether this potentially improved TV estimation can improve the prediction of recurrence in breast cancer patients. As shown in the Fig 2, classifying patients according to the ellipsoid tumor volume did not improve the prognosis prediction. Our data suggest that improved tumor volume estimation does not lead to better survival prediction.
The association between spatial tumor shape, SED, and molecular subtypes of breast cancer We assessed the effect of breast cancer molecular subtypes on the spatial tumor shapes. First, we analyzed the relative lengths of the diameter b (b/a) and c (c/a) of tumors in different  molecular subtypes as defined by the HR and HER2 expression status. As shown in Fig 3a, the relative lengths of b and c were significantly shorter in HR+ tumors when compared to those of the HR-/HER2-tumors. To measure the tumor eccentricity quantitatively, we calculated the spheroid-ellipsoid discrepancy (SED) for all studied tumors. The SED ranges from 0 to 1, and the SED would be 0 for the tumors that are completely spheroid. As expected, the HR-/HER2tumors had significantly lower SED when compared to HR+ tumors (Fig 3b). In overall, the triple negative breast cancers, which are HR-/HER2-tumors, had most spheroid shape compared to other subtypes.

Prognostic significance of the tumor eccentricity
The prognostic implication of the tumor eccentricity on the distant-metastasis free survival of the breast cancer patients was analyzed. There was a significant association between the degree of eccentricity and the time to distant metastasis in patients with hormone receptor negative tumors (Fig 4). Tumors with higher SED had shorter time to distant metastasis. The prognostic significance of the tumor's SED can be attributed to the association between the SED and known poor prognostic factors such as tumor size, node metastasis, and histologic grade (Table 1). When the patients were classified according to the HR and HER2 status, the significant association between the SED and lymph node metastasis was only seen in HR-/HER2-tumors. After   adjusting for other prognostic factors, the SED did not show a significant impact on distant metastasis (S1 Table). These results show that, although not an independent prognostic factor, the degree of eccentricity is associated with various clinical and pathologic prognostic factors and tumor recurrence in a subset of breast cancer patients.

Gene expression profiles, somatic mutations, and clonality
To investigate the gene expression profiles underlying the determination of breast cancer eccentricity, we obtained the transcriptome data generated by a paired-end massively parallel RNA sequencing of 41 breast tumors, which was a part of the breast cancer RNA sequencing project [11]. After filtering the gene expression data with the significance of correlation, the level of expression, and the coefficient of variation, there were 39 and 15 genes that showed significant negative and positive correlation with SED, respectively ( Table 2). Among the significantly correlated genes with highest p value, there were genes associated with extracellular matrix remodeling such as ADAMTS12 and MMP13 (Fig 5a). MMP13 was significantly associated with SED in both ER positive and ER negative subsets while ADAMTS12 showed marginal significance in ER negative tumors (Fig 5b). We tested the association between the MMP13 expression and SED in an independent cohort of 52 breast cancer patients by qRT-PCR using fresh tissues obtained from the curative surgeries. As shown in the Fig 5c, the MMP13 expression was significantly associated with the SED of the 52 tumors, validating our initial observations. These results suggest that tumor-stromal interaction mediated by MMP13 can contribute to the spatial tumor shape development in breast cancers. Molecular network analysis of the genes having correlated expression with SED is shown in Fig 6a. Network analysis shows that MMP13 gene expression can be positively regulated by ERBB2 and S100A7A, both of which are negatively correlated with SED. The relationship between MMP13 and ERBB2 activity was further investigated by using TCGA RNA Seq dataset. The expression level of MMP13 was significantly higher in HER2-enriched subtype when compared to those of other molecular subtypes (Fig 6b).
The degree of cancer cell clonality and the degree of global somatic mutation were also tested for their association with the tumor eccentricity. The numbers of cancer cell clonality and the numbers of the somatic mutations for each tumor were examined by using the whole exome sequencing data of the same 39 tumors. Neither the cancer cell clonality nor the degree of somatic mutation showed significant correlation with the SED (S1 Fig).

Discussion
In this study, we aimed to describe the clinical relevance of the three-dimensional growth patterns in breast cancer and the biologic explanation underlying the diverse tumor growth patterns. First, we have demonstrated that the ellipsoid tumor volume measurement, which is theoretically more accurate way of estimating true tumor volume, can result in substantial changes in tumor size determination. Wapnir et al [3] has also demonstrated that there is a significant overestimation of tumor volume when using the greatest diameter alone for early breast cancers. They have also suggested that by measuring tumor volume, one can improve the accuracy of prognosis prediction in early breast cancer in their analysis of 165 breast cancer patients. However, our results show that classifying patients according to their tumor volumes did not improve the outcome prediction compared to the current tumor size measurement. The implication of this finding is that measuring the total amount of cancer cell does not provide additional information on the tumor aggressiveness to the conventional measurements. Our finding support the idea that a small proportion of cancer cells in solid tumors, mostly   Three-Dimensional Shapes of Breast Cancer located at the invading front, determines the local invasion and metastasis while the remaining tumor cells are non-metastastic [16]. Also, it is consistent with our previous report showing that the presence of the additional invasive tumors does not lead to worse outcomes in luminal subtypes of breast cancer [17].
Our results also show that the molecular subtypes of breast cancer affect the patterns of the 3-dimensional tumor growth in breast tumors. Three-dimensional tumor diameters, measured by pathologic examination, showed that triple negative tumors had least eccentric shape when compared to other subtypes. Recent breast imaging studies have also shown similar findings showing triple negative tumors having more round shape and smooth margins [18]. In consistent with our results, Bae et al [19] have shown by quantitative MR imaging of 280 breast cancer patients that triple negative tumors have more round tumor shape than other major subtypes. In addition to its association with molecular subtype, the spatial tumor growth pattern in breast cancer, measured by SED, is also associated with known prognostic features such as tumor size, lymph node metastasis, and histologic grade. The tumors with high eccentricity (high SED) showed worse survival outcome due to the high incidences of the unfavorable prognostic factors.
Our transcriptome analysis data of human breast cancer tissues suggests a potential link between the spatial growth patterns and the expression levels of extracellular matrix remodeling genes such as MMP13 which show a significant negative correlation with the SED. Network analysis showed that MMP13 gene expression can be regulated by ERBB2 and S100A7A, both of which also showed negative correlation with SED. ERBB2 is a major gene that determine the molecular subtype of breast cancer [20] and S100A7A has been shown to be down-regulated in estrogen receptor negative tumors [21]. Furthermore, our analysis of TCGA data further showed that MMP13 expression is significantly higher in HER2-enriched subtype and triple negative subtype showed lowest MMP13 expression. Our observation can generate a hypothesis that ERBB2 and S100A7A-mediated MMP13 expression can contribute to the subtype-specific spatial growth patterns in breast cancer.
On the other hand, our study cannot fully explain the molecular mechanisms underlying the relationship between the increased eccentricity and higher incidence of poor prognostic factors in breast cancer. Some of the genes having negative correlation with SED were reported Three-Dimensional Shapes of Breast Cancer to accelerate breast cancer progression. For example, previous studies have shown pro-tumorigenic and pro-metastatic role of ERBB2, WNT7b, S100A7, and MMP13 in breast cancer [22][23][24][25][26][27]. However, Visozo et al [28] have shown that the protein expression levels of MMP13 showed negative correlation with advanced tumor stages suggesting a potential tumor-suppressive role of MMP13 in breast cancers. Furthermore, a recent study showed that silencing MMP13 in the stromal cells increased the rate of mammary cancer metastasis via mechanisms involving peri-tumoral collagen I remodeling [29]. These findings suggest a complex role of MMP13 within the tumor microenvionment, which needs further exploration.
Our study carries several important limitations. First, our assessment of tumor shape is based on the pathologically measured tumor diameters. The complex contour of the human breast cancer cannot be fully reflected by the dimensional diameter alone. An improved tumor shape assessment such as an automated imaging tool may give a better understanding into the spatial tumor shape formation. Second, our RNA sequencing data was derived from a small number of patients. Expanding the analysis to larger patient cohort for whom both transcriptome data and spatial shape data are available can give more clear insight into the molecular mechanism underlying the tumor shape heterogeneity.

Conclusion
In conclusion, our present study demonstrates the accuracy of current tumor size measurement system in predicting breast cancer survival. However, during the analysis, we have observed a significant difference between breast cancer molecular subtypes in terms of tumor eccentricity. The degree of eccentricity was associated with various known prognostic factors such as tumor size and lymph node metastasis. Additionally, by using RNA sequencing, we show that the genes involved in the extracellular matrix remodeling such as MMP13 may contribute to the tumor shape heterogeneity.