Genomic Profiling of Oral Squamous Cell Carcinoma by Array-Based Comparative Genomic Hybridization

We designed a study to investigate genetic relationships between primary tumors of oral squamous cell carcinoma (OSCC) and their lymph node metastases, and to identify genomic copy number aberrations (CNAs) related to lymph node metastasis. For this purpose, we collected a total of 42 tumor samples from 25 patients and analyzed their genomic profiles by array-based comparative genomic hybridization. We then compared the genetic profiles of metastatic primary tumors (MPTs) with their paired lymph node metastases (LNMs), and also those of LNMs with non-metastatic primary tumors (NMPTs). Firstly, we found that although there were some distinctive differences in the patterns of genomic profiles between MPTs and their paired LNMs, the paired samples shared similar genomic aberration patterns in each case. Unsupervised hierarchical clustering analysis grouped together 12 of the 15 MPT-LNM pairs. Furthermore, similarity scores between paired samples were significantly higher than those between non-paired samples. These results suggested that MPTs and their paired LNMs are composed predominantly of genetically clonal tumor cells, while minor populations with different CNAs may also exist in metastatic OSCCs. Secondly, to identify CNAs related to lymph node metastasis, we compared CNAs between grouped samples of MPTs and LNMs, but were unable to find any CNAs that were more common in LNMs. Finally, we hypothesized that subpopulations carrying metastasis-related CNAs might be present in both the MPT and LNM. Accordingly, we compared CNAs between NMPTs and LNMs, and found that gains of 7p, 8q and 17q were more common in the latter than in the former, suggesting that these CNAs may be involved in lymph node metastasis of OSCC. In conclusion, our data suggest that in OSCCs showing metastasis, the primary and metastatic tumors share similar genomic profiles, and that cells in the primary tumor may tend to metastasize after acquiring metastasis-associated CNAs.


Introduction
Oral squamous cell carcinoma (OSCC), which accounts for more than 90% of all oral cancers, is the most common type of head and neck squamous cell carcinoma (HNSCC), and in 2008 was responsible for about 128,000 deaths worldwide [1]. The presence of cervical lymph node metastasis is associated with a 50% decrease in the 5-year survival of patients with OSCC [2,3]. Therefore, it is important to detect or predict the presence of lymph node metastasis in order to treat OSCC effectively. However, even examinations using imaging techniques such as CT, MRI and ultrasonography are still not reliable for detection of micrometastases because of the high incidence of occult neck disease [4][5][6][7][8][9][10][11]. Furthermore, although many parameters of the primary tumors, such as size, thickness and altered gene expression [12][13][14][15][16][17][18], have been reported to be useful for identifying nodenegative patients with a high risk of harboring occult node metastasis [18], the mechanisms by which tumor cells spread from the primary site to local lymph nodes are not well understood [19].
Early studies of colorectal and pancreatic cancers led to a notion that the development and progression of these cancers are associated with accumulation of chromosomal aberrations, a concept referred to as the multistep tumorigenesis model [20]. In fact, it has been reported that the total number of genomic aberrations increases with tumor progression in various tumor types [21]. Meanwhile, recent studies have established extension of the model, designated the clonal evolution model [22][23][24][25], in which a single clone evolves into several distinct subpopulations through accumulation of diverse genetic abnormalities. The predominant population may be replaced by distinct subpopulations within a single tumor mass through the effects of environmental selection pressure and/or the stage of tumor progression. As a consequence, several genetically heterogeneous subpopulations coexist within a single tumor mass. Despite the emergence of these tumor progression models, understanding of the molecular and cellular mechanisms underlying lymph node metastasis is still limited [26,27]. Since 1) the break point and patterns of genomic copy number aberrations (CNAs) tend to be reproduced in descendant clones, and 2) genomic aberrations contribute to the malignant behavior of tumor cells by dysregulating the expression of oncogenes or tumor suppressor genes [28], comparison of genomic profiles between a primary tumor and its paired metastasis should provide clues to the process and mechanism of metastasis.
Array-based comparative genomic hybridization (array CGH) provides information about genomic copy number aberrations (CNAs) across the entire genome [29]. Array CGH is generally used to identify oncogenic or tumor-suppressive genes located in regions showing copy number aberration. Moreover, CGH can also be applied to studies of tumor clonality by collecting multiple samples from within a single tumor [30][31][32][33]. Although several groups have used array CGH to identify regions harboring oncogenic or tumor-suppressive genes in OSCC [34][35][36][37][38][39][40], the relevance of CNAs in the process of lymph node metastasis has not yet been fully characterized. In OSCC, only one study has  compared the genomic profiles of the primary tumor and its associated metastases using array CGH [38]. However, that study included a relatively small number of cases (8 cases) and did not compare the clonality between the genomic profiles of the primary and the metastasis in each case. In HNSCC, a few studies have analyzed the clonal relationship between primary and metastatic tumors [41][42][43]. However, since those studies used conventional metaphase CGH, which has limited resolution, details of genomic regions showing similarities and differences between the two sites were not fully characterized.
Our aim in the present study was to investigate genetic relationships, such as clonality and heterogeneity, between primary tumors of OSCC and their corresponding metastases using high-resolution array CGH, and to identify genomic CNAs related to lymph node metastasis. For these purposes, we collected tumor samples from the metastatic primary tumors (MPTs), their paired cervical lymph node metastases (LNMs), and nonmetastatic primary tumors (NMPTs), analyzed their genomic profiles by array CGH, and compared these profiles between MPTs and paired LNMs, and between LNMs and NMPTs.

Ethics Statement
This study was approved by the ethics committee of Oita University Hospital (Approval No 520 and P-09-03). Informed written consent was obtained from all patients and/or their families. Patients, Tissue Samples and Extraction of Genomic DNA Twenty-five OSCCs were surgically resected at Oita University Hospital. All patients had sporadic tumors and not multiple primary tumors. Tissue sections were cut from formalin-fixed, paraffin-embedded tissues, and stained with hematoxylin-eosin for histological analysis and with toluidine blue (Wako, Osaka, Japan) for extraction of genomic DNA. Using laser-capture microdissection (Arcturus Engineering, Mountain View, CA, USA) ( Figure 1), we collected a total of 42 samples from 25 patients (Table 1 and  Table S1), including 15 paired samples of MPTs and their corresponding LNMs, 10 samples of NMPTs, and 2 samples that were separately microdissected from the same tissue section of LNM from case 8 (see Figure S1). All samples included a proportion of tumor cells exceeding 80% of the total. Patients with metastatic OSCCs were selected randomly. However, to reduce any selection bias in terms of tumor thickness (TT), which is known to have a strong association with the risk of lymph node metastasis, we selected non-metastatic OSCCs with a TT of more than 4 mm. As a result, the difference of median TT between MPTs and NMPTs did not reach statistical significance (p = 0.615, Mann-Whitney U test). Genomic DNA was extracted according to the standard proteinase K digestion method, followed by phenol/ chloroform extraction. As the source of control DNA, genomic DNA was extracted from tissues of normal renal cortex obtained from 12 patients with ureteral or renal pelvic carcinoma, neither of which exhibited invasion or metastasis to the renal cortex. The same amount of genomic DNA extracted from 12 patients was mixed and used as the control DNA.

Array CGH
Array-CGH analysis was performed using 44 k oligonucleotide CGH arrays (Agilent Technologies, Palo Alto, CA, USA). Labeling and hybridization were performed in accordance with the manufacturer's instructions. Briefly, 1.5-2 mg of tumor DNA and an equal amount of control DNA were digested with AluI and RsaI (Promega, Madison, WI, USA). The digested tumor and control DNAs were labeled with Cy5-dUTP and Cy3-dUTP, respectively, using a Genomic DNA Labeling Kit Plus (Agilent Technologies), purified with Microcon YM-30 filters (Millipore, Billerica, MA, USA), and concentrated to 80.5 ml. Equal amounts of tumor and control DNAs were subsequently pooled and mixed with human Cot-1 DNA, dissolved in hybridization buffer (Agilent Technologies), denatured and hybridized to the CGH array at 65uC for 24 h. Glass slides were washed and then scanned in accordance with the manufacturer's instructions.

Data Analysis
Microarray images were analyzed using FEATURE EXTRAC-TION v.9.5.3.1 (Agilent Technologies) with linear normalization (protocol CGH-v4_95_Feb07), and the resulting data were imported into DNA Analytics v.4.0.8.1 (Agilent Technologies). After normalization of the raw data, the log2ratio of Cy5 (tumor) to Cy3 (control) was calculated. Aberrant regions were determined using the ADM-2 algorithm at a threshold of 7.0. To detect gains and losses, we set the values of parameters for the aberration filters as: minimum number of probes in region 2, minimum absolute average log2ratio for region 0.163, maximum number of aberrant regions 10,000, and percentage penetrance per feature 0. We set  the value of the minimum absolute average log2 ratio at 0.163 to detect regions showing a change in the averaged copy number of more than 1.12-fold (log2(1.12) = 0.163). We selected this setting to derive values at least as conservative as the default ones. This was confirmed by a set of reference-versus-reference CGH analyses using the ADM-2 algorithm employing the same aberration filters, in which no aberrant regions were detected (data not shown). This indicated that the estimated false positivity rate was nearly zero, and that our parameter setting was sufficiently conservative. Data generated by probes mapped to the X and Y choromosomes were eliminated. All CNAs detected in each sample and the qualities of array data are summarized in Tables S2 and S3, respectively. All the data are MIAME-compliant, and have been deposited in the GEO database (http://www.ncbi.nlm.nih.gov/geo/, accession number GSE36942). Similarities of genomic profiles between two samples were expressed as concordance rates for probes with which aberrantly gained or lost signals were detected by array CGH. Rates of concordance between paired or non-paired MPTs and LNMs were calculated. In unsupervised hierarchical clustering, we used complete shrinkage as the cluster merge option and (uncentered) correlation as the similarity metric, which were the default settings of the Gene cluster 3.0 software.

Statistical Analysis
Paired t test and Fisher's exact test were used. Differences at P,0.05 were considered to be statistically significant.

Close Similarity with a Minor Degree of Heterogeneity between Genomic Profiles of MPTs and Paired LNMs in Metastatic OSCCs
To investigate the genetic relationship between MPT and the paired LNM in each case, we analyzed the genomic profiles of 15 MPT-LNM pairs using array CGH. One representative case (Case 8) is shown in Figure 2. The MPT and paired LNM of this case shared a similar profile pattern across the entire genome (Figure 2A), especially in chromosome 11q ( Figure 2B), suggesting that tumor cells in the MPT and paired LNM of this case were clonally related. In the same case, however, there were also distinct genomic aberrations in chromosomes 2p, 2q, 7p, 8q, 16q and 20q (Figure 2A). Loss of 2p was observed only in the MPT, and not in the LNM, and loss of 2q and gain of 7p were observed only in the LNM, and not in the MPT ( Figure 2C and D), suggesting that the MPT and paired LNM in this case were composed of genetically distinct subpopulations. To show the geographical distribution of subclones in the same tissue section, we performed separate dissection on tissue sections of LNM from case 8. As shown in Figure S1A, B and C, we collected tumor cells from two sites on the LNM tissue sections and analyzed genomic aberrations by array CGH. Although we were unable to find any clear difference between the patterns of genomic aberration of the two samples ( Figure S1D), we found differences in the log2 ratio of aberrations. As shown in Figure S1E, the intensity of 11q13 amplification was slightly higher in LNM2 than in LNM1 (averaged log2 ratios are 3.2 and 2.1, respectively, indicated by shaded areas), while that of 7p12 amplification was clearly higher in LNM1 than in LNM2 (averaged log2 ratios are 2.4 and 1.2, respectively, Figure S1F). These results suggest that these two sites may be composed of clonally related but genetically distinct subclones. Additionally, the aberration pattern of chromosome 3q of LNM2 was more similar to that of PMT than to that of LNM1 ( Figure S1G), suggesting that tumor cells in LNM2 may be more closely related to those in PMT.
We also compared the genomic profiles of the other 14 paired MPT and LNM samples individually. Although some distinctive differences in genomic profiles were observed between MPTs and paired LNMs, the paired samples in all cases showed similar patterns of CNAs across the entire genome ( Figures S2, S3 and  S4). Furthermore, unsupervised hierarchical clustering based on array CGH data from MPTs and LNMs grouped together 12 of the 15 pairs (Figure 3), indicating that the genomic profile of the LNM was most similar to that of the paired MPT for the 12 clustered pairs. In addition, we analyzed the significance of similarities between MPTs and paired or non-paired LNMs by calculating the concordance rates (See Materials and Methods). As shown in Figure 4, the concordance rate was highest between the MPT and the paired LNM in 13 of the 15 cases. The median of the concordance rate between paired samples was significantly higher than that between non-paired samples (Mann Whitney U test, p,0.01). With a few notable exceptions, such as case 10, in which CNAs of paired MPT and LNM differed the most in clustering analysis ( Figure 3) and did not show highest similarity between the paired samples (Figure 4), our results suggested that MPTs and paired LNMs contain predominantly clonal tumor cells, while minor subpopulations with different CNAs may also exist in metastatic OSCCs.
Next, to determine whether acquisition of CNAs is required for spread of tumor cells from the primary site to regional lymph nodes, we compared the number of CNAs between MPTs and paired LNMs. Seven of the 15 cases showed an increased number of CNAs in the LNM, 6 cases showed a decrease, and the remaining 2 cases showed no change ( Figure 5A). As a result, there were no significant differences in the number of CNAs between MPTs and paired LNMs ( Figure 5A, p = 0.742). Furthermore, to identify CNAs related to lymph node metastasis, we compared the frequencies of CNAs between grouped samples of MPTs and LNMs ( Figure 5B), but were unable to find any CNAs that were significantly more common in LNMs (Table S4). Thus, our data suggested that additional CNAs are not necessarily required in order for tumor cells to spread from the primary site to regional lymph nodes in OSCC.

Identification of Metastasis-associated CNAs in OSCCs
Since no statistically significant differences were detected in the frequencies of CNAs between MPTs and paired LNMs ( Figure 5B), we hypothesized that subpopulations carrying metastasis-related CNAs might be present in MPTs as well as LNMs. Furthermore, it was possible that MPTs might also contain non-metastatic subpopulations. For example, in case 8, loss of 2p was detected only in the MPT, and not in the LNM ( Figure 2C). Therefore, to detect specific CNAs involved in cervical lymph node metastasis, we compared the genomic profiles of NMPTs (Figures S5 and S6) and LNMs of metastatic OSCC. As shown in Figure 6, gains at 3q,  9q,11q13, 14q, 16p, 18p and 20q, and losses at 3p, 4q, 8p, 9p, 10p, 13q, 18q and 21q were detected at a frequency of more than 50% in both NMPTs and LNMs (Table S5), suggesting that these CNAs may be involved in the development of OSCC. On the other hand, gains at 7p, 8q and 17q were differentially detected in LNMs (p,0.05, Figure 6 and Table 2), suggesting that these CNAs may be involved in the lymph node metastasis of OSCC. Interestingly, losses at 1p, 5q, 9p and 19p were detected more frequently in NMPTs than LNMs (p,0.05, Figure 6 and Table2), suggesting that these CNAs may be related to a non-metastatic phenotype of OSCC cells.

Discussion
In this study, the highest similarity of genomic profiles was observed between MPTs and paired LNMs in 13 out of 15 cases of metastatic OSCC (Figure 4). Furthermore, unsupervised hierarchical clustering grouped 12 of the 15 MPT-LNM pairs (Figure 3), suggesting that the MPT and paired LNM of each metastatic OSCC may share a similar genomic profile pattern. In fact, we were unable to find any significant difference between the averaged frequencies of CNAs in MPTs and those in LNMs ( Figure 5B). Meanwhile, we found that all of the cases of metastatic OSCC showed some distinctive genomic CNA patterns between the MPT and paired LNM ( Figure S2, S3 and S4). Of note, CNAs that were detected specifically in the MPT but undetected in the LNM were found in 11 of the 15 metastatic OSCCs, suggesting that not all of the CNAs detected in the MPT are inherited by the tumor cells in the paired LNM. For example, in case 8, loss of 2p was observed only in the MPT, and not in the paired LNM ( Figure 2C). These results imply the co-existence of genetically distinct subpopulations within a single tumor mass of metastatic OSCC. Thus, the findings presented here suggest that although we cannot exclude the possibility that small genetically heterogeneous subpopulations may be admixed in metastatic OSCC, the MPT and paired LNM are composed predominantly of genetically clonal tumor cells.
Based on these findings, we propose a hypothetical model for the development of metastatic OSCC (Figure 7). In this model, the MPT is composed of genetically heterogeneous subpopulations. Among them, subpopulations having metastasis-related CNAs, indicated as small red circles in Figure 7, might metastasize to lymph nodes and form a large part of the subsequent LNM. Indeed, cases with metastasis-related CNAs in MPT, including gains at 7p, 8q and 17q, always harbored the same CNA in the paired LNM (data not shown). In the LNM, one (or a few) subpopulations may again develop further genetically distinct subpopulations through clonal evolution (Figure 7f). To confirm this hypothesis, further studies with larger samples will be required. Interestingly, our model is compatible with the ''punctuated clonal evolution'' model established by Navin et al. [44]. They aimed to reveal the tumor population structure and evolution in two cases of breast cancer by single-cell sequencing and found that some genetically distinct subpopulation with few persistent intermediate subclones reside within a single primary tumor. Their data suggest that tumor cells whose rate of population growth markedly exceeds its rate of genomic evolution can form a subpopulation in a single tumor. Although no previous study has investigated the clonal evolution of OSCC by deep sequencing, including single-cell sequencing, our hypothesis regarding the progression of OSCC might be verified by these sequencing methods in the future.
In this study, we detected gains at 7p, 8q and 17q more frequently in LNMs than in NMPTs, suggesting that these CNAs are associated with lymph node metastasis. Gain at 7p has already been reported to be involved in lymph node metastasis of OSCC [45,46]. Although gain of 8q is frequently detected in OSCCs [37,47], it remains to be determined whether this CNA is actually related to lymph node metastasis of OSCC. In this study, 17q gain was identified as a new candidate CNA related to lymph node metastasis of OSCC. Little is known about the relationship between 17q gain and lymph node metastasis of OSCC. In these regions, many cancer-related genes were located, as summarized in Table 2. Among them, EGFR(7p11.2) [48], TWIST1(7p21.2) [49], RAC1(7p22) [50], MTDH(8q22.1) [51], CCNE2(8q22.1) [52], TNFRSF11B(8q24) [53] and GRB2(17q25) [54] might be good candidates, as they are reportedly associated with invasiveness and metastasis of tumor cells. Further analysis, including gene expression and functional analysis, will be required to clarify this issue. Several previous studies have identified lymph node metastasis-associated CNAs in OSCC [38,45,46,55] These differences may be attributable to differences in sample sizes, comparison methods, and platforms of CGH employed. Among these CNAs, gains at 7p and 11q13 have been identified as lymph node metastasis-related CNAs in multiple studies (.3) including ours, suggesting that these CNAs may play important roles in the metastasis of OSCC. In the present study, gain at 7p was also identified as a lymph node metastasis-related CNA, but gain at 11q13 was not. At the present time, we cannot fully account for this difference between our results and those of others, but one plausible explanation could be that other groups selected non-metastatic OSCC samples randomly, whereas we selected tumors with a TT of more than 4 mm. Whether these differences in findings are due simply to the different criteria employed for the selection of NMPT samples remains to be clarified.
In this study, a high frequency (50-86%) of metastatic OSCCs with gains at 7p, 8q or 17q in LNMs also harbored the same CNA in the paired MPTs. That is, 10 out of 15 cases showed gain at 7p in the LNM and 5 of those 10 cases also had this CNA in the paired MPT (50%). Similarly, 14 out of the 15 cases showed gain at 8q in the LNM and 12 of the 14 cases also had this CNA in the paired MPT (86%). Eleven of the 15 cases showed 17q gain in the LNM, and 7 of those 11 cases also had this CNA in the paired MPT (64%). These results suggest that patients may have a high risk of lymph node metastasis when gains at these CNAs are detected in the primary tumor, even if the patients are diagnosed as having node-negative OSCC. This suggestion is important in the context of diagnosis and treatment of OSCC, because the majority of current therapeutic strategies are decided on the basis of the primary tumor. More extensive development of our data will be required to identify predictive markers of occult LNM in OSCC.
In this study, we found that gains at 3q, 9q,11q13, 14q, 16p, 18p and 20q, and losses at 3p, 4q, 8p, 9p, 10p, 13q, 18q and 21q were detected at a high frequency in both NMPTs and LNMs of OSCC ( Figure 6). To determine whether this tendency is generally observed in OSCC analysis, we compared the frequencies of CNAs in this study with those of publicly deposited CGH arrays (Table S6) [56]. As shown in Figure S7, although most CNAs were detected at a relatively lower frequency than in our analysis, the pattern of the histogram, in which gains at 3q, 11q13 and 20q, and losses at 3p, 8p and 18q were highly detected in 228 OSCC, was similar to that obtained with our data, suggesting that these CNAs may be important for the development of OSCC. On the other hand, some inconsistencies were also observed in the frequency of 8q and 17q gain. We are unable to explain this discrepancy at the present time because of lack of information about lymph node status in the deposited data.
In conclusion, our results suggest that primary tumors of OSCC and their corresponding LNMs share very similar patterns of genomic CNAs, and that cells in the primary tumor tend to become metastatic after acquiring metastasis-associated CNAs, such as gains at 7p, 8q and 17q, during the process of clonal evolution. Further studies will be required to identify the responsible genes located on these CNAs and to clarify the mechanisms underlying the process of lymph node metastasis. This approach may make it possible to predict and treat LNMs by determining whether metastasis-associated CNAs are detectable in biopsy samples from patients with OSCC.