Allelic Variation and Differential Expression of the mSIN3A Histone Deacetylase Complex Gene Arid4b Promote Mammary Tumor Growth and Metastasis

Accumulating evidence suggests that breast cancer metastatic progression is modified by germline polymorphism, although specific modifier genes have remained largely undefined. In the current study, we employ the MMTV-PyMT transgenic mouse model and the AKXD panel of recombinant inbred mice to identify AT–rich interactive domain 4B (Arid4b; NM_194262) as a breast cancer progression modifier gene. Ectopic expression of Arid4b promoted primary tumor growth in vivo as well as increased migration and invasion in vitro, and the phenotype was associated with polymorphisms identified between the AKR/J and DBA/2J alleles as predicted by our genetic analyses. Stable shRNA–mediated knockdown of Arid4b caused a significant reduction in pulmonary metastases, validating a role for Arid4b as a metastasis modifier gene. ARID4B physically interacts with the breast cancer metastasis suppressor BRMS1, and we detected differential binding of the Arid4b alleles to histone deacetylase complex members mSIN3A and mSDS3, suggesting that the mechanism of Arid4b action likely involves interactions with chromatin modifying complexes. Downregulation of the conserved Tpx2 gene network, which is comprised of many factors regulating cell cycle and mitotic spindle biology, was observed concomitant with loss of metastatic efficiency in Arid4b knockdown cells. Consistent with our genetic analysis and in vivo experiments in our mouse model system, ARID4B expression was also an independent predictor of distant metastasis-free survival in breast cancer patients with ER+ tumors. These studies support a causative role of ARID4B in metastatic progression of breast cancer.


Introduction
Breast cancer remains the most commonly diagnosed malignancy among women in the United States [1]. Because the vast majority of breast cancer related mortality is attributable to disseminated metastatic disease, a clear need exists to identify factors that modulate breast cancer metastatic progression. In addition to acquired somatic mutations, there is accumulating evidence that the genetic background on which a tumor arises can influence disease progression [2]. Identifying and characterizing metastasis susceptibility genes would provide additional insights into the mechanisms associated with tumor dissemination and growth, leading not only to better understanding of this complex process but also ultimately to new targets and strategies for clinical intervention.
Due to the complex interactions between inherited factors and somatic mutations in metastatic progression, as well as the genetic complexity of human populations, identification of inherited susceptibility genes directly in human populations is difficult. To circumvent this, our laboratory has chosen to apply a systems genetics approach to a mouse model of metastatic luminal breast cancer, the FVB/N-TgN(MMTV-PyMT)634Mul (MMTV-PyMT) transgenic model. The MMTV-PyMT transgenic mouse model, which expresses the polyoma virus middle T antigen under the control of the mouse mammary tumor virus promoter, rapidly develops tumors in approximately 100% of female mammary glands, and >85% of these animals develop pulmonary metastases by 14 weeks of age. When the MMTV-PyMT model is bred onto a variety of different mouse strains, the F1 progeny display broad and strain-dependent heterogeneity in primary tumor latency, primary tumor growth rate and lung metastatic density [2].
Two strains, the highly metastatic AKR/J and poorly metastatic DBA/2J, were found to have a 20-fold difference in their metastatic capacity but no significant difference in any other measured tumor phenotype. These strains were also the progenitor strains for the AKXD recombinant inbred panel of mice, which consists of more than 20 substrains that are composites of the original parental strains AKR/J and DBA/2J. The MMTV-PyMT model was therefore bred to 18 different AKXD strains, the F1 mice were phenotyped with respect to primary tumor latency and burden and lung metastatic density, and the phenotypes were compared to haplotype maps of the AKXD strains to determine quantitative trait loci (QTLs) associated with mammary tumor progression [3]. Subsequently, RNA was also harvested from F1 tumors and gene expression analysis was performed to define individual genes whose expression correlated with progression [4].
In this study we have utilized these resources to identify Arid4b as a novel candidate metastasis susceptibility gene. Although the precise molecular functions of ARID4B are unknown, it has been shown to associate with the SIN3A histone deacetylase (HDAC) complex [5]. As predicted by the genetic linkage and gene expression data, higher expression of Arid4b is associated with more rapid tumor growth in animal models, as well as increased tumor cell motility and invasion. These effects are associated with differential binding of the AKR and DBA alleles of ARID4B to HDAC complex members mSIN3A and mSDS3. ARID4B was also found to bind the mSIN3A-associated breast cancer metastasis suppressor protein BRMS1. Stable shRNA-mediated knockdown of Arid4b significantly inhibited the pulmonary metastatic efficiency of orthotopic mammary tumors without inhibiting primary tumor growth. Consistent with impaired metastasis in the Arid4b knockdown lines was decreased expression of a recently described metastasis-predictive gene network [6]. High expression of ARID4B was associated with an approximately 2-fold increased risk of metastatic progression in human breast cancer patients who were lymph node negative at diagnosis. Taken together these results demonstrate a causal role for Arid4b in tumor growth and metastatic progression and suggest that mechanisms of action involve modification of epigenetic state via the mSIN3A complex and regulation of the conserved Tpx2 gene network.

Identification of Arid4b as a potential tumor progression gene
Previously, a cross between the highly metastatic PyMT model and the AKXD recombinant inbred (RI) panel was performed to map QTLs associated with inherited predisposition to developing pulmonary metastasis [3]. In addition to metastasis susceptibility loci on chromosomes 6 and 19, linkage analysis revealed a potential peak on proximal chromosome 13 (Figure S1). In a subsequent study, gene expression analysis was also carried out on these samples to examine the effect of varying metastatic genotypes on tumor transcriptional patterns [7]. To discover potential candidate genes that may affect metastatic predisposition, correlation analysis was performed using GeneNetwork [8] to identify genes whose differential expression was highly associated with metastasis. Upon integrating the data from these two studies we found that of the top ten genes most significantly associated with metastasis in our expression correlation analysis, two also mapped to potential QTLs: Ttc9c and Arid4b. The potential role of Ttc9c was investigated and no significant differences were detected with respect to orthotopic tumor growth or metastasis of 6DT1 mouse mammary carcinoma cells stably expressing Ttc9c compared to vector control cells (data not shown). Similarly, we detected no significant effects on tumor growth or metastasis when Ttc9c BAC transgenic mice were bred to the MMTV-PyMT model (data not shown). The most likely explanation for why Ttc9c did not pass our validation experiments is that its initial identification in our screens was a false positive owing to its close physical proximity on chromosome 19 to the metastasis modifier gene Sipa1 [9]. Our current studies have therefore focused on Arid4b, which maps within the chromosome 13 locus and whose mRNA expression was positively associated with metastatic disease and tumor growth (Figure 1), suggesting a possible causative role as a progression modifier.

Author Summary
A person's individual genetic background influences not only the likelihood of developing breast cancer, but also the likelihood of that cancer becoming metastatic. The identification of metastasis susceptibility genes using human samples is rendered impractical by the high degree of genetic diversity among people. Our laboratory's strategy is to cross genetically defined inbred mouse strains to recapitulate a degree of genetic diversity that is more readily studied. By breeding these panels of inbred mouse crosses to a mouse model of breast cancer, we can identify regions of the genome that correlate with observed phenotypic variation including metastatic density and then identify individual candidate genes. This manuscript describes the identification of Arid4b as a candidate gene of interest and the experiments we performed to validate its role in metastasis. High expression of Arid4b enhances cell migration and invasion and, conversely, knockdown of Arid4b inhibits metastasis of breast tumor cells to the lungs. The mouse gene and human ARID4B are highly conserved, and among women with ER+ tumors ARID4B expression level is predictive of which patients will progress to develop metastatic disease.

Arid4b is both differentially expressed and polymorphic between AKR and DBA
To validate the potential differences in Arid4b expression between strains, microarray data from AKR and DBA normal tissues were examined [4]. Consistent with the AKXD RI results, Arid4b expression was 2.3-fold higher in thymus (p = 9.32 × 10^-5, FDR = 0.0004) and 2.5-fold higher in bone marrow (p = 1.28 × 10^-5, FDR = 0.0005) of DBA mice compared to AKR, suggesting that constitutional polymorphisms can influence Arid4b expression levels in normal tissues. Sequence analysis was also performed both to validate SNPs in the public database and to identify potential new variants between the AKR and DBA alleles of Arid4b. Complete exon sequencing revealed that the DBA allele matched the consensus C57BL/6 sequence. Analysis of the AKR allele revealed numerous silent SNPs as well as polymorphisms encoding eleven amino acid substitutions, as shown in Figure 2. Interestingly, eight of these eleven polymorphisms are located in exon 22 and their encoded substitutions are densely clustered towards the C-terminal end between amino acids 1171 and 1198. These results are consistent with the possibility that inherited variation of Arid4b may contribute to tumor progression.

Arid4b expression promotes primary tumor growth
Analysis of the data revealed that increased Arid4b expression and increased metastatic susceptibility were associated with the DBA rather than the AKR genotype at the chromosome 13 QTL. This result suggests that the DBA allele at this locus promoted metastatic progression relative to the AKR allele, and a series of in vitro and in vivo assays were performed to test this hypothesis. V5-tagged AKR and DBA alleles of Arid4b were ectopically expressed in the mouse mammary carcinoma cell line Met-1, which was originally derived from tumors arising in the MMTV-PyMT transgenic model [10]. Because the Met-1 line was derived from an FVB strain background, we also sequenced the FVB allele of Arid4b and found it to be identical to the DBA and C57BL/6 alleles. Cell lines were then identified that expressed the epitope-tagged constructs at levels that were only two- to three-fold higher than endogenous levels as measured by QRT-PCR (Figure S2A), consistent with the approximately two-fold range of Arid4b mRNA between high and low metastatic AKXD strains. Furthermore, the ectopically expressed AKR and DBA alleles were detected at approximately equal levels in our stable lines as assessed by western blots (Figure S2B).
Orthotopic implantation assays were then performed to examine the role of Arid4b expression in vivo (Figure 3A). By four weeks post-implantation, cells expressing the DBA allele formed tumors with a 2.6-fold larger mass compared to control cells (741 mg versus 284 mg; p = 6.08 × 10^-7). The AKR allele expressing cells formed tumors with a median mass of 480 mg, which was significantly larger than control tumors (p = 0.010) but significantly smaller than the DBA cohort (p = 7.73 × 10^-3), consistent with our previous genetic analysis and our in vitro studies.
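The fold differences quoted for the orthotopic tumors follow directly from the reported masses. The short Python sketch below reproduces that arithmetic; the function and variable names are illustrative only, and it assumes the three reported masses are comparable summary values for each cohort.

```python
def fold_change(treated_mg: float, control_mg: float) -> float:
    """Ratio of a cohort's tumor mass to the control cohort's tumor mass."""
    if control_mg <= 0:
        raise ValueError("control mass must be positive")
    return treated_mg / control_mg


# Tumor masses (mg) as reported in the text above.
CONTROL_MG = 284.0  # lacZ control cells
DBA_MG = 741.0      # cells expressing the DBA allele
AKR_MG = 480.0      # cells expressing the AKR allele

dba_vs_control = fold_change(DBA_MG, CONTROL_MG)
akr_vs_control = fold_change(AKR_MG, CONTROL_MG)
dba_vs_akr = fold_change(DBA_MG, AKR_MG)

if __name__ == "__main__":
    print(f"DBA vs control: {dba_vs_control:.2f}-fold")
    print(f"AKR vs control: {akr_vs_control:.2f}-fold")
    print(f"DBA vs AKR:     {dba_vs_akr:.2f}-fold")
```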

DBA Arid4b promotes increased in vitro tumor invasion and migration compared to AKR
In vitro assays were performed to address the potential effect of the Arid4b polymorphisms on tumor cell behavior. In vitro growth assays demonstrated no significant difference in proliferation between cells expressing the DBA or AKR alleles or control cells (data not shown). In contrast, ectopic expression of either allele significantly increased the abilities of Met-1 cells to migrate through a porous membrane and to invade through Matrigel, compared to control cells expressing lacZ (Figure 3B). Notably, Met-1 cells stably expressing the DBA allele were significantly more migratory and invasive than those expressing the AKR allele. Since both cell lines express the epitope-tagged construct at approximately the same level, these results suggest a potential functional consequence for the amino acid substitutions present between the two variants in addition to the effects associated with differential expression.

Knockdown of Arid4b inhibits pulmonary metastasis
Because Met-1 cells are poorly metastatic in our laboratory, and because we were unable to stably overexpress Arid4b in several more aggressive mouse breast cancer cell lines, we adopted a knockdown strategy to examine the role of Arid4b in lung metastasis in vivo. To this end, the highly metastatic 6DT1 cell line [11] was transduced with five lentiviral shRNAs targeting Arid4b, or a scrambled control, and knockdown of ARID4B protein was evaluated using western blots (Figure 4A) and densitometry (Figure 4B) to select stable shRNA lines for in vivo studies. No significant knockdown was observed using the scrambled control shRNA. Cells stably transduced with Arid4b shRNAs designated H3 and H4 expressed 81% and 85% less ARID4B protein, respectively, compared to controls, and were therefore selected for further in vivo study.
Following orthotopic implantation of 10^5 cells into the mammary fat pad we observed only slight differences in median primary tumor mass between the scrambled control, H3, and H4 cohorts, and these data did not achieve statistical significance (p = .070, Kruskal-Wallis; Figure 4C). In contrast, we observed a 2-fold decrease in the median number of macroscopic lung metastases in the H3 cohort (10 vs. 22; p = .013) and a 7-fold decrease in the H4 cohort (3 vs. 22; p = 9.72 × 10^-5) compared to controls (Figure 4D). Differences in lung metastasis between the two Arid4b knockdown cohorts were not statistically significant following post hoc testing (p = .066, Conover-Inman). These data demonstrate that ARID4B protein levels are a critical determinant of pulmonary metastatic efficiency in this model system.
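For readers unfamiliar with the Kruskal-Wallis test used for the three-cohort comparison, the following is a minimal standard-library sketch of the rank-based H statistic with tie correction. It is not the authors' analysis code, and the example values are toy numbers rather than study data. Because three groups give two degrees of freedom, the chi-square tail probability reduces to exp(-H/2), which the sketch relies on.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(groups):
    """Tie-corrected H statistic and chi-square p-value for exactly 3 groups."""
    if len(groups) != 3:
        raise ValueError("closed-form p-value below assumes exactly 3 groups")
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    p_value = math.exp(-h / 2)  # chi-square survival function for df = 2
    return h, p_value


# Toy example (hypothetical numbers, NOT data from this study).
toy_h, toy_p = kruskal_wallis_three_groups([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

A significant omnibus result would then be followed by pairwise post hoc comparisons, such as the Conover-Inman procedure named above, which this sketch does not implement.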

Arid4b germline polymorphism modifies binding to mSIN3A
Previous studies demonstrated that ARID4B is a member of the mSIN3A HDAC complex and that binding to mSIN3A involves the C-terminal domain of ARID4B [5], where the majority of the amino acid substitutions were found between the AKR and DBA variants (Figure 2). Co-IP analysis was therefore performed to examine a potential effect of the observed amino acid substitutions on ARID4B-mSIN3A binding. For these experiments V5-tagged ARID4B was transiently transfected into HEK293 cells and immunoprecipitated using an anti-V5 antibody. Binding to endogenous mSIN3A and another component of the mSIN3A complex, mSDS3 [12], was evaluated by western blots (Figure 5). Input controls for ARID4B, mSIN3A, and mSDS3 were approximately equal, as were the amounts of the two Arid4b variants immunoprecipitated; however, a marked decrease in binding to mSIN3A was observed along with diminished mSDS3 association for the DBA variant (Figure 5A). Densitometry analysis revealed that binding of the DBA variant was reduced by 51% and 37% for mSIN3A and mSDS3, respectively, compared to AKR (Figure 5B). These results demonstrate a functional consequence of Arid4b polymorphisms and provide insight into one potential molecular mechanism whereby Arid4b may modulate breast cancer progression.

ARID4B binds the breast cancer metastasis suppressor BRMS1
Breast cancer metastasis suppressor 1 (BRMS1) belongs to the same family of proteins as mSDS3 and is known to associate with the mSIN3A complex as well as ARID4A [13]. Because ARID4B is also known to bind mSIN3A, mSDS3, and ARID4A [14], we postulated that ARID4B might physically bind BRMS1. Proteomics screens to identify BRMS1 interacting proteins also support this association: in a yeast two-hybrid screen for proteins binding full-length BRMS1, ARID4B was the number one hit identified, and ARID4A and mSDS3 were also detected (unpublished data). In a separate screen, mass spectrometry was performed to identify mSIN3A binding proteins in MCF10A human breast epithelial cells. Peptides representing endogenous ARID4B and BRMS1 were detected, providing further evidence for this interaction and demonstrating that it is not simply an artifact of supraphysiologic expression in transfected cells (Douglas Hurst; personal communication). To validate this interaction we performed co-IPs using lysates from 293 cells transiently transfected with the FLAG-tagged AKR or DBA variants of ARID4B along with either HA- or myc-tagged BRMS1. HA-BRMS1 was readily detected following pull-down of ARID4B using an anti-FLAG antibody (Figure 6A). Likewise, ARID4B was efficiently co-precipitated with myc-BRMS1 (Figure 6B). Unlike the associations with mSIN3A and mSDS3, however, the AKR and DBA variants of ARID4B did not exhibit differential binding to BRMS1. One possible explanation for this observation is that BRMS1 binds to a different region of ARID4B than the polymorphic C-terminal domain that mediates binding to mSIN3A.

Arid4b regulates the metastasis-associated TPX2 gene network
Because little is known about the specific cellular processes regulated by Arid4b that might influence the metastatic phenotype, we performed expression microarray analysis on the 6DT1 cell lines stably expressing Arid4b shRNAs to identify genes that are differentially expressed as a function of Arid4b levels. Based on the western blot densitometry shown in Figure 4A-4B, cell lines expressing hairpins H3 and H4 were chosen to represent the Arid4b knockdown cohort, and the control cohort consisted of untreated 6DT1 cells and lines expressing the scrambled control shRNA or hairpin H5. We detected 2,048 unique genes whose expression was significantly different (p<0.05, ANOVA) between the two groups and those with the greatest fold change are summarized in Table 1. While the most highly upregulated genes function in pathways with diverse biological roles, it was noted that among the most downregulated genes were multiple factors associated with centromeres (Cenpi, Cenpq), microtubule and spindle dynamics (Kif2c, Kif4a, Sass6), and cell cycle regulation (Ccne, Cdc25c). Consistent with this observation were the results of pathway analysis conducted to identify biological functions impacted as a consequence of Arid4b knockdown (Table 2). The most differentially regulated processes based on gene ontology were checkpoint control and DNA repair, and processes related to centrosome, centriole, and chromosome dynamics.
In examining the microarray data we noticed a striking overlap between genes downregulated in the Arid4b knockdown lines and components of the TPX2 gene network. This transcriptional network was recently identified based on expression profiling of three mouse data sets and two human breast cancer data sets [6]. The TPX2 network is tumor cell-autonomous and conserved across species, its activation is predictive of reduced distant metastasis-free survival (DMFS) in ER-positive patients, and the nine common hub genes in the TPX2 signature (TPX2, BUB1, UBE2C, CDC20, CCNB2, KIF2C, BUB1B, CEP55, CENPA) that were conserved across all five data sets consist primarily of genes involved in microtubule and mitotic spindle function. To determine how Arid4b levels influence the activation state of the TPX2 network, the fold changes of the 311 TPX2 network genes were examined in the Arid4b knockdown lines. Compared to control cell lines, 119 network genes were significantly downregulated (p<0.05) including Tpx2 itself and the other eight common hub genes, versus only 5 network genes upregulated (Figure 7; high resolution available as Figure S3). The downregulation of this gene network concomitant with the inhibition of metastasis observed in the Arid4b knockdown lines provides further support for the role of the TPX2 network in metastatic susceptibility and suggests that a significant portion of this network may be regulated by Arid4b.
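One way to gauge how lopsided the 119-versus-5 split is, purely as an illustration, is a one-sided sign test on the significantly changed network genes. This is not an analysis reported in the study, and it assumes each changed gene is equally likely to go up or down and that genes change independently, which co-regulated network genes almost certainly violate; the result should be read as a rough heuristic only. All names in the sketch are hypothetical.

```python
from math import comb

# Counts taken from the text above.
NETWORK_SIZE = 311  # genes in the TPX2 network
DOWN = 119          # network genes significantly downregulated
UP = 5              # network genes significantly upregulated


def sign_test_upper_tail(successes: int, trials: int) -> float:
    """P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2 ** trials


changed = DOWN + UP
fraction_of_network_down = DOWN / NETWORK_SIZE
p_one_sided = sign_test_upper_tail(DOWN, changed)

if __name__ == "__main__":
    print(f"{fraction_of_network_down:.1%} of network genes downregulated")
    print(f"one-sided sign test p = {p_one_sided:.2e}")
```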

High Arid4b expression predicts poor clinical outcome
Because Arid4b was identified as a candidate gene in part based on differential expression between high and low metastatic strains of mice in the AKXD panel, and because Arid4b expression levels were associated with tumor growth and metastasis in mice as well as the activity of the metastasis-associated TPX2 network, we tested whether ARID4B expression alone correlated with human patient outcomes. A search of publicly available breast cancer microarray data sets using Oncomine (Compendia Bioscience, Ann Arbor, MI) revealed that ARID4B expression was 2.3-fold higher in 40 ductal breast carcinoma samples compared to 7 normal breast tissue samples in the Richardson study [15], confirming that high ARID4B expression is clinically associated with breast cancer (Figure S4). Analysis of a pooled breast cancer dataset using GOBO (http://co.bmc.lu.se/gobo/) [16] showed that among the subgroup of patients with ER-positive tumors, the cohort with high expression of ARID4B had significantly reduced DMFS compared to the low or median ARID4B cohorts (Figure 8). Because this association was significant among patients with ER-positive tumors who were lymph node negative at the time of diagnosis (Figure 8A), this finding indicated that ARID4B expression level is predictive of patient progression to metastatic disease. As determined by multivariate analysis (Figure 8B), the hazard ratio compared to the high ARID4B tercile was 0.54 for the middle ARID4B tercile (95% C.I. = 0.33-0.89; p = .015) and 0.42 for the low ARID4B tercile (95% C.I. = 0.26-0.70; p = 7.51 × 10^-4), indicating that patients with tumors expressing high levels of ARID4B are approximately twice as likely to develop metastatic disease.
The association of ARID4B with reduced DMFS was also highly significant among ER-positive patients not receiving adjuvant therapy (Figure 8C-8D), indicating that ARID4B expression level plays a significant role in the natural metastatic progression of ER-positive breast cancer in human patients and its relevance is not confined solely to our mouse model systems.
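The "approximately twice as likely" statement follows from inverting the reported hazard ratios, which are expressed relative to the high ARID4B tercile. The sketch below performs that reciprocal conversion, swapping the confidence bounds as required; the function name is illustrative, and the calculation is simple arithmetic on the published values rather than a re-analysis of the GOBO data.

```python
def invert_hazard_ratio(hr: float, ci_low: float, ci_high: float):
    """Re-express a hazard ratio with the comparison direction reversed.

    Taking reciprocals flips the interval, so the old upper bound
    becomes the new lower bound and vice versa.
    """
    if min(hr, ci_low, ci_high) <= 0:
        raise ValueError("hazard ratios must be positive")
    return 1 / hr, 1 / ci_high, 1 / ci_low


# Reported values for node-negative, ER-positive patients, each
# relative to the high ARID4B tercile.
high_vs_middle = invert_hazard_ratio(0.54, 0.33, 0.89)
high_vs_low = invert_hazard_ratio(0.42, 0.26, 0.70)

if __name__ == "__main__":
    for label, (hr, lo, hi) in (("high vs middle", high_vs_middle),
                                ("high vs low", high_vs_low)):
        print(f"{label}: HR = {hr:.2f} (95% C.I. {lo:.2f}-{hi:.2f})")
```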

Discussion
Arid4b was identified as a candidate gene of interest through linkage and expression correlation analyses, and the in vitro and in vivo data presented here provide the first direct evidence of a causal role of Arid4b in mammary tumor progression and metastasis. The initial QTL analysis revealed association with Arid4b on proximal chromosome 13, and Arid4b was among the most highly correlated genes with the most significant p values in the subsequent eQTL analysis in the AKXD recombinant inbred panel. It was noted that although AKR/J is the more highly metastatic of the two parental strains, progression was associated with the DBA/2J allele, suggesting that the metastasis promoting influence of Arid4b is likely masked by other suppressive factors in a pure DBA/2J background. Although the AKXD recombinant inbred panel lacks the power to detect these epistatic interactions, ongoing experiments using the latest generation of recombinant inbred mice including the Collaborative Cross [17,18] will enable higher resolution QTL mapping and more robust systems genetics analyses going forward.
Mouse Arid4b encodes a protein of 1314 amino acids that shares 89% identity and 95% similarity to the 1312 amino acid human protein. Alternate nomenclature includes breast cancer-associated antigen 1 (BRCAA1), retinoblastoma-binding protein-1-like protein-1 (RBP1L1), and mSIN3A-associated protein of 180 kDa (SAP180). Indeed, there are multiple lines of evidence implicating Arid4b in breast cancer. A ten amino acid peptide was found to represent an antigen epitope expressed in 65% of breast cancer specimens and was significantly upregulated in the sera of breast cancer patients compared to healthy donors [19].
ARID4B was also found to associate with the mSIN3A HDAC complex [5], which is in turn known to be bound by the breast cancer associated tumor suppressor ING1 [20,21], the well-characterized breast cancer metastasis suppressor BRMS1 [13], and the ARID family homolog ARID4A/RBP1 [22], which has also been identified as a breast cancer associated antigen [23].
Ectopic expression of Arid4b at a physiologically relevant two- to three-fold increased level resulted in a 3-fold increase in orthotopic tumor mass relative to controls for Met-1 cells expressing the DBA allele, while the AKR allele induced 1.9-fold larger tumors versus control cells. To our knowledge, this is the first direct evidence that Arid4b upregulation promotes tumor growth. Although Met-1 orthotopic tumors do not readily metastasize in our experience, transwell assays in vitro demonstrated that upregulation of either allele of Arid4b increased tumor cell migration and invasion, consistent with a role of Arid4b in metastatic progression, and cells expressing the DBA allele were significantly more migratory and invasive than cells expressing the AKR allele. While stable upregulation of Arid4b did not induce Met-1 cells to metastasize with any greater frequency, stable knockdown of Arid4b in the highly metastatic 6DT1 cell line did cause a dramatic reduction in pulmonary metastases, raising the possibility that ARID4B may represent a novel therapeutic target. Taken together, the results of the orthotopic implantation and transwell assays are broadly consistent with our genetic linkage and expression correlation analyses that showed an association of the DBA haplotype on chromosome 13 with metastatic progression in the MMTV-PyMT × AKXD mice, and validate a functional role of Arid4b polymorphism in modulating the breast cancer phenotype.
While the molecular mechanisms of Arid4b are incompletely understood, an examination of its sequence and conserved domains provides further insight into its potential functions. Arid4b contains a nuclear localization signal (NLS) towards the C-terminus as well as conserved Tudor, RBB1NT, ARID/BRIGHT, and Chromo domains in the N-terminal half of the protein. The ARID domain mediates binding to DNA, although the affinity for AT-rich sequences varies among members of the Arid superfamily [24,25]. The RBB1NT domain is present in many Rb binding proteins including ARID4A, although it is noteworthy that unlike ARID4A, ARID4B does not contain the LCXCE motif necessary for RB binding [26], and no interaction was observed when we attempted to co-IP ARID4B with RB (data not shown); therefore, the function of the RBB1NT domain of ARID4B remains uncertain. Tudor domains are present in many RNA binding proteins [27] and also bind methylated lysine residues on histone tails [28]. Chromo domain-containing proteins have also been shown to bind methylated lysines and mediate the recruitment of chromatin modifying complexes [29]. Because mSIN3A itself lacks intrinsic DNA binding capability, targeting of mSIN3A-associated HDAC activity depends on interactions with other transcription factors including Mad1 and KLF repressors among others [30]. The presence of putative DNA and histone binding domains in the N-terminal half of ARID4B suggests that its influence on mammary tumor progression involves directing the HDAC activity of mSIN3A complexes to chromatin. This is supported by our observations that the high and low metastatic alleles of ARID4B have a dense cluster of amino acid polymorphisms in the C-terminal domain and bind with different affinities to mSIN3A and mSDS3, though the biochemical significance of this observation remains to be determined.
Diminished expression levels or binding affinity of ARID4B may allow mSIN3A to be bound by other proteins with different DNA sequence specificity, perhaps not resulting in a global change in the abundance of any one particular histone mark but rather altering the expression of different subsets of genes. It is noteworthy that the pro-metastatic ARID4B and the metastasis suppressive BRMS1 bind each other and also to the mSIN3A complex in vitro. This observation reinforces the significance of the mSIN3A complex in metastatic progression, and it is tempting to speculate that an HDAC complex may be caught in a molecular tug-of-war between these two metastasis modifier genes. However, the mSIN3A complex is modular in nature and interacts with a great variety of transcriptional regulators [30,31]. Many different complexes exist, and their precise composition and function within the context of breast cancer are not well understood. Further studies will be necessary to define a role, if any, for the ARID4B-BRMS1 interaction in human disease.
While ARID4B expression was not a significant predictor of DMFS across all patients in a meta-analysis of a 1,881-sample data set, statistical significance emerged when patients were stratified based on ER status. The observation that ARID4B is predictive of metastatic progression only in ER+ patients is consistent with the identification of Arid4b as a candidate gene in the context of the MMTV-PyMT mouse model system, in which tumors arise from a predominantly ER+ luminal epithelial cell population [32]. Loss of ER and PR is detected during progression to late carcinomas; however, in a systematic analysis of gene expression profiles these late PyMT tumors clustered most closely with human luminal tumors [33], which are ER+. Also consistent with ARID4B promoting metastatic progression of ER+ tumors is our observation that Arid4b knockdown caused a significant downregulation of the core components of the Tpx2 gene network. The TPX2 signature was tumor cell autonomous and predictive of DMFS only in those patients who were ER+ at diagnosis, and was distinct from a CD53 network that was associated with ER-negative stromal components [6]. Polymorphisms in several other tumor cell autonomous metastasis susceptibility genes identified in our laboratory including Sipa1, Rrp1b, and Brd4 are prognostic only in ER+ patients [9,34-37], and stable expression of Brd4 can also differentially regulate the Tpx2 network [6]. The association of multiple metastasis susceptibility genes with a transcriptional network comprising many cell cycle and mitotic spindle checkpoint regulatory genes highlights the possibility that these cellular functions are critical determinants of metastatic efficiency. Further experiments are underway in our laboratory to determine whether upregulation of the TPX2 network is causative in promoting metastasis.

Identification of Arid4b germline polymorphisms
Using genomic DNA from AKR/J and DBA/2J mice as templates, PCR was performed to amplify the protein coding region, exons 2 through 24 (Table S1). PCR products were subjected to agarose gel electrophoresis, bands were isolated using the QIAquick Gel Extraction kit (Qiagen) according to the manufacturer's recommendations, and the purified products were used as templates for sequencing. All sequencing runs were performed by the DNA Sequencing and Gene Expression Core, NCI, Bethesda, MD. Genomic sequences for the AKR and DBA alleles were aligned using pairwise BLAST [38] and non-synonymous polymorphisms were verified by manual comparison of chromatograms using Chromas software (Technelysium).

Figure 8. High expression of ARID4B is associated with poor clinical outcome. In patients with ER-positive tumors who were node negative at diagnosis (A) distant metastasis-free survival (DMFS) was significantly lower (p = .009) in patients with high expression (blue) compared to middle (red) or low (gray) expression of ARID4B, and multivariate analysis of 440 cases (B) was performed to determine metastatic progression hazard ratios of 0.54 and 0.42 for median and low ARID4B terciles, respectively, compared to the high ARID4B tercile. The association of high ARID4B with poor DMFS was also highly significant (p = 3.05 × 10^-4) among ER-positive patients in the absence of adjuvant therapy (C) with similar hazard ratios (D) of 0.53 and 0.49 for middle and low ARID4B groups compared to high ARID4B. doi:10.1371/journal.pgen.1002735.g008

Expression vectors
V5-tagged AKR and DBA alleles of Arid4b were generated using long range PCR with forward primer 5′-AACAAAGGTGCAGGTGAAGC-3′ and reverse primer 5′-CCTGCACTCAACTGACATTCCATTC-3′ to amplify Arid4b, and PCR products were cloned into pcDNA3.1/V5-His-TOPO (Invitrogen). FLAG-tagged Arid4b vectors were constructed by the Protein Expression Laboratory, SAIC-Frederick, Inc. using Gateway technology (Invitrogen). Briefly, the AKR or DBA allele of Arid4b was PCR amplified and cloned into entry vector pDonr-253, then subcloned by Gateway LR recombination into pDest-737 to generate an expression construct with CMV promoter and N-terminal 3xFLAG tag. Full-length BRMS1 was epitope tagged at the N-terminus by PCR with the HA or myc tag sequence incorporated into the forward primer and cloned into pCMV or pcDNA3-hygro (Invitrogen), respectively. Correct sequences of all vectors were confirmed prior to use.
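As a quick sanity check on the long range PCR primers listed above, basic primer properties can be computed directly from the sequences. The following is a minimal sketch, not part of the published protocol: the sequences are taken from the text with the line-break hyphens in the printed sequences removed (an assumption about the typesetting), and the helper names `gc_fraction` and `reverse_complement` are illustrative.

```python
# Primer sequences as printed in the text (5'->3'), with line-break hyphens removed.
FORWARD = "AACAAAGGTGCAGGTGAAGC"
REVERSE = "CCTGCACTCAACTGACATTCCATTC"

# Translation table for base complementation (hypothetical helper constant).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, read 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


for name, seq in (("forward", FORWARD), ("reverse", REVERSE)):
    # Report length, GC fraction, and the complementary strand for each primer.
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```

Length and GC fraction are only rough indicators of primer behavior; a melting temperature estimate would require an additional model that is not specified in the text.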

Cell culture and stable cell lines
Met-1 cells [10] were a gift from Dr. Robert Cardiff (University of California, Davis, CA). 6DT1 cells [11] were a gift from Dr. Lalage Wakefield (NCI, NIH, Bethesda, MD). HEK293 cells were purchased from ATCC (Manassas, VA). Cell lines were maintained in DMEM supplemented with 10% FBS, 2 mM L-glutamine, penicillin and streptomycin. Cells were confirmed to be free of mycoplasma contamination using the MycoAlert detection kit (Lonza).
Met-1 cells seeded onto 10 cm tissue culture plates were cotransfected with 6 µg of the appropriate V5-tagged Arid4b construct described above, or pcDNA3.1/V5-His-TOPO/lacZ (Invitrogen) as a control vector, plus 600 ng of pSuper.Retro.Puro (Oligoengine) as a selectable marker, using FuGENE 6 transfection reagent. Cells were selected using 1 mg/ml G418 plus 4 µg/ml puromycin and clones derived by limiting dilution. Stable upregulation of Arid4b was verified by performing QRT-PCR using forward primer 5′-GGTGAGTGGGAGCTGGTCTA-3′ and reverse primer 5′-ATAAAGGGCCCACTGAAGGT-3′, and western blotting for endogenous and ectopically expressed ARID4B as described below. 6DT1 cells were transduced with one of five different lentiviral shRNAs targeting Arid4b (RMM4534-NM_194262, Open Biosystems) or a scrambled control shRNA in the same pLKO.1 vector. Stable cells were selected using 10 µg/ml puromycin and pooled clones were analyzed for Arid4b knockdown by western blot.
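The QRT-PCR verification above reports fold change relative to lacZ control cells after normalization to a reference gene (Ppib, per Figure S2). The exact calculation is not stated in the text; the sketch below assumes the commonly used 2^(-ΔΔCt) method with approximately 100% amplification efficiency. The function name and all Ct values are hypothetical and are not data from this study.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    Assumes roughly 100% primer efficiency for both target and reference.
    """
    # Normalize the target gene to the reference gene within each cell line.
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized sample to the normalized control.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: Arid4b and reference gene in an overexpressing
# clone versus the lacZ control line.
fc = fold_change_ddct(ct_target_sample=24.0, ct_ref_sample=18.0,
                      ct_target_control=26.0, ct_ref_control=18.0)
print(fc)  # 4.0
```

A target Ct that is two cycles lower than the control at equal reference Ct corresponds to a fourfold increase under these assumptions.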

Animal studies
Met-1 or 6DT1 stable lines were orthotopically implanted into the fourth mammary fat pad of six-week-old female NU/J or FVB/NJ mice using 10⁵ cells suspended in 100 µl of PBS per animal. Primary tumors and lungs were harvested 28 days later. All experiments were performed according to the National Cancer Institute Animal Care and Use Committee guidelines.

Migration and invasion assays
Met-1 cells stably expressing Arid4b or lacZ were seeded at 75,000 cells per well into invasion chambers coated with Matrigel basement membrane matrix (534480, BD Biosciences) or control chambers lacking Matrigel (354578, BD Biosciences). After 24 hours, cells were fixed in 100% methanol, stained with crystal violet, and mounted onto glass slides using mineral oil. Cells were visualized at 400× magnification and five fields were counted for each of three experiments.
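The counting scheme above (five fields in each of three experiments) can be summarized as follows. This is a sketch under stated assumptions: the text does not specify how counts were aggregated, so averaging fields within each experiment and expressing invasion as a percentage of migration through the uncoated control chamber are illustrative choices, and all counts are hypothetical.

```python
from statistics import mean, stdev


def summarize(counts_by_experiment):
    """Average the fields within each experiment, then summarize across experiments."""
    per_experiment = [mean(fields) for fields in counts_by_experiment]
    return mean(per_experiment), stdev(per_experiment)


def percent_invasion(invaded_mean, migrated_mean):
    """Cells crossing Matrigel as a percentage of cells crossing the uncoated control."""
    return 100.0 * invaded_mean / migrated_mean


# Hypothetical cell counts: three experiments x five fields each.
matrigel = [[10, 12, 8, 10, 10], [14, 16, 15, 15, 15], [20, 20, 20, 20, 20]]
control = [[40, 40, 40, 40, 40], [50, 50, 50, 50, 50], [60, 60, 60, 60, 60]]

inv_mean, inv_sd = summarize(matrigel)
mig_mean, mig_sd = summarize(control)
print(inv_mean, inv_sd, percent_invasion(inv_mean, mig_mean))
```

Treating each experiment, rather than each field, as the unit of replication avoids inflating the sample size when comparing cell lines.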

Co-immunoprecipitations
HEK293 cells were transfected with the V5- or FLAG-tagged AKR or DBA allele of Arid4b, with or without HA-BRMS1 or myc-BRMS1 where appropriate, using FuGENE 6 transfection reagent. After 30 hours, cells were harvested in mild IP lysis buffer (25 mM Tris-HCl pH 7.4, 150 mM NaCl, 1 mM EDTA, 1% NP-40, 5% glycerol) supplemented with protease inhibitors (11836170001, Roche) and phosphatase inhibitors (P-5726, Sigma). Protein samples were quantitated using Bradford assays. Gammabind G Sepharose beads (17088501, GE Healthcare) were washed twice in NET buffer (50 mM Tris pH 8.0, 150 mM NaCl, 5 mM EDTA, 1% NP-40, 0.5% BSA, 0.04% sodium azide) supplemented with protease and phosphatase inhibitors, and resuspended to form a 50% bead slurry. Lysates were precleared by adding 40 µl of bead slurry and rotating for 30 minutes at 4°C. Samples were centrifuged at 10,000 rpm for 1 minute at 4°C and the pre-cleared supernatant was transferred to a fresh tube. Anti-V5, anti-FLAG, or anti-myc tag antibody was added to a final concentration of 1.0 µg/ml and samples were rotated for 1 hour at 4°C, then 50 µl of bead slurry was added and co-IPs performed overnight at 4°C. Beads were then washed four times with NET buffer and resuspended in SDS-PAGE sample buffer.
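The centrifugation step above is given in rpm, which corresponds to different forces on different rotors. As a rough guide for reproducing the step on other equipment, the standard conversion to relative centrifugal force is RCF = 1.118 × 10⁻⁵ × r × rpm², with the rotor radius r in centimeters. The rotor radius is not reported in the text; the 7 cm value below is purely hypothetical.

```python
def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) from rotor speed and rotor radius.

    Uses the standard conversion RCF = 1.118e-5 * radius_cm * rpm**2.
    """
    return 1.118e-5 * radius_cm * rpm ** 2


# 10,000 rpm is taken from the text; the 7 cm radius is a hypothetical
# value for a benchtop microcentrifuge rotor.
print(round(rcf_from_rpm(10_000, 7.0)))
```

Because force scales linearly with radius, the same rpm setting on a rotor with a different radius will produce a proportionally different force.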

Expression profiling and network analysis
Total RNA was isolated from pooled clones of 6DT1 Arid4b knockdown cell lines using RNeasy kits (Qiagen) and then arrayed on Affymetrix GeneChip Mouse Gene 1.0 ST arrays by the Microarray Core in the NCI Laboratory of Molecular Technology. Expression data were normalized using Partek Genomics Suite to identify genes whose expression was significantly different (p<.05) between the Arid4b normal cohort (untreated, scrambled control, and H5 lines) and the Arid4b knockdown cohort (lines H3 and H4). The gene list and expression values were then analyzed using Ingenuity Pathways Analysis (Ingenuity Systems, www.ingenuity.com) to identify differentially regulated signaling pathways and biological functions. Expression of the Tpx2 transcriptional network was visualized and figure generated using Cytoscape software [39]. Microarray data are available through the Gene Expression Omnibus under accession number GSE35731.

Figure S1 Interval mapping for metastatic progression in the AKXD recombinant inbred panel. A potential QTL peak was detected on proximal chromosome 13. Likelihood ratio score (LRS) for correlation with metastasis is shown in blue with the AKR genotype in red and the DBA genotype in green. (TIF)

Figure S2 Quantitation of Arid4b expression in stable cell lines. QRT-PCR data (A) was internally normalized to Ppib and fold change expressed relative to Met-1 cells stably expressing lacZ. Western blots confirm upregulation of the AKR and DBA alleles at the protein level relative to endogenous expression in lacZ controls (B). (TIF)

Figure S3 High resolution map of Tpx2 network gene expression in Arid4b knockdown cell lines. Red indicates statistically significant (p<.05, ANOVA) downregulation; green indicates upregulation. (TIF)

Figure S4 ARID4B mRNA expression in ductal breast carcinoma versus normal breast tissue. Dots represent minimum and maximum values, whisker bars represent 10th and 90th percentiles, boxes represent 25th to 75th percentiles, and center bars represent median values. Fold change was 2.299 and statistical significance was determined by two-tailed t-test. Figure adapted from Oncomine representation of ARID4B expression in the Richardson breast cancer data set [15]. (TIF)