The Influence of the CHIEF Pathway on Colorectal Cancer-Specific Mortality

Many components of the CHIEF (Convergence of Hormones, Inflammation, and Energy Related Factors) pathway could influence survival given their involvement in cell growth, apoptosis, angiogenesis, and stimulation of tumor invasion. We used ARTP (Adaptive Rank Truncation Product) to test whether genes in the pathway were associated with colorectal cancer-specific mortality. Colon cancer (n = 1555) and rectal cancer (n = 754) cases were followed over five years. Age, center, stage at diagnosis, and tumor molecular phenotype were considered when calculating ARTP p values. A polygenic risk score was used to summarize the magnitude of risk associated with this pathway. The JAK/STAT/SOC sub-pathway was significant for colon cancer survival (P ARTP = 0.035). Fifteen genes (DUSP2, INFGR1, IL6, IRF2, JAK2, MAP3K10, MMP1, NFkB1A, NOS2A, PIK3CA, SEPX1, SMAD3, TLR2, TYK2, and VDR) were associated with colon cancer mortality (P ARTP < 0.05); JAK2 (P ARTP = 0.0086), PIK3CA (P ARTP = 0.0098), and SMAD3 (P ARTP = 0.0059) had the strongest associations. Over 40 SNPs within the 15 significant genes were significantly associated with survival (P ARTP < 0.05). SMAD3 had the strongest association with survival (HR GG 2.46, 95% CI 1.44, 4.21; P trend = 0.0002). Seven genes (IL2RA, IL8RA, IL8RB, IRF2, RAF1, RUNX3, and SEPX1) were significantly associated with rectal cancer mortality (P ARTP < 0.05). The HR for colorectal cancer-specific mortality among colon cancer cases in the upper at-risk alleles group was 11.81 (95% CI 7.07, 19.74) and was 10.99 (95% CI 5.30, 22.78) for rectal cancer. These results suggest that several genes in the CHIEF pathway are important for colorectal cancer survival; the risk associated with the pathway merits validation in other studies.

Data Availability: The authors confirm that, for approved reasons, some access restrictions apply to the data underlying the findings. Ethical restrictions apply to the patient-level dataset underlying the analyses presented because of the nature of the consent forms signed, the personal information contained in the database, and the IRB approval. These restrictions prevent the data from being made fully available in a public repository. Interested researchers are kindly asked to contact the corresponding author for additional information.

Introduction
The CHIEF (Convergence of Hormones, Inflammation, and Energy Related Factors) pathway integrates elements central to the etiology of colorectal cancer (CRC) [1]. The pathway was developed based on our knowledge of the epidemiology of CRC and of genes that may influence cancer risk through major components of the pathway, including hormones, inflammation, and energy-related factors [1]. Many genes in the pathway could influence tumor progression and prognosis given their involvement in cell growth, apoptosis, promotion of inflammation and angiogenesis, immune response, and stimulation of tumor invasion and metastasis [2]. The main trunk of the pathway contains serine/threonine protein kinase 11 (STK11 or LKB1), mammalian target of rapamycin (MTOR), and the tumor suppressor PTEN (phosphatase tensin homolog deleted on chromosome 10). STK11 responds to changes in cellular energy balance (ATP levels) [3,4] and governs whole body insulin sensitivity [5,6]. NF-κB is an important nuclear transcription factor that regulates cytokines and is critical for the regulation of tumorigenesis, cell proliferation, apoptosis, response to oxidative stress, and inflammation, while vascular endothelial growth factor (VEGF) plays an important role in regulation of cell growth signaling and is a major mediator of tumor angiogenesis [7,8].
Cytokines such as interleukins, the TGF-β-signaling pathway, interferons, and tumor necrosis factor (TNF) are key elements of the inflammatory process in the CHIEF pathway. The TGF-β-signaling pathway is involved in all aspects of tumorigenesis, including stimulation of tumor invasion and metastasis [2]. Signal transducer and activator of transcription (STAT) and mitogen-activated protein kinase (MAPK) genes are involved in both inflammation and metabolic signaling associated with hormones and energy-related factors. MAPKs serve as an integration point for multiple biological signals and are involved in a variety of cellular processes such as proliferation. Angiogenesis and inflammation are hallmark features of tumorigenesis [9] as well as key elements in the CHIEF pathway; thus it is reasonable to hypothesize that the pathway influences survival.
In this paper, we summarize the significance of this pathway as it relates to survival after diagnosis with colon or rectal cancer using Adaptive Rank Truncation Product (ARTP), building on our previous work that evaluated the pathway with colon and rectal cancer risk, where we documented overall risk as well as risk specific to tumor molecular phenotype [10]. ARTP is a statistical program that uses a permutation method, allowing us to summarize across genes within sub-pathways of the overall pathway and to estimate the association with survival of the pathway, of genes, and of SNPs within the pathway. To further estimate the magnitude of the association of this pathway with survival, we use a polygenic risk score based on the permuted ARTP findings.

Methods
Two study populations are included in these analyses. The first study, a population-based case-control study of colon cancer, included cases (n = 1,555 with complete genotype data) identified between October 1, 1991 and September 30, 1994 living in the Twin Cities Metropolitan Area or a seven-county area of Utah or enrolled in the Kaiser Permanente Medical Care Program of Northern California (KPMCP) [11]. The second study, with identical data collection methods, included cases with cancer of the rectosigmoid junction or rectum (n = 754 cases with complete genotype data) who were identified between May 1997 and May 2001 in Utah and at the KPMCP [12]. Eligible cases were between 30 and 79 years of age at the time of diagnosis, living in the study geographic area, English speaking, and mentally competent to complete the interview, with no previous history of CRC and no previous diagnosis of familial adenomatous polyposis, ulcerative colitis, or Crohn's disease. Cases who did not meet these criteria were ineligible, as were individuals who were not black, white, Hispanic, or Asian (for the rectal cancer study). All study participants provided written informed consent on Institutional Review Board approved consent forms prior to completing the study questionnaire; the consent form and study protocol were approved by the Institutional Review Board on Human Subjects at the University of Utah, Kaiser Permanente Medical Research Program, and the University of Minnesota.

Tumor Registry Data
Tumor registry data were obtained to determine disease stage at diagnosis and months of survival after diagnosis. Disease stage was categorized using the sixth edition of the American Joint Committee on Cancer (AJCC) staging criteria. One pathologist in Utah did all disease staging. Local tumor registries provided information on patient follow-up including vital status, cause of death, and contributing cause of death. Follow-up was obtained for all study participants and was terminated for the Colon Cancer Study in 2000 and for the Rectal Cancer Study in 2007. At that time all study participants had over five years of follow-up.

Tumor Marker Data
Tumors were defined by specific molecular alterations: any TP53 mutation; any KRAS mutation; MSI+; and CpG Island Methylator Phenotype (CIMP). CIMP status was based on the classic panel and defined as positive if at least two of five markers were methylated [13]. Microsatellite instability (MSI) was based on BAT26, TGFβRII, and a panel of 10 tetranucleotide repeats that has been shown to correlate highly with the Bethesda Panel [14]; our study was done prior to the Bethesda Panel development. These data are included in the analysis since we have shown that tumor molecular phenotype influences survival and is associated with SNPs in this pathway [10,15].

TagSNP Selection and Genotyping
TagSNPs were selected using the following parameters: LD blocks defined by r² = 0.8 using a Caucasian LD map, minor allele frequency (MAF) > 0.1, range = −1500 bps from the initiation codon to +1500 bps from the termination codon, and 1 SNP/LD bin. All markers were genotyped using a custom multiplexed bead array assay format based on GoldenGate chemistry (Illumina, San Diego, California). A genotyping call rate of 99.85% was attained. Blinded internal replicates represented 4.4% of the sample set. The duplicate concordance rate was 100.00%. S1 Table lists all genes included in each sub-pathway, while S2 Table lists the number of SNPs assessed for each gene and the P ARTP value for each gene on the platform. We analyzed data from 155 genes, which included 10 genes that were previously assessed in our lab (VDR, ESR1, ESR2, AR, IGF1, IGF1R, IGFBR3, IRS1, IRS2, and PPARG) along with 145 genes from the Illumina platform. The initial platform included 1536 SNPs; of these, 1381 were successfully analyzed by Illumina. We included in our analysis only those SNPs for which >95% of the population had results, leaving 1246 SNPs for analysis. No imputation was done.
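The completeness filter described above (retaining only SNPs with results for more than 95% of the population) can be illustrated with a minimal sketch. The data layout, the function names `call_rate` and `filter_snps`, and the SNP labels are hypothetical and are not taken from the study's actual pipeline; missing genotype calls are assumed to be coded as `None`.

```python
from typing import Dict, List, Optional

Genotype = Optional[int]  # 0, 1, or 2 copies of an allele; None = no call (assumed coding)


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of participants with a non-missing genotype call for one SNP."""
    return sum(g is not None for g in calls) / len(calls)


def filter_snps(table: Dict[str, List[Genotype]],
                min_call_rate: float = 0.95) -> Dict[str, List[Genotype]]:
    """Keep SNPs whose call rate is strictly greater than the threshold."""
    return {snp: calls for snp, calls in table.items()
            if call_rate(calls) > min_call_rate}


# Hypothetical example: 20 participants, two illustrative SNP labels.
example = {
    "snp_a": [0, 1, 2, 1] * 5,              # no missing calls
    "snp_b": [0, 1, None, 1, None] * 4,     # 8 of 20 calls missing
}
kept = filter_snps(example)
```

In this toy table only `snp_a` passes the filter, because `snp_b` has results for just 60% of participants.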

Statistical Methods
The goal of the analysis was to evaluate the overall associations between genes and pathways as they relate to colon and rectal cancer survival. To do this, we used ARTP, a statistical program that uses a highly efficient permutation algorithm to determine significance at the gene, sub-pathway, and pathway level for survival after diagnosis with colon or rectal cancer [16]. Vital status and survival months were permuted 10,000 times within R version 3.0.2 (R Foundation for Statistical Computing, Vienna, Austria). Since our focus was on colorectal cancer-specific mortality, people who died from other causes or who were lost to follow-up were censored at the date of death or last contact. Months of survival were calculated from date of diagnosis until end of follow-up or date of last contact. Cox Proportional Hazards models were adjusted for age, race/ethnicity, sex, AJCC stage, and tumor molecular phenotype. Tumors were defined by specific molecular alterations: any TP53 mutation; any KRAS mutation; MSI+; and CIMP high. As the proportion of MSI+ tumors in the rectal cases was <3% [17], we did not include these tumor markers as adjustment variables for rectal cancer. Associations with SNPs within ARTP were assessed assuming an additive model unless a preliminary check of the hazard ratios indicated a dominant or recessive mode of inheritance. For SNPs in genes with p values <0.05 that were associated with colon or rectal cancer based on ARTP results, we report Hazard Ratios (HR) and 95% confidence intervals (CIs) from Cox Proportional Hazards models in SAS to show the magnitude of the association between these SNPs and the hazard of dying after diagnosis with colon or rectal cancer; we also report p values for the likelihood ratio test (LRT). We include those genes which contributed to the ARTP permuted gene p value for reference since they could possibly indicate greater significance and are of interest for replication elsewhere.
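The cause-specific censoring rule described above can be sketched as follows. This is an illustrative sketch only: the `Case` record, its field names, and the 30.4375-day average month are assumptions made for the example and do not reflect the study's actual data structures or date arithmetic.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple

DAYS_PER_MONTH = 30.4375  # average month length; an assumption for this sketch


@dataclass
class Case:
    diagnosis: date
    end_date: date            # date of death, last contact, or end of follow-up
    died: bool
    died_of_crc: bool = False  # True only if colorectal cancer was the cause of death


def survival_record(case: Case) -> Tuple[float, int]:
    """Return (months of survival, event indicator) for a cause-specific analysis.

    The event indicator is 1 only for colorectal cancer deaths. Deaths from
    other causes and losses to follow-up are censored (0) at end_date.
    """
    months = (case.end_date - case.diagnosis).days / DAYS_PER_MONTH
    event = 1 if (case.died and case.died_of_crc) else 0
    return months, event


# Hypothetical cases: a CRC death, a death from another cause, and a survivor.
crc_death = survival_record(Case(date(1992, 1, 1), date(1994, 1, 1), True, True))
other_death = survival_record(Case(date(1992, 1, 1), date(1995, 6, 1), True, False))
alive = survival_record(Case(date(1992, 1, 1), date(2000, 1, 1), False))
```

The resulting (months, event) pairs are the inputs a Cox proportional hazards routine would expect; under this coding, only the first hypothetical case counts as an event.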
We did not further adjust SNP associations for multiple comparisons since our analytic approach is top down: looking at the overall pathway (where the number of genes is adjusted for), genes (where the number of SNPs is adjusted for), and SNPs that contribute to significant permuted P ARTP values. Genes were assigned to only one sub-pathway prior to the hierarchical analyses. However, we realize many genes could function in other sub-pathways to which they were not assigned for analysis.
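As a rough illustration of how a rank truncated product combines SNP-level p values into a single gene-level p value by permutation, consider the sketch below. It uses one fixed truncation point `k`; the adaptive procedure in ARTP additionally searches over several truncation points and is not reproduced here. The function names and the toy p values are hypothetical. In practice each permuted set of p values would come from refitting the SNP models after permuting vital status and survival months.

```python
import math
from typing import List, Sequence


def rtp_statistic(p_values: Sequence[float], k: int) -> float:
    """Log of the product of the k smallest p values (more negative = stronger)."""
    smallest = sorted(p_values)[:k]
    return sum(math.log(p) for p in smallest)


def rtp_permutation_p(observed: Sequence[float],
                      permuted_sets: List[Sequence[float]],
                      k: int) -> float:
    """Permutation p value for the fixed-k rank truncated product.

    Counts permutations whose statistic is at least as extreme as the observed
    one, with the usual +1 correction so the estimate is never exactly zero.
    """
    obs = rtp_statistic(observed, k)
    hits = sum(1 for perm in permuted_sets if rtp_statistic(perm, k) <= obs)
    return (hits + 1) / (len(permuted_sets) + 1)


# Toy example: three SNPs in one hypothetical gene, three permutations.
observed_p = [0.01, 0.5, 0.9]
permuted_p = [[0.2, 0.6, 0.8], [0.005, 0.7, 0.9], [0.3, 0.4, 0.5]]
gene_p = rtp_permutation_p(observed_p, permuted_p, k=1)
```

Because the gene-level p value is judged against permutations of the same SNP set, the number of SNPs in the gene is accounted for at this level, which is the sense in which the top-down approach adjusts for the number of SNPs.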
To summarize the risk associated with the CHIEF pathway, we calculated polygenic summary scores. To conservatively estimate risk, we included in the risk models SNPs from genes where the gene ARTP p values were 0.10 or less and the SNP p values within those genes were 0.10 or less. Our analysis includes SNPs with p < 0.10 only from those genes where the P ARTP was < 0.10. Thus, we include SNPs that were not statistically significant and we omit SNPs that were statistically significant in genes where the P ARTP was > 0.10. Since genes are associated with multiple sub-pathways, we did not restrict to genes where the sub-pathway was significant. If SNPs within the same gene had r² values of 0.80 or greater, only one SNP was included in the model. Risk was modeled using at-risk alleles, using all genotypes with the low-risk genotype or referent group as zero. For the codominant or additive model a score of zero, one, or two was assigned relative to the number of at-risk alleles, while scores of zero or two were assigned for the dominant and recessive models in order to capture the risk associated with the various genotypes. Polygenic scores were then used to summarize risk across the genes and SNPs to better capture the risk associated with the pathway.
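The scoring rule described above can be written out as a short sketch. The function names, the dictionary layout, and the SNP labels are hypothetical. The sketch assumes that each genotype has already been expressed as a count of at-risk alleles (0, 1, or 2) and that the inheritance model for each SNP has been chosen beforehand.

```python
from typing import Dict


def snp_score(n_risk_alleles: int, model: str) -> int:
    """Score one SNP: 0/1/2 for additive; 0 or 2 for dominant and recessive."""
    if n_risk_alleles not in (0, 1, 2):
        raise ValueError("at-risk allele count must be 0, 1, or 2")
    if model == "additive":
        return n_risk_alleles
    if model == "dominant":       # one or two at-risk alleles carry the risk
        return 2 if n_risk_alleles >= 1 else 0
    if model == "recessive":      # only two at-risk alleles carry the risk
        return 2 if n_risk_alleles == 2 else 0
    raise ValueError(f"unknown model: {model}")


def polygenic_score(genotypes: Dict[str, int], models: Dict[str, str]) -> int:
    """Sum the per-SNP scores over all SNPs included in the risk model."""
    return sum(snp_score(genotypes[snp], model) for snp, model in models.items())


# Hypothetical participant carrying one at-risk allele at each of three SNPs.
models = {"snp_a": "additive", "snp_b": "dominant", "snp_c": "recessive"}
participant = {"snp_a": 1, "snp_b": 1, "snp_c": 1}
score = polygenic_score(participant, models)
```

For this hypothetical participant the score is 1 + 2 + 0 = 3: the heterozygous genotype counts once under the additive model, fully under the dominant model, and not at all under the recessive model.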

Results
The majority of study participants were over 60 years of age, non-Hispanic white, and male (Table 1). Most cases were diagnosed with an AJCC Stage 1 or 2 tumor. At the end of follow-up roughly 35% of study participants had died. The overall pathway was not statistically significantly associated with survival for either colon or rectal cancer (Table 2). However, the JAK/STAT/SOC sub-pathway was significant for colon cancer survival (P ARTP = 0.035) and the interleukin pathway was of borderline significance for rectal cancer (P ARTP = 0.06).

Discussion
Several genes were associated with survival after diagnosis with colorectal cancer, although the overall pathway was not statistically significant and only the JAK/STAT/SOCs sub-pathway had a P ARTP < 0.05. Fifteen genes were associated with colon cancer survival (P ARTP < 0.05) and seven genes were associated with rectal cancer survival. It should be noted that this represents 9.6% of genes analyzed for colon cancer and approximately 5% of genes analyzed for rectal cancer, and these could be chance findings; thus the findings need replication. We observed that the hazard of dying after being diagnosed with either colon or rectal cancer increased with increasing number of at-risk alleles. The lack of statistical significance observed for the overall pathway could reflect sub-pathway groupings that did not optimize the data. Further evaluation at the gene and SNP level suggested that many components of the pathway contributed to survival, although a large segment of the pathway did not. The JAK/STAT-signaling pathway was the only sub-pathway that was statistically significant using ARTP. This pathway plays a critical role in immune response and regulation of inflammation given its essential affiliation with cytokine signaling. STAT3 specifically has been shown to promote uncontrolled cell growth and survival through dysregulation of the expression of genes involved in apoptosis, cell-cycle regulation, and angiogenesis [18]. JAK1, JAK2, and STAT3 have been associated with colorectal cancer progression [19]. In our analysis, STAT3 and STAT5 were of marginal significance with colon cancer survival, while JAK2 and TYK2 were statistically significant. Within these genes, several SNPs were significantly associated with survival.
The TGF-β-signaling pathway has been shown to be one of the strongest pathways associated with colon cancer risk in our data. Others have shown that improved disease-free survival after diagnosis with CRC was associated with increased TGF-β expression [34]. Forsti and colleagues looked at nine polymorphisms in the TGF-β-signaling pathway and CRC among 308 cases of colorectal cancer [35] and observed that the TGFβRA IVS7G+24A minor allele was associated with better survival. Several other studies have focused on SMAD2, SMAD4, and SMAD7 and found associations with prognosis after CRC diagnosis [36,37]. We only observed marginally significant associations with BMP2 (P ARTP = 0.083), BMPR1A (P ARTP = 0.053), and BMPR1B (P ARTP = 0.069) for colon cancer survival. RUNX3 was significantly associated with rectal cancer survival, while BMP1 (P ARTP = 0.099) and BMPR1A (P ARTP = 0.085) were marginally significant.
SEPX1 was associated with survival for both colon and rectal cancer, while SEP15 was marginally associated (P ARTP = 0.068) with colon cancer survival. We previously reported that three SNPs in this pathway were associated with rectal cancer survival after adjustment for multiple comparisons using FDR: SEPN1 rs718391 (HR 1.67, 95% CI 1.11, 2.51), SEPX1 rs13331553 (HR 1.46, 95% CI 1.07, 2.00), and SEPX1 rs732510 (HR 1.68, 95% CI 1.09, 2.60). However, taking the gene approach as we did with ARTP, SEPX1 remained significant for both colon and rectal cancer.
Several cytokines, including interleukins and interferons, and other mediators of inflammation were associated with both colon (INFGR1, IL6, IRF2, NFkB1A, TLR2) and rectal cancer survival (IL1A and IL3), as was suppressor of cytokine signaling (SOCS1). Functions of cytokine-related pathways include apoptosis and cell proliferation. INFG has been shown to regulate the expression of apoptosis-related genes and has been hypothesized to regulate cell sensitivity to apoptosis [42]. TLRs can promote inflammation, cell survival, and tumor progression [43]. Studies analyzing associations between risk or survival and SNPs in interleukin genes such as IL1B, IL1RA, and IL10 have reported conflicting results, with some SNPs associated with increased risk or survival and others associated with lower risk or survival for colorectal cancer [44–46].
To estimate the magnitude of risk associated with carrying multiple high-risk alleles, we created a polygenic risk score. Our results suggest that the genetic variant load is important for survival after diagnosis, since we observed a substantially increased risk of dying with increasing numbers of variant genotypes. While one could hypothesize that a single insult to the pathway could influence risk and that additional insults would have minimal effect on risk, our data suggest otherwise. Inflammatory pathways are somewhat redundant, composed of multiple cytokines with overlapping functions; this supports the idea that multiple insults to the pathways would result in increased risk. Our data support the hypothesis that increases in risk and hazard of dying are linear and that as the genetic variant load of high-risk genotypes increases, so does the risk of developing cancer and dying after being diagnosed with cancer. However, caution is in order given that the data used to identify at-risk alleles were then used in the polygenic risk score. Although we did not simply take significant SNPs in creating the risk score, but used our permuted data to identify at-risk alleles, these results still warrant caution, especially in terms of the magnitude of the associations detected. Furthermore, to help place the risk observed in these data in the context of other risk factors for survival, it should be noted that disease stage remains the strongest predictor of survival, with those diagnosed at AJCC Stage 4 having over a 12-fold increased risk of dying compared with those diagnosed at a local disease stage.
The pathway approach we used was novel in that it summarized the statistical significance of the pathway and genes rather than focusing on individual SNPs. ARTP allowed us to combine single SNP p values using the rank truncated product statistic and to assess significance via permutations at multiple levels, including the gene, sub-pathway, and overall pathway level. While we selected genes that we believed were most important to the pathway, there are many other genes and SNPs involved in this pathway that could be important and contribute to colorectal cancer-specific mortality. We are also limited in our ability to assess interactions between genes and with lifestyle factors that could influence risk, since ARTP at this time does not allow for assessment of interactions. Unfortunately, we do not have a separate population to validate these findings and therefore encourage others with similar data to replicate them. Likewise, we did not attempt a test and training set, given the impact of that method on study power; lack of replication thus could be from lack of power. Another limitation of our assessment is the lack of data on treatment and other related medical conditions that could impact survival. While we can argue that it is unlikely that these genes and SNPs are associated with treatment, we do not have the ability to test that. However, treatment is highly correlated with AJCC stage, and we have adjusted for stage in our analysis.
It is noteworthy that our findings for colon and rectal cancer are for the most part different. There are several potential explanations for these findings. First, disease pathways could be different for the two cancer sites, and thus genes and sub-pathways that are important could also differ. Another explanation for these differences could be the smaller sample size for rectal than for colon cancer. This could explain the lack of replication of colon cancer findings in rectal cancer; however, it would not explain differences observed in rectal cancer that are not replicated in colon cancer. While the underlying cause of these differences is not clear, it has been observed that risk factors differ between colon and rectal cancer [11,47–54].
In conclusion, there is support that genes within the CHIEF pathway are associated with colorectal cancer-specific mortality, although the overall pathway was not statistically significantly associated with survival. Replication of these findings, along with more detailed assessment of the specific genes, may help identify key variants that contribute importantly to prognosis.