Risk of colorectal cancer in patients with diabetes mellitus: A Swedish nationwide cohort study

Background
Colorectal cancer (CRC) incidence is increasing among young adults below screening age, despite the effectiveness of screening in older populations. Individuals with diabetes mellitus are at increased risk of early-onset CRC. We aimed to determine how many years earlier than the general population patients with diabetes with/without family history of CRC reach the threshold risk at which CRC screening is recommended to the general population.

Methods and findings
A nationwide cohort study (follow-up: 1964–2015) involving all Swedish residents born after 1931 and their parents was carried out using record linkage of Swedish Population Register, Cancer Registry, National Patient Register, and Multi-Generation Register. Of 12,614,256 individuals who were followed between 1964 and 2015 (51% men; age range at baseline 0–107 years), 162,226 developed CRC, and 559,375 developed diabetes. Age-specific 10-year cumulative risk curves were used to draw conclusions about how many years earlier patients with diabetes reach the 10-year cumulative risks of CRC in 50-year-old men and women (most common age of first screening), which were 0.44% and 0.41%, respectively. Diabetic patients attained the screening level of CRC risk earlier than the general Swedish population. Men with diabetes reached 0.44% risk at age 45 (5 years earlier than the recommended age of screening). In women with diabetes, the risk advancement was 4 years. Risk was more pronounced for those with additional family history of CRC (12–21 years earlier depending on sex and benchmark starting age of screening). The study limitations include lack of detailed information on diabetes type, lifestyle factors, and colonoscopy data.

Conclusions
Using high-quality registers, this study is, to our knowledge, the first one that provides novel evidence-based information for risk-adapted starting ages of CRC screening for patients with diabetes, who are at higher risk of early-onset CRC than the general population.


Author summary
Why was this study done?
• Diabetes is associated with increased risk of colorectal cancer (CRC), especially in young adults before age 50.
• CRC incidence is increasing among young adults who are not targeted for screening.
• Diabetes has not been considered as a risk factor in any CRC screening guideline.
What did the researchers do and find?
• For each single age, we calculated the risk of developing CRC in the next 10 years; for example, at age 50, which is the most common age for starting CRC screening, the risk of developing CRC during next 10 years (age 50 to 59) in the Swedish population was 0.44% in men and 0.41% in women.
• Men and women with diabetes reached the risk levels for 50-year-old individuals (0.44% and 0.41%, respectively) at about age 45 instead of age 50, i.e., nearly 5 years earlier than the general population, whereas patients with an additional family history of CRC reached these screening risk thresholds 12 to 21 years earlier than the general population.

What do these findings mean?
• These findings for the first time provide evidence-based information about the best starting age of screening for CRC in patients with diabetes.
• A major strength of this study is the extremely large and comprehensive national (Swedish) datasets available and the long follow-up (all Swedish residents born after 1931 and their parents, followed up to 2015).
• Clinicians could inform patients with diabetes (with or without family history of CRC) about this possibility and encourage individualized counseling for CRC screening.

Introduction
Colorectal cancer (CRC) has become the third most common cancer worldwide and is the second leading cause of cancer death, despite being a preventable disease [1]. Since the emergence of CRC screening, myriad studies have demonstrated that screening for CRC is more beneficial than for any other major malignancy and that screening is more cost-effective than not screening [2,3]. In the United States, it has been highlighted that since the introduction of CRC screening, CRC incidence rates have declined [4]. However, the trend for all ages was found to mask a different pattern in young people. Since the 1980s, incidence increased by 1.0% to 2.4% per year in those aged 20 to 39 and by 0.5% to 1.3% per year in those aged 40 to 54; notably, an adult born in 1990 was observed to have twice the risk of CRC at the same age as an adult born in 1950 [4]. Similar patterns are observed in Europe, where an investigation of 143 million young adults across 20 countries showed CRC incidence rapidly rising in those who are below age 50 years [5]. This trend has been observed globally among young individuals, and therefore, screening guidelines should be adjusted accordingly [5,6]. Few efforts have been put forth to combat the issue of rising CRC incidence in young adults. Strategies have included lowering the age of screening for all individuals regardless of risk, which carries a high financial burden [7]. Alternatively, it has been suggested to identify risk factors that put young individuals at particularly high risk and to personalize the screening procedure. However, as of yet, most countries recommend a single average-risk age for CRC screening (most commonly at age 50) [8]. The few risk factors highlighted in guidelines for earlier CRC screening center on family history, inflammatory bowel disease (IBD), or rare genetic disorders, which alone cannot account for the widespread increase in young-onset CRC worldwide [8,9].
Hence, it is believed that targeting high-risk young people specifically is the best and least invasive approach, and investigating risk factors in young people is the way to combat this issue [3].
Diabetes mellitus and CRC share common risk factors and are both becoming more prevalent in young adults, and diabetes diagnosis has been consistently associated with an increased risk of CRC later in life. A recent study has also shown that having a diabetes diagnosis before the age of 50 increases the risk of early-onset CRC nearly 2-fold [10]. Despite this, diabetes has never been indicated in CRC screening guidelines as a risk factor [8]. Identifying potential risk groups for early-onset CRC has clinical significance if high-risk individuals are made aware of their risk and potentially screened earlier. We aimed to determine whether individuals with a diabetes diagnosis with and without family history of CRC reach the CRC risk of their peers in the general population at younger ages, and if so, how many years earlier. We used high-quality data from several long-standing nationwide Swedish registers, which resulted in, to the best of our knowledge, the world's largest and most robust study of its kind.

Methods
In this study, we used data from several nationwide registers from Sweden for all individuals born in Sweden since 1931 and their parents. The study dataset was created through the linkage of the data from Multi-Generation Register, Death Register, Swedish Cancer Registry, and national censuses using unique lifetime registration numbers assigned to all residents. The Multi-Generation Register contains genealogical information. The Death Register provides information on date of death, and the national censuses offer data on participants' migration records and similar demographic measures. The linked Swedish Cancer Registry data carried information on date of cancer diagnosis, tumor topography and morphology, and detailed diagnostic reports from physicians for the period 1958 to 2015. All cancer records were reported using International Classification of Diseases (ICD) codes from versions 7 through 10. In the linked Swedish family-cancer datasets, there were about 13 million individuals with genealogical information, of which more than 160,000 were patients with CRC diagnosed during the cancer registry period 1964 to 2015.
The abovementioned datasets and the Swedish National Patient Registers, which include data from all private and public hospitals and specialized doctor visits in Sweden, were linked together using pseudonymized identification numbers (S1 Fig). Hospital (inpatient) records from 1964 to 2015 and day clinic records from 2001 to 2015 with detailed information on disease diagnosis and date of visit were available for this study. Information on patients with diabetes was extracted using ICD codes (ICD-7: 260; ICD-8: 250; ICD-9: 250; ICD-10: E10, E11, E13, and E14). All individuals with pregnancy- and malnutrition-related diabetes as well as those with a diabetes diagnosis following a CRC diagnosis were excluded. We recognized family history of CRC in first-degree relatives (FDRs). In the analysis, personal history of diabetes and family history of CRC were treated as time-dependent variables. This means that all individuals were only recorded as diabetic cases from the year in which they were diagnosed. Similarly, an individual was only recorded as having CRC family history from the year in which the FDR was diagnosed. The rationale behind utilizing the dynamic (time-dependent) method is that it is understood to be the most appropriate for studies involving risk stratification since it provides real-time risk estimates that can be applied in clinical settings [11]. For instance, if a nondiabetic individual wants to know their risk of developing CRC at the present time, only their known histories can be taken into account, even if they were to become diabetic later in life. Alternatively, the static method (the traditional way of ascertaining family and personal disease history in studies) is possible in register-based studies where an individual's entire prior personal or family histories are known at the conclusion of study follow-up. As a result, the static method would be most appropriate for estimating the effect size that a certain risk factor has on an outcome.
We chose to employ the dynamic method since our primary aim was to provide risk-adapted starting age of CRC screening in patients with diabetes that could be used for real-time counseling. Furthermore, the dynamic method reflects the time-varying nature of disease histories, making it ideal for the purposes of this study.
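The time-dependent handling of diabetes status described above can be illustrated with a minimal sketch. It is not the authors' SAS code: the function name `allocate_person_years`, the whole-calendar-year granularity, and the example dates are all assumptions made for illustration. Each year of follow-up is counted as unexposed until the year of diabetes diagnosis and as exposed from that year onward.

```python
def allocate_person_years(birth_year, entry_year, exit_year, diabetes_year=None):
    """Split one person's follow-up into {(age, exposed): person_years}.

    Hypothetical helper: follow-up is counted in whole calendar years,
    from entry_year (inclusive) to exit_year (exclusive). A person is
    treated as exposed only from the year of diabetes diagnosis onward.
    """
    person_years = {}
    for year in range(entry_year, exit_year):
        age = year - birth_year
        exposed = diabetes_year is not None and year >= diabetes_year
        key = (age, exposed)
        person_years[key] = person_years.get(key, 0.0) + 1.0
    return person_years


# Hypothetical individual: born 1960, followed 1990-1999, diabetes in 1995.
example = allocate_person_years(1960, 1990, 2000, diabetes_year=1995)
unexposed_ages = sorted(age for (age, exposed) in example if not exposed)
exposed_ages = sorted(age for (age, exposed) in example if exposed)
print(unexposed_ages)  # ages contributing non-diabetic person-time
print(exposed_ages)    # ages contributing diabetic person-time
```

Under a static approach, by contrast, all ten years of this hypothetical person would be counted as exposed, which is the source of the immortal time bias the dynamic method avoids.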
The main outcome measure in the analysis was 10-year cumulative risk, i.e., the risk (%) of developing CRC within the next 10 years at each age. Risk-adapted screening ages in patients with diabetes were calculated using 10-year cumulative risk of CRC. The 10-year cumulative risks were calculated using the following formulas:
• Age-specific incidence rate = Total cases at each age (every 1 year) divided by the total person-years at that age
• 10-year cumulative rate for age X = Sum of 10 consecutive yearly age-specific incidence rates from age X to age X+9
• 10-year cumulative risk = 1 − exp(−10-year cumulative rate)
Rather than aggregating cumulative incidence by age groups (the traditional method of calculating cumulative risk), precise age-specific values from individual participants' yearly data were used in the calculation [12]. Comparing the 10-year cumulative risk in each risk group to the population 10-year cumulative risk allowed the inference of risk-adapted screening ages. A moving average was used as a smoothing step to reduce random variation in incidence rates. For instance, for the 10-year cumulative risk at age 30, the average of the 10-year cumulative risks at ages 29, 30, and 31 was used, while for age 31, the average of the 10-year cumulative risks at ages 30, 31, and 32 was used, and so on. This method of calculating risk-adapted starting ages has already been used for some other conditions [13][14][15].
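The three formulas and the 3-point moving average can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the constant-rate toy data are hypothetical, and the study itself used SAS on register data.

```python
import math


def ten_year_cumulative_risk(cases, person_years, start_age):
    """10-year cumulative risk from start_age to start_age + 9.

    cases and person_years are dicts keyed by single year of age.
    """
    cumulative_rate = 0.0
    for age in range(start_age, start_age + 10):
        # Age-specific incidence rate = cases / person-years at that age.
        cumulative_rate += cases[age] / person_years[age]
    return 1.0 - math.exp(-cumulative_rate)


def smoothed_risk(cases, person_years, age):
    """Average of the 10-year cumulative risks at age - 1, age, and age + 1."""
    risks = [
        ten_year_cumulative_risk(cases, person_years, a)
        for a in (age - 1, age, age + 1)
    ]
    return sum(risks) / 3.0


# Hypothetical toy data: 5 cases per 100,000 person-years at every age 29-41.
toy_cases = {age: 5 for age in range(29, 42)}
toy_person_years = {age: 100_000.0 for age in range(29, 42)}

risk_at_30 = ten_year_cumulative_risk(toy_cases, toy_person_years, 30)
smoothed_at_30 = smoothed_risk(toy_cases, toy_person_years, 30)
print(f"{100 * risk_at_30:.4f}%")
```

With a constant rate, the smoothed and unsmoothed values coincide; with real data the moving average dampens year-to-year noise in sparse age bands.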
Using this method, we could provide the age at which patients with diabetes with/without family history of CRC reached a similar level of CRC risk to that of a 50-year-old individual in the general population, the most commonly recommended age of first screening in guidelines [8]. We also provided the same for 45, 55, and 60 year olds as they represent the variability in starting ages of CRC screening globally. Covariates included age and sex. As a sensitivity analysis, we repeated the 10-year cumulative risk analysis in men, removing all individuals with a prior diagnosis of IBD, an established CRC risk factor, to ensure they did not confound our analysis [16]. All statistical analyses were conducted using SAS version 9.4 (SAS Institute, Cary, North Carolina, USA). To avoid risk of identification of participants, researchers had access only to pseudonymized secondary data. The main analyses were planned before starting the execution of data analyses. However, further analyses were performed in response to reviewers' comments, such as adding supplementary tables of basic characteristics and a table for 10-year cumulative risk by age group, with no influence on our main findings. No data-driven changes to analyses took place.
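As a rough sketch of how a risk-adapted starting age could be read off a risk curve, the snippet below returns the youngest age at which a group's 10-year cumulative risk reaches the benchmark. The rule "first age at or above the benchmark" is an assumption about the comparison, and the risk curve is hypothetical; only the 0.44% benchmark for 50-year-old men comes from the text.

```python
def risk_adapted_age(risk_by_age, benchmark_risk):
    """Youngest age whose 10-year cumulative risk is >= benchmark_risk.

    risk_by_age maps single year of age to 10-year cumulative risk
    (as a proportion). Returns None if the benchmark is never reached.
    """
    for age in sorted(risk_by_age):
        if risk_by_age[age] >= benchmark_risk:
            return age
    return None


# Benchmark from the text: 10-year risk of 50-year-old men, 0.44%.
benchmark_age = 50
benchmark_risk = 0.0044

# Hypothetical smoothed risk curve for a higher-risk group (illustrative values).
hypothetical_curve = {43: 0.0035, 44: 0.0040, 45: 0.0044, 46: 0.0049}

start_age = risk_adapted_age(hypothetical_curve, benchmark_risk)
years_earlier = benchmark_age - start_age
print(start_age, years_earlier)
```

The same function can be reused for any benchmark age (45, 55, or 60) by swapping in that age's general-population risk.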

Ethics statement
The study protocol was approved by the Lund Regional Ethics Committee (2012/795).

Results
From the beginning of follow-up, a total of 12,614,256 individuals with genealogical information were included in the analysis (51% men; age range at baseline 0 to 107 years). From this population, 162,226 patients with CRC were identified. Additionally, a total of 559,375 patients with diabetes were identified, and 101,135 (18%) of them were diagnosed before age 50. Among patients with diabetes, the mean time to CRC diagnosis was 5.8 years. Further characteristics of patients with diabetes and patients with CRC are presented in Tables 1 and 2. The 10-year cumulative risk estimates of developing CRC in patients with diabetes with and without family history of CRC by sex and age group are presented as the Supporting information in S1 Table.

Benchmark age 50
Our results in terms of 10-year cumulative risk (Figs 1 and 2) showed that for 50-year-old men in the general Swedish population, risk of developing CRC within the next 10 years was 0.44%. The 10-year cumulative risk for 50-year-old women in the general Swedish population was 0.41%. Men with no family history of CRC but with a diabetes diagnosis before age 50 reached the same 10-year cumulative risk of CRC as 50-year-old men in the general Swedish population at age 45, i.e., 5 years earlier, whereas women with no family history of CRC but with a diabetes diagnosis before age 50 were observed to reach the same 10-year cumulative risk as 50-year-old women in the general Swedish population at age 46, i.e., 4 years earlier. Men and women with diabetes and family history of CRC attained the population level of 10-year cumulative risk at age 32 (18 years earlier) and age 38 (12 years earlier), respectively. Men without diabetes or a CRC family history reached the population level of risk at age 51 (1 year later).

Benchmark age 45
Our results in terms of 10-year cumulative risk (Table 3) showed that for both 45-year-old men and women in the general Swedish population, risk of developing CRC within the next 10 years was 0.24%. Men with no family history of CRC but with a diabetes diagnosis before age 45 reached the same 10-year cumulative risk of CRC as 45-year-old men in the general Swedish population at age 40, i.e., 5 years earlier, whereas women with no family history of CRC but with a diabetes diagnosis before age 45 reached the same risk level as 45-year-old women in the general Swedish population at age 42, i.e., 3 years earlier. Men and women with diabetes and family history of CRC attained the population level risk at age 31 (14 years earlier).

Other benchmark ages (55 and 60)
As different countries have different benchmark ages for initiation of CRC mass screening in the population (ranging from 45 in the US to 55 to 60 in the United Kingdom), we provided risk-adapted starting ages of CRC screening for different benchmark ages (45, 50, 55, and 60 years; Table 3). Those with a personal history of diabetes and no family history of CRC reached the population level of risk 4 to 5 years earlier than the general Swedish population for benchmark ages of screening 55 and 60. By contrast, those with both diabetes and family history of CRC reached the general Swedish population risk 21 years earlier (men) and 14 to 15 years earlier (women). Finally, both men and women without diabetes and CRC family history reached the population level of risk 1 year later than the general Swedish population (age 56 for benchmark age 55 and age 61 for benchmark age 60).

Comparison with existing guidelines
A comparison between our findings for patients with diabetes with a CRC family history and established screening guidelines for individuals with an FDR with CRC revealed a wide range of differences between our recommended risk-adapted starting ages of screening and those in the current guidelines (from 5 to 21 years), although the difference for other example ages could be even higher (S2 Table). For patients with diabetes without family history of CRC, the corresponding difference was 3 to 5 years depending on sex and benchmark age of mass screening (Table 3).

Ten-year cumulative risk after removing patients with IBD
We also repeated the analysis after excluding patients with IBD. A total of 445,444 cases of IBD (185,869 men; 44%) were excluded from the analysis. Of all IBD cases, 5,957 (1,613 men; 27%) preceded a CRC diagnosis, and 19,232 IBD cases (6,751 men; 35%) were comorbid with diabetes. No substantial changes in our main estimates were detected after exclusion of IBD cases.

Discussion
Using several high-quality Swedish nationwide registers, we found that patients with diabetes, without family history of CRC, reach the same level of CRC risk as 50-year-old individuals in the general population about 4 to 5 years earlier. The associations between diabetes, family history of CRC, and CRC risk have already been reported [17][18][19]. However, there has been no study to date that assessed how these risk associations can be used in clinical counseling of patients with diabetes with and without family history of CRC and offered risk-adapted starting ages of CRC screening for them. Our current study provides this novel and clinically useful information. Another novel aspect of this study in comparison to others that investigated CRC risk in patients with diabetes is the use of 10-year cumulative risk to plot changes in CRC risk by age [18].
In our study, we compared the 10-year cumulative risk of CRC for different combinations of sex, age, CRC family history, diabetes status, and benchmark ages for starting screening. We used a benchmark age of 50 years as an example since this is the recommended age of first screening by most CRC screening guidelines [8]. Our results show that patients with diabetes reach the Swedish population level of 10-year cumulative risk several years earlier, but when also considering that young patients with diabetes have a much higher risk of early-onset CRC as opposed to late-onset CRC, screening even in the 30s might be warranted in people with both CRC family history and diabetes. Although CRC screening in the 30s is unusual, when considering that the mean time for an adenomatous polyp to progress to CRC is between 10 and 12 years [20] and that CRC incidence is rapidly rising in those below screening age, it could be justified. Although the results of randomized trials of colonoscopy use are not yet available, elevated CRC rates in young adults have been observed and need action [21][22][23]. It has also been reported that overall CRC screening is effective and cost-effective and that a risk-adapted approach is the best [2,24]. Our findings showed that risk-adapted CRC screening by diabetes personal history with and without family history of CRC might be beneficial. Furthermore, similar trends in 10-year cumulative risk of CRC in both men and women with diabetes demonstrate internal validity of our results, and minor differences are in line with the known higher risk of CRC in men than in women. It is noteworthy, however, that the evaluation of cost-effectiveness of risk-adapted CRC screening, specifically for patients with diabetes, warrants further investigation.
Our study benefited from several high-quality Swedish nationwide register datasets, including Swedish Cancer Registry, Multi-Generation (genealogy) Register, national censuses, and Inpatient and Outpatient Registers with roughly half a century of follow-up. These resources enabled us to design the world's largest and most robust study of its kind. All datasets were linked through pseudonymized identification number, removing traditional limitations of studies, such as biases due to self-reporting CRC diagnosis, family history of CRC, and also diagnosis of diabetes. Furthermore, this long-term cohort study allowed us to establish CRC incidence over time with 10-year cumulative risk so as to measure risk dynamically with age. This is a more detailed look at CRC risk as compared to just the use of relative risk measures, such as standardized incidence ratio or hazard ratio, used by most population-based studies since we were able to compare all risk groups at various ages, rather than produce a single estimate of relative risk [17]. Another strength of this study was the use of time-dependent history of diseases. Since we had precise information on date of diagnosis of CRC in individuals, in their family members, and date of diabetes diagnosis, we were able to ensure all instances of CRC family history and diabetes diagnosis occurred before CRC diagnosis. This means that we were able to avoid potential issues of reverse causation. The time-dependent method in this study is preferred for risk stratification and identifying individuals for risk-adapted screening since it reflects the dynamic nature of developing diabetes and diagnosis of CRC in family members [11]. In addition, we were able to avoid limitations common in most studies that treat disease history as static conditions, such as immortal time bias and exposure misclassification by ensuring individuals were considered as diabetic cases from the date of diagnosis and non-cases until that point.
One of the limitations of our study was minimal access to data on lifestyle factors. Type 2 diabetes and CRC share several risk factors including obesity and lack of regular physical activity [25][26][27]. However, previous cohort studies have shown that controlling for common risk factors of CRC and type 2 diabetes, such as obesity and diet, does not significantly modify CRC risk estimates in patients with diabetes [28,29]. In a related study, we had data on hospitalization for chronic obstructive pulmonary disease (COPD, a surrogate measure for smoking), obesity, and alcohol use disorder. Adjustment for these risk factors did not alter those results.
We could not stratify our analyses by diabetes type because the ICD codes for diabetes diagnosis in our dataset did not accurately differentiate the type of diabetes until 1997 (ICD-10), and even after that the majority had both diagnoses, which might correspond to the older definition of insulin-dependent and non-insulin-dependent diabetes mellitus rather than the actual type of diabetes. Type 1 diabetes (which, unlike type 2 diabetes, does not share risk factors with CRC and is usually diagnosed early in life) has also been associated with a higher risk of CRC. This suggests that the association between diabetes and CRC is not purely dependent on lifestyle factors and that diabetes, irrespective of type, is therefore an ideal candidate for risk-adapted CRC screening [30].
Another limitation was lack of colonoscopy data to ensure elevated risk of CRC was not confounded by the possibility that patients with diabetes and patients with CRC family history are more likely to be screened for CRC. In a related study, we evaluated risk of CRC in patients with diabetes by calendar period and did not find substantial differences in risk of familial CRC [31]. The lack of CRC screening data also did not have a significant impact on our findings and the potential implication of their application. This is because a nationwide organized CRC screening does not exist in Sweden. An organized screening as an official recommendation (not a law) has been introduced only as a pilot phase in 2008 in the Swedish Stockholm Gotland area merely for age 60 to 69, where even invitational coverage accounted for less than 9% of the nationwide screening-eligible population (age 50 to 74) [32]. Furthermore, patients with diabetes have been recognized to be poor at adhering to diabetes treatment recommendations [33], making it unlikely that they would seek out CRC screening more so than a person in the general population. As a sensitivity analysis, we also removed patients with IBD from our analysis to ensure they did not confound the association between diabetes and CRC and found minimal attenuation to the results.
Since there is a wide disparity in CRC screening guidelines globally for age of first screening, such as age 55 in the Netherlands, age 55 or 60 in the UK (depending on location), age 45 in the US, and age 50 in most other countries such as Germany [8,34], we provided results for various benchmarks. In fact, the applied method can be "personalized" to fit any population or any preferred benchmark age of initial screening in the general population. We found that for all benchmark ages of screening, those with combined CRC family history and diabetes personal history reach the Swedish population level of 10-year cumulative risk much earlier than those with CRC family history or diabetes personal history alone, suggesting that both criteria contribute to CRC risk differentially. Regardless of the specific benchmark, however, the results of our study may be informative for the development of personal risk calculators, which, possibly in combination with other established factors or genetic risk scores [35][36][37], could be used for calculating personalized starting ages of screening in the future. The method of integrating such results into other risk prediction models with more risk factors of cancer (but no information on diabetes) has been discussed elsewhere [15]. Further discussion of the importance of earlier screening in patients with diabetes and the generalizability of our findings is included in S1 eDiscussion as Supporting information, which also contains an explanation of age-specific incidence of CRC in Sweden over time (S3 Fig).

Conclusions
The present study provides population-based evidence for potential risk-adapted starting ages of CRC screening in patients with diabetes with and without family history of CRC. With CRC incidence rising among young adults and the accumulation of evidence associating diabetes with early-onset CRC risk, we observed that patients with diabetes in Sweden reach the general population level of CRC risk several years earlier. Patients with both diabetes and family history of CRC reached the population level of risk 1 to 2 decades earlier than the general Swedish population. Irrespective of disparity and uncertainty regarding the optimal age of screening for average-risk individuals globally, our evidence-based results identify a novel risk group who may benefit from earlier initial screening. Despite lack of data regarding type of diabetes and lifestyle factors, our findings warrant investigation into the potential advantages, disadvantages, and efficacy of screening patients with diabetes earlier. Our findings thereby support considering a risk-adapted approach to CRC screening or, at the very least, can be used to inform those with diabetes about how many years earlier than the general population they could initiate CRC screening.