Diagnostic Intervals and Its Association with Breast, Prostate, Lung and Colorectal Cancer Survival in England: Historical Cohort Study Using the Clinical Practice Research Datalink

Rapid diagnostic pathways for cancer have been implemented, but evidence on whether shorter diagnostic intervals (time from primary care presentation to diagnosis) improve survival is lacking. Using the Clinical Practice Research Datalink, we identified patients aged >15 years who were diagnosed with female breast (8,639), colorectal (5,912), lung (5,737) and prostate (1,763) cancers between 1998 and 2009. Presenting symptoms were classified as alert or non-alert, according to National Institute for Health and Care Excellence guidance. We used relative survival and excess risk modeling to determine associations between diagnostic intervals and five-year survival. The survival of patients with colorectal, lung and prostate cancer was greater in those with alert, compared with non-alert, symptoms, but findings were opposite for breast cancer. Longer diagnostic intervals were associated with lower mortality for colorectal and lung cancer patients with non-alert symptoms (colorectal cancer: Excess Hazards Ratio, EHR >6 months vs <1 month: 0.85; 95% CI: 0.72-1.00; lung cancer: EHR 3-6 months vs <1 month: 0.87; 95% CI: 0.80-0.95; EHR >6 months vs <1 month: 0.81; 95% CI: 0.74-0.89). Prostate cancer mortality was lower in patients with longer diagnostic intervals, regardless of type of presenting symptom. The association between diagnostic intervals and cancer survival is complex, and its interpretation should take into account cancer site, tumour biology and clinical practice. Nevertheless, unnecessary delay causes patient anxiety and general practitioners should continue to refer patients with alert symptoms via the cancer pathways, and actively follow up patients with non-alert symptoms in the community.


Introduction
Between 1995 and 2007, the five-year relative survival from breast, colorectal, lung and prostate cancers was 7 to 10% lower in the UK compared with other developed countries [1][2][3]. This was attributed to delayed diagnosis or presentation at a very advanced stage [2,4]. Rapid diagnostic pathways and targets were implemented within the UK National Health Service (NHS), with the aim of improving cancer outcomes and increasing cancer survival in the UK [5,6]. The National Institute for Health and Care Excellence (NICE) also set out referral guidelines to expedite the referral of patients with symptoms that are directly suggestive of cancer [7,8].
In the cancer patient pathway, primary care and referral delays account for a bigger proportion of the total delay in days compared to secondary care delay [9]. Shortening diagnostic intervals (time between presentation to health care professionals and diagnosis) was made a priority as part of the early diagnosis initiative by the UK Department of Health [10]. Shorter diagnostic intervals may result in earlier tumour stage at diagnosis, which could then lead to improved outcomes, including survival [11,12]. However, evidence shows that the effect of diagnostic intervals on survival differs by cancer site [11][12][13][14][15][16][17][18][19][20][21][22][23], with some studies suggesting that longer diagnostic intervals are associated with higher mortality for cancers of the urinary tract, colon and breast [11,22,24], while others variously report an absence of any association for breast, colorectal, lung and gastro-oesophageal cancers [13,15,18,22,24], higher mortality with shorter diagnostic intervals for lung cancers [19,20], or higher mortality with both shorter and longer diagnostic intervals for colorectal cancer (i.e. a U-shaped relationship) [21].
This variation in associations between diagnostic interval and survival may reflect differences in clinical detection pathways, patient and physician behaviour, functioning of the health care system, and the biological behaviour of the tumour [13]. The role of tumour aggressiveness has been suggested to explain some of the counter-intuitive findings, where patients with longer diagnostic intervals show better survival compared to patients with shorter diagnostic intervals (waiting time paradox) [12]. More aggressive tumours are likely to cause symptoms that would draw attention to the underlying cancer and may prompt earlier diagnosis, but would also spread rapidly and result in poorer prognosis [16,21,24]. Slow-growing tumours produce non-definitive symptoms that could lead to a longer diagnostic interval [16,21], but would also allow time for treatment [24]. Given this potential for 'confounding by indication', symptom presentation should be considered in analyses of associations between diagnostic interval and survival. Using an historical cohort of patients with breast, colorectal, lung and prostate cancer, we assessed associations of diagnostic interval (time from presentation in primary care to diagnosis) with five-year survival, and stratified these by NICE-qualifying alert and non-alert symptom presentations [7,8].

Data sources
Data for this analysis came from the Clinical Practice Research Datalink (CPRD, formerly General Practice Research Database or GPRD), a large computerized database of anonymised primary care medical records [26]. It currently includes prospectively gathered administrative, clinical and prescribing records for about 5 million active patients from over 600 primary care practices throughout the UK, equating to 7% of the population [26]. Individuals registered on the database are representative of the age, sex and geographical distribution of the UK population. Data are subject to thorough validation, audit and quality checks [26,27] and there is a high level of diagnostic validity for cancers [25,28]. The CPRD records were linked to the National Cancer Data Repository (NCDR) and to the 2007 English Indices of Multiple Deprivation (IMD) datasets. These linkages were only performed for records from general practices in England that agreed to such linkages (about 52% of CPRD practices). The NCDR captures data from the Merged Cancer Registry (containing clinical and tumour characteristics), the inpatient Hospital Episode Statistics (HES, source of ethnicity data) and the Office for National Statistics (ONS, source of mortality data) [29]. The IMD dataset contains the 2007 IMD score [30], which was used in the study as an indicator of the level of deprivation (see below). The linked datasets were provided to the researchers in an anonymised form.

Study variables
Diagnostic interval was defined as the time between the date of presentation in primary care and the date of cancer diagnosis. This interval was categorised a priori as <1 week, 1-2 weeks, 3-4 weeks and >1 month for breast cancer, and <1 month, 1-2 months, 3-6 months and >6 months for colorectal, lung and prostate cancer, based on time intervals that the clinical researchers (MR and RM) considered meaningful to clinical practice. We used different categories for the time intervals for breast cancer as 75% of patients were seen within 31 days of symptom presentation compared to 130 days or more for the other cancer sites.
A primary care presentation was defined as the earliest date of consultation with a general practitioner (GP), occurring up to one year prior to the first cancer diagnosis, where a patient was recorded by the GP as having either a NICE-qualifying alert or non-alert symptom. A NICE-qualifying alert symptom was defined as a symptom suggestive of cancer that requires urgent referral based on the NICE guidelines [7]. A non-alert symptom was defined as a symptom associated with a low risk of cancer but still predictive of it, and was based on symptoms reported in the published literature [31]. Symptoms were site-specific and the classifications were agreed by clinical researchers and epidemiologists (MR, RM and MTR). We excluded investigations such as PSA testing and chest x-ray, as these will be preceded by symptoms that would have led to the investigation. The age criterion was applied for symptoms of colorectal cancer with a qualifying age. For symptoms with a qualifying duration (for example, persistent or present for a number of weeks) and description (for example, hard), we assumed that the symptoms fulfilled the criterion. The date of diagnosis was defined as the date of the first event or event of higher priority appearing in the patients' medical records (if recorded within three months of the first event) among the following, in order of priority: histological or cytological confirmation, admission to the hospital or first consultation at the outpatient clinic because of the malignancy [32]. All definitions were in line with the Aarhus checklist for early cancer-diagnosis research [33].
Survival was defined as the number of days between the date of diagnosis and the date of outcome (death or censoring). Follow-up was censored at 5 years, as is commonly practiced in popula-  [39]; prostate: adenocarcinoma (8140) and other types [40]. Tumour subsite (for colorectal cancer) was colon, rectosigmoid or rectum.
Tobacco smoking status (specific for lung cancer patients) was defined as the last status recorded prior to cancer diagnosis, classified as non-smoker, current smoker and ex-smoker. Comorbidity was measured using the CPRD-based Charlson Comorbidity Index [41], derived for each patient for comorbid conditions occurring in the one-year period prior to the date of cancer diagnosis, and categorised as 0, 1 or 2 or more co-morbid conditions. Treatment was included as a proxy variable for stage, and refers to surgery, radiotherapy, chemotherapy and hormone therapy. Each treatment regimen was treated as an individual variable, coded as no treatment, received treatment or unknown. The number of consultations prior to presentation is the total number of consultations within the one-year period prior to the date of the first presentation with a symptom of cancer (either NICE-qualifying alert or non-alert). This number was used as a proxy for propensity to consult with a clinician, to take into account the increased likelihood of being diagnosed with each additional consultation, and was categorised as 0, 1-2, 3-5, 6-10 or more than 10.
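The definition and banding of the diagnostic interval described above can be illustrated with a minimal Python sketch. The day cut-points are assumptions made for illustration only: the text gives the category labels but not the exact boundaries in days, and the function and variable names are hypothetical.

```python
from bisect import bisect_right
from datetime import date

# ASSUMED day cut-points: the paper reports only the category labels,
# so these boundaries are illustrative, not the authors' definitions.
BANDS = {
    "breast": ([7, 21, 31], ["<1 week", "1-2 weeks", "3-4 weeks", ">1 month"]),
    "other": ([30, 90, 183], ["<1 month", "1-2 months", "3-6 months", ">6 months"]),
}


def diagnostic_interval_days(presentation: date, diagnosis: date) -> int:
    """Days from first primary care presentation to cancer diagnosis."""
    days = (diagnosis - presentation).days
    # Presentations are only counted up to one year before diagnosis.
    if days < 0 or days > 366:
        raise ValueError("presentation must fall within the year before diagnosis")
    return days


def interval_band(days: int, site: str) -> str:
    """Map an interval in days to the category used for the given cancer site."""
    cuts, labels = BANDS["breast" if site == "breast" else "other"]
    # bisect_right counts how many cut-points are <= days, i.e. the band index.
    return labels[bisect_right(cuts, days)]


days = diagnostic_interval_days(date(2005, 1, 1), date(2005, 3, 1))
band = interval_band(days, "lung")
```

Keeping the cut-points in a single table makes it easy to rerun the banding under different boundary assumptions as a sensitivity check.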

Data Analysis
Relative survival (RS) is a measure of survival that accounts for mortality due to causes other than cancer. It is the ratio of the observed survival of cancer patients to the probability of survival that would have been expected if patients had the same survival probability as the general population [42]. Estimates of relative survival were stratified by the nature of the symptoms (NICE-qualifying alert and non-alert) and were computed using the complete approach (where all patients diagnosed between 1998 and 2009 were included, regardless of whether they had full five-year or partial follow-up) [43]. These estimates were expressed as percentages and were computed using the STRS command in STATA. Survival probabilities were estimated at intervals of 6 months in the first year, then yearly up to 5 years. We used age-, sex-, region- and deprivation-specific single-year life tables [44] to account for differences in the underlying mortality, and used the Ederer II method [42] to determine expected survival. We also estimated five-year relative survival conditional on survival to 1 year, to account for the effects of factors that strongly influence survival in the first year after diagnosis. This was done to take into account the effect of stage, in the absence of a staging variable that could be used for breast, lung and prostate cancers. The conditional estimate is the cumulative four-year survival (at the fifth anniversary of diagnosis) for patients who were alive at the end of the first year. To determine the association between diagnostic interval and mortality, Excess Hazards Ratios (EHR) at five years were computed using a generalised linear model with a Poisson error structure [45]. The EHR is the ratio of the mortality rate in the presence of one factor (e.g. White ethnicity) to the mortality rate in the absence of the same factor, once the reference population mortality is taken into account [45]. EHRs can be interpreted as equivalent to the mortality risk ratio.
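The quantities described above can be written out as follows. This is a sketch that assumes the standard additive excess-hazard formulation commonly used with Poisson-error generalised linear models; the notation ($S_O$, $S_E$, $\lambda^{*}$, $d_j$, $d_j^{*}$, $y_j$) is introduced here for illustration and does not come from the text.

```latex
% Relative survival: observed survival over expected (life-table) survival
R(t) = \frac{S_O(t)}{S_E(t)}

% Five-year relative survival conditional on surviving the first year
R(5 \mid 1) = \frac{R(5)}{R(1)}

% Additive hazard model: total hazard = expected hazard + excess hazard
\lambda(t \mid x) = \lambda^{*}(t) + \exp(x^{\top}\beta),
\qquad \mathrm{EHR}_k = \exp(\beta_k)

% Poisson formulation for interval j: d_j observed deaths,
% d_j^{*} expected deaths, y_j person-time at risk
d_j \sim \mathrm{Poisson}(\mu_j), \qquad
\ln\left(\mu_j - d_j^{*}\right) = \ln(y_j) + x_j^{\top}\beta
```

Under this formulation, an EHR below 1 for a longer diagnostic interval means lower excess (cancer-attributable) mortality than in the reference category of the shortest interval.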
Univariable and multivariable models were built, specific for each cancer site, and stratified by the nature of the symptoms. Multivariable models controlled for the effects of age, sex (for colorectal and lung cancers), ethnicity, region of residence, level of deprivation, period of cancer plan implementation, stage and tumour subsite (for colorectal cancer), tumour differentiation, morphology, tobacco smoking status (for lung cancer), comorbidity, treatment and number of consultations prior to symptom presentation. We also tested for evidence of an interaction between diagnostic interval and age (with age as a binary variable, dichotomised at 60 years). We used the likelihood ratio test to determine goodness of fit.
We employed multiple imputation using chained equations (ICE) to account for missing data on tumour differentiation, morphology, Dukes' stage (colorectal cancer only), deprivation quintile, surgery, radiotherapy, chemotherapy and hormone therapy (Tables 1 and 2) [46,47]. Imputation models were derived for each missing variable and included: the exposure of interest (diagnostic interval); the incomplete variables; all other covariables; and the outcome (survival time and whether the patient was dead or censored). Since data for tumour differentiation were missing for more than half of the lung and prostate cancer patients, this variable was not imputed for these cancer sites, and the category of unknown was included in the analyses. A total of 20 complete datasets were constructed to reduce sampling variability from the imputation process [48] and the results were combined using Rubin's Rules [46,47]. The distributions of the imputed variables were similar to the distributions of the measured variables. All regression analyses were based on the imputed dataset.
All statistical analyses were carried out using STATA v12 software [49].
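As an illustration of how results from the imputed datasets are combined, the following Python sketch applies Rubin's Rules to log-EHR estimates. It is not the authors' STATA code: the function names are hypothetical, the inputs in the usage lines are made-up numbers, and the confidence interval uses a normal approximation rather than the t-based reference distribution normally used with Rubin's Rules.

```python
from math import exp, sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and their variances with Rubin's Rules.

    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


def pooled_ehr(log_ehrs, std_errors, z=1.96):
    """Pooled EHR with an approximate 95% CI (normal approximation, a simplification)."""
    pooled, total = pool_rubin(log_ehrs, [se ** 2 for se in std_errors])
    se = sqrt(total)
    return exp(pooled), exp(pooled - z * se), exp(pooled + z * se)


# Made-up inputs purely to exercise the functions; not study results.
pooled, total = pool_rubin([0.1, 0.2, 0.3], [0.01, 0.01, 0.01])
ehr, lower, upper = pooled_ehr([0.1, 0.2, 0.3], [0.1, 0.1, 0.1])
```

Pooling is done on the log scale because log hazard ratios are closer to normally distributed than the ratios themselves; the result is exponentiated only at the end.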

Results
The final sample was comprised of 8,639 female breast, 5,912 colorectal, 5,737 lung and 1,763 prostate cancer patients. The distributions of the cases by the different socio-demographic, tumour and clinical characteristics are presented in Tables 1-2 and S1 Table. The majority (94.3%) of breast cancer patients presented with alert symptoms in the year prior to diagnosis,  Table 3). In all four cancer sites, patients who first presented with non-alert symptoms had longer diagnostic intervals than patients who first presented with alert symptoms (Table 3). Amongst breast cancer patients, 92.5% presented with a breast lump, an alert symptom, with a median diagnostic interval of 14 days (IQR: 9-28; S2 Table).
The two most common presenting symptoms for colorectal cancer were abdominal pain (28.0%; median: 84; IQR: 33-175; S3 Table), a non-alert symptom, and rectal bleeding (20.7%;  ). The five-year relative survival estimates conditional on surviving the first year were all higher than the five-year relative survival estimates and followed the patterns of the relative survival estimates by symptom category (Table 5).

Excess mortality modelling
We found no evidence of an association between diagnostic interval and mortality for breast, colorectal and lung cancer (Table 6). There was some evidence that longer diagnostic intervals were associated with lower mortality amongst men with prostate cancer. There was also no evidence of an interaction between age and diagnostic interval on survival (p-values for interaction: breast = 0.35; colorectal = 0.31; lung = 0.13; prostate = 0.34).
Associations between diagnostic interval and mortality for each cancer site varied in direction when stratified by classification of presenting symptom. We found no evidence of higher excess mortality with longer diagnostic intervals among women with breast cancer presenting with NICE-qualifying alert symptoms (p-values with diagnostic interval of <1 week as reference: 1-2 weeks = 0.850; 3-4 weeks = 0.055; >1 month = 0.346). There was some evidence that both shorter and longer diagnostic intervals were associated with decreased mortality for women who presented with non-alert symptoms, but multivariable analysis could not be done, as the models did not converge, due to the small number of deaths recorded in this group (n = 71). Among colorectal cancer patients presenting with NICE-qualifying alert symptoms, we observed higher mortality for both shorter and longer diagnostic intervals, but all results were imprecisely estimated (wide confidence intervals; p-values with diagnostic interval of <1 month as reference: 1-2 months = 0.216; 3-6 months = 0.053; >6 months = 0.961). There was inconclusive evidence that a shorter diagnostic interval was associated with increased mortality among colorectal cancer patients who presented with non-alert symptoms (p-values with diagnostic interval of <1 month as reference: 1-2 months = 0.493; 3-6 months = 0.349; >6 months = 0.049).
There was little variation in the diagnostic interval-mortality associations among lung cancer patients who presented with NICE-qualifying alert symptoms (p-values with diagnostic interval of <1 month as reference: 1-2 months = 0.919; 3-6 months = 0.069; >6 months = 0.872). Among patients presenting with non-alert symptoms, there was some evidence that mortality was lower among patients who had diagnostic intervals of 3 months or more compared with those with shorter diagnostic intervals of less than 1 month (EHR 3-6 months vs <1 month: 0.87; 95% CI: 0.80-0.95; EHR >6 months vs <1 month: 0.81; 95% CI: 0.74-0.89).

Discussion
Summary
Colorectal, lung and prostate cancer patients with NICE-qualifying alert symptoms had lower mortality compared to patients with non-alert symptoms, but findings were opposite for breast cancer. For colorectal and lung cancers, longer diagnostic intervals were associated with lower mortality for patients with non-alert symptoms. The risk of excess mortality in men with prostate cancer decreased with longer diagnostic intervals, regardless of symptom classification.

Strengths and limitations
Our study is one of the few that have looked at the associations of diagnostic intervals with cancer survival, and even fewer studies have stratified by NICE-qualifying alert and non-alert symptoms. We used secondary data from the CPRD, cancer registries, HES and ONS in England, which are databases known to be of high quality [25,50]. However, our study is not without limitations. We did not have pertinent information on all factors that could affect survival. Data on stage were only available for colorectal cancers and we had limited information on tumour differentiation for lung and prostate cancers. Nevertheless, we adjusted for treatment received (surgery, radiotherapy, chemotherapy and hormone therapy), which is a proxy indicator of disease severity, and we computed conditional relative survival estimates, to take into account factors influencing survival in the first year after diagnosis.
Our study only included patients who consulted with a GP, representing those who are directly affected by the rapid diagnostic pathways specified in the cancer waiting time guidelines. The exclusion of patients diagnosed through emergency routes could have caused an underestimate of any observed association between short diagnostic intervals and increased mortality. Our findings would therefore only be strengthened by the inclusion of emergency presentations that bypassed the GP. Positive associations of diagnostic interval with mortality have been reported by other studies [15,17,19-22,51] and exclusion of emergency presentations alone was not sufficient to explain the results. The exclusion of patients diagnosed via screening could have resulted in an overestimate of the observed associations, since these patients tend to have short diagnostic intervals but better survival. However, their inclusion would have only affected breast and prostate cancer patients. It would also have resulted in a lead time bias that could have overestimated the excess hazards ratios, particularly for prostate cancer.
Our estimates of diagnostic intervals were based on the date of the earliest recorded symptom within a period of one year prior to diagnosis. This symptom record could reflect the consultation when the GP thought the symptom might be related to cancer or the date when the patient was referred to secondary care for further investigation. The diagnostic interval could be underestimated if the GP did not refer the patient on initial presentation or did not record the symptom until after making the decision to refer. The reported low proportion of GP consultation notes that were coded electronically [52] could have also influenced the measured diagnostic interval. Any underestimate would be more apparent for patients presenting with non-alert symptoms, which could be attributed to other diseases. Nevertheless, we believe this bias is non-differential with respect to survival and would underestimate observed associations.

Comparison with previous studies
Our estimates of diagnostic intervals were shorter than previously reported using the CPRD [53]. The main difference lies in the classification of the symptoms as NICE-qualifying alert and non-alert. Some symptoms, classified as alert by Neal et al (diarrhoea for colorectal cancer; chest pain and cough for lung cancer), have been classified as non-alert in our study. We have also excluded some symptoms such as anaemia, anorexia, fatigue and weight loss for breast cancer from our list, as these were deemed too non-specific for the cancer site. We relied heavily on the referral guidelines and previous study definitions [31], and we felt that our classifications reflected the GP decision-making process for referral.
The finding that colorectal and lung cancer patients who presented with alert symptoms had better survival than those with non-alert symptoms was similar to that of a previous study using CPRD [16]. The disparities could be explained by the longer diagnostic interval for patients presenting with non-alert symptoms. Patients presenting with NICE-qualifying alert symptoms should have been referred to the rapid pathway, which would limit the time from referral to hospital appointment to two weeks [7]. However, the same two-week pathway may increase the diagnostic interval for patients with non-alert symptoms [54], as indicated by the doubling of the diagnostic intervals for non-alert symptoms in our study.
The associations between diagnostic interval and survival were masked when analyses combined all patients by cancer site, and did not stratify by symptom. In recent UK studies, no associations between longer diagnostic interval and higher mortality were found for lung and colorectal cancers, where all patients were combined in the analysis per cancer site [16,55]. For colorectal cancer, one study showed high mortality for patients with short and long diagnostic intervals in the unadjusted regression analysis but the association was attenuated once tumour biology was taken into account [55]. We, however, found that the associations between diagnostic interval and survival differ by site and symptom classification, even after tumour biology was taken into account.
In a previous study in Denmark, the highest mortality rates were observed for those with both the shortest and longest diagnostic intervals among patients who presented with symptoms suggestive of cancer [21]. Conversely, for patients presenting with vague symptoms, the lowest mortality rates were seen among those with the shortest and longest diagnostic intervals [21]. While our study showed similar associations for colorectal and lung cancer patients with alert symptoms, our results only show lower mortality with longer diagnostic intervals for non-alert symptoms [21]. Our dataset enabled us to adjust for confounding variables such as tumour differentiation, and to some extent, stage, which were not taken into account in the previous study.
Some of our findings appear to run counter to expected associations: high mortality among patients with short diagnostic intervals and low mortality for patients with long diagnostic intervals. These associations remained even after adjustments for tumour biology, and could be attributed to confounding by indication. Confounding by indication is an extraneous determinant of the outcome parameter that is present if a perceived high risk or poor prognosis becomes an indication for intervention [56]. In our study, the type of presenting symptoms (alert or non-alert) could have triggered differences in care. Patients presenting with severe manifestations of cancer could be expedited through the system [57], with the result that these patients have shorter diagnostic intervals yet show higher mortality [16,19-22,51]. More research is needed to elucidate how health care affects diagnostic intervals and survival.
The higher mortality observed for colorectal cancer patients with shorter and longer diagnostic intervals is in agreement with existing literature [12,21,22] and reinforces the rationale for rapid diagnostic pathways. Patients with poorer prognosis could have shorter diagnostic intervals because they were expedited through the pathway. Patients with longer diagnostic intervals could have suffered from delays that might have resulted in disease progression [11], which could have led to an adverse effect on survival [11,21]. Longer diagnostic intervals could have been a result of patient delays, increased burden on secondary care, or longer diagnostic work-up [13,58,59].
There is a possibility that time from diagnosis to treatment might have contributed to excess mortality, with longer waiting times for treatment worsening disease prognosis. Current evidence regarding this is inconclusive [23,60], with some studies on breast, lung and colorectal cancers showing no association or some evidence of the waiting time paradox [23,60]. Based on the literature, we believe that any residual confounding would have been minimal and would not alter the results of our study.

Implications for practice
Despite our findings, clinicians should be mindful that whatever the association between presenting symptoms and mortality, perceived delayed diagnoses can have negative effects on both the psychological health of the patient and the patient-doctor relationship. GPs should continue to refer patients with alert symptoms via the cancer pathways, and at least actively follow up patients with non-alert symptoms in the community. Nevertheless, our study provides some reassurance for patients and clinicians alike that a reasonable diagnostic interval should not worsen prognosis or decrease survival.

Conclusions
The disparate effects of diagnostic intervals on the excess mortality of patients with alert and non-alert symptoms highlight the importance of the nature of the symptoms and type of cancer. The findings in this and other studies suggest that alert symptoms may prompt earlier diagnosis, not only by drawing attention to the underlying cancer, but also by prompting immediate action from clinicians. However, the UK's two-week pathway may increase the diagnostic interval for patients with non-alert symptoms.
Supporting Information S1