
Identifying Parkinson's disease and parkinsonism cases using routinely collected healthcare data: A systematic review

  • Zoe Harding,

    Roles Data curation, Formal analysis, Investigation, Methodology, Project administration, Visualization, Writing – original draft

    ‡ ZH and TW are joint first authors

    Affiliation College of Medicine & Veterinary Medicine, University of Edinburgh, Edinburgh, United Kingdom

  • Tim Wilkinson,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Project administration, Writing – original draft, Writing – review & editing

    ‡ ZH and TW are joint first authors

    Affiliations Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom, Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, United Kingdom

  • Anna Stevenson,

    Roles Conceptualization, Data curation, Project administration, Writing – review & editing

    Affiliations Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom, Centre for Cognitive Ageing and Cognitive Epidemiology, Edinburgh, United Kingdom

  • Sophie Horrocks,

    Roles Conceptualization, Data curation, Project administration

    Affiliation College of Medicine & Veterinary Medicine, University of Edinburgh, Edinburgh, United Kingdom

  • Amanda Ly,

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliation Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, United Kingdom

  • Christian Schnier,

    Roles Methodology, Supervision, Writing – review & editing

    Affiliation Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, United Kingdom

  • David P. Breen,

    Roles Methodology, Writing – review & editing

    Affiliations Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom, Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, United Kingdom, Anne Rowling Regenerative Neurology Clinic, University of Edinburgh, Edinburgh, Scotland

  • Kristiina Rannikmäe,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliations Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom, Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, United Kingdom

  • Cathie L. M. Sudlow

    Roles Conceptualization, Funding acquisition, Methodology, Supervision, Writing – review & editing

    Affiliations Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom, Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, United Kingdom



Abstract

Background

Population-based, prospective studies can provide important insights into Parkinson’s disease (PD) and other parkinsonian disorders. Participant follow-up in such studies is often achieved through linkage to routinely collected healthcare datasets. We systematically reviewed the published literature on the accuracy of these datasets for this purpose.

Methods

We searched four electronic databases for published studies that compared PD and parkinsonism cases identified using routinely collected data to a reference standard. We extracted study characteristics and two accuracy measures: positive predictive value (PPV) and/or sensitivity.

Results

We identified 18 articles, resulting in 27 measures of PPV and 14 of sensitivity. For PD, PPV ranged from 56–90% in hospital datasets, 53–87% in prescription datasets, 81–90% in primary care datasets and was 67% in mortality datasets. Combining diagnostic and medication codes increased PPV. For parkinsonism, PPV ranged from 36–88% in hospital datasets, 40–74% in prescription datasets, and was 94% in mortality datasets. Sensitivity ranged from 15–73% in single datasets for PD and 43–63% in single datasets for parkinsonism.

Conclusions

In many settings, routinely collected datasets generate good PPVs and reasonable sensitivities for identifying PD and parkinsonism cases. However, given the wide range of identified accuracy estimates, we recommend cohorts conduct their own context-specific validation studies if existing evidence is lacking. Further research is warranted to investigate primary care and medication datasets, and to develop algorithms that balance a high PPV with acceptable sensitivity.


Introduction

Despite well-established pathological features, the aetiologies of Parkinson’s disease (PD) and other parkinsonian conditions remain poorly understood and disease-modifying treatments have proved elusive[1]. Large, prospective, population-based cohort studies with biosample collections (e.g., UK Biobank, German National Cohort, US Precision Medicine Initiative) provide a robust methodological framework with statistical power to investigate the complex interplay between genetic, environmental and lifestyle factors in the aetiology and natural history of neurological disorders such as PD and other parkinsonian disorders[2–4].

Linkage to routinely collected healthcare data (administrative datasets collected primarily for healthcare purposes rather than to address specific research questions[5]) provides an efficient means of long-term follow-up to identify large numbers of incident cases in such studies[2]. Furthermore, participant linkage to such datasets can be used in randomised controlled trials as a cost-effective and comprehensive method of follow-up for disease outcomes[6]. These data are coded using systems such as the International Classification of Diseases (ICD)[7], the Systematized Nomenclature of Medicine–Clinical Terms (SNOMED-CT) system[8], and the UK primary care Read system[9].

There are several mechanisms by which inaccuracies can arise when using routinely collected healthcare data to identify PD outcomes. Firstly, false positives (participants who receive a disease code but do not have the disorder) may arise if a clinician incorrectly diagnoses the condition. Given that PD and other parkinsonian disorders are largely clinical diagnoses made without a definitive diagnostic test, there is the potential for diagnostic inaccuracies. Clinicopathological studies have shown discrepancies between clinical diagnoses in life and neuropathological confirmation[10], and there is evidence that accuracy increases when diagnoses are made by movement disorder specialists[11–13]. Secondly, diagnoses may be incorrectly recorded in medical records, or errors may arise during the coding process. Similarly, false negatives (patients who have the condition but do not receive a code) may arise due to under-diagnosis, omission of the diagnosis from the medical records (e.g., because the condition is not the primary reason for hospital admission), or errors during the coding process.

As a result, before such datasets can be used to identify PD and parkinsonism cases in prospective studies, their accuracy must be determined. Important measures are the positive predictive value (PPV, the proportion of those coded positive that are true disease cases) and sensitivity (the proportion of true disease cases that are coded positive). Specificity and negative predictive value (NPV) are less relevant metrics in this setting. A high specificity (the proportion of those without the disease that do not receive a disease code) is important to ensure a high PPV, thereby minimising bias in effect estimates. With an appropriately precise choice of codes, the specificity of routinely collected healthcare data to identify disease cases in population-based studies is usually very high (98–100%)[14,15]. However, in a population-based cohort study where the overall prevalence of a disease is low, a high specificity does not guarantee a high PPV: a large absolute number of people without the disease can be incorrectly classified as disease cases (false positives), even though they make up only a small proportion of all people without the disease (high specificity, low PPV)[16]. NPV, like PPV, is related to disease prevalence and will therefore be high in population-based studies where most individuals do not develop the disease of interest[14].
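
The relationship between these measures can be made concrete with a short calculation. The following is a minimal Python sketch, not part of the review's methods. The function names and all counts (a cohort of 10,000 people with 1% prevalence, 70% sensitivity and 99% specificity) are hypothetical values chosen only for illustration.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Return PPV, sensitivity, specificity and NPV from a 2x2 table."""
    return {
        "ppv": tp / (tp + fp),          # coded positive who are true cases
        "sensitivity": tp / (tp + fn),  # true cases who are coded positive
        "specificity": tn / (tn + fp),  # non-cases who are not coded
        "npv": tn / (tn + fn),          # not coded who are truly non-cases
    }


def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """PPV via Bayes' theorem, showing its dependence on prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical cohort: 10,000 people, 100 with disease (1% prevalence).
# 70% sensitivity -> 70 true positives, 30 false negatives.
# 99% specificity -> 9,801 true negatives, 99 false positives.
table = accuracy_measures(tp=70, fp=99, fn=30, tn=9801)

for name, value in table.items():
    print(f"{name}: {value:.3f}")
print(f"PPV from prevalence formula: {ppv_from_prevalence(0.70, 0.99, 0.01):.3f}")
```

In this made-up example, specificity is 99% yet PPV is only about 41%, because the 99 false positives drawn from the 9,900 people without the disease outnumber the 70 true positives.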

Previous systematic reviews on the accuracy of routine data to identify other neurological diseases such as stroke[14], dementia[17] and motor neurone disease[18] have summarised the existing literature and identified methods by which accuracy can be improved, as well as areas for further evaluation. Here, we systematically reviewed published studies that evaluated the accuracy of routinely collected healthcare data for identifying PD and parkinsonism cases.


Methods

Study reporting

We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for the reporting of this systematic review[19].

Study protocol

We used the PRISMA Protocols (PRISMA-P) guideline to aid in the design of this study[20], and prospectively published the protocol (number: CRD42016033715)[21].

Search strategy

We (AS & TW) searched the electronic databases MEDLINE (Ovid), EMBASE (Ovid), CENTRAL (Cochrane Library) and Web of Science (Thomson Reuters) for relevant articles published in any language between 01.01.1990 and 23.06.2017. Our search strategy is outlined in S1 File. We chose the date limits based on our judgement that accuracy estimates from studies published prior to 1990 would have limited current applicability. We did not exclude studies based on the dates covered by the datasets. We also screened bibliographies of included studies and relevant review papers to identify additional publications.

Eligibility criteria

To be included, studies had to have compared codes for PD or parkinsonism from routinely collected healthcare data to a clinical expert-derived reference standard, and to have provided a PPV and/or a sensitivity estimate (or sufficient raw data to calculate these). We excluded studies with <10 coded cases, due to the limited precision of studies below this size[17,18]. Studies reporting sensitivity values had to be population-based (i.e. community-based as opposed to hospital-based) with comprehensive attempts to detect all disease cases. Where multiple studies investigated overlapping populations, we included the study with the larger population size. Where articles assessed more than one dataset or evaluated both PPV and sensitivity, we included these as separate studies. Hereafter, we will refer to published papers as ‘articles’ and these separate analyses as ‘studies’.

Study selection

Two authors (AS and SH) independently screened all titles and abstracts generated by the search, and reviewed full text articles of all potentially eligible studies to determine if the inclusion criteria were met. In the case of disagreement or uncertainty, we reached a consensus through discussion and, where necessary, involvement of a senior third author (CLMS).

Data extraction

Using a standardized form, two authors (TW and ZH) independently extracted the following data from each study: first author; year of publication; time period during which coded data were collected; country of study; study population; average age of disease cases (or, if this was unavailable, the ages of participants at recruitment); study size (defined as the total number of code-positive cases for PPV [true positives plus false positives] and the total number of reference standard positive cases for sensitivity [true positives plus false negatives]); type of routine data used (e.g., hospital admissions, mortality or primary care); coding system and version used; specific codes used to identify cases; diagnostic coding position (e.g. primary or secondary position); parkinsonian subtypes investigated; and the method used to make the reference standard diagnosis.

We recorded the reported PPV and/or sensitivity estimates, as well as any corresponding raw data. After discussion, any remaining queries were resolved with a senior third author (CLMS). When necessary, we contacted study authors to request additional information.

Quality assessment

We adapted the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2)[22] tool to evaluate the risk of bias in the estimates of accuracy and any concerns about the applicability of each article to our specific research question (S2 File). Two authors (TW and ZH) independently assigned quality ratings, with any discrepancies resolved through discussion. We performed this evaluation in the context of our specific review question and not as an indication of the overall quality of the articles. We assessed risk of bias at the article level rather than study level, as the methods for each study within an article were very similar. We did not exclude studies based on their quality assessment ratings, but rather considered a given study’s results in the context of the article’s risk of bias and applicability concerns. Where articles deemed to be at low risk of bias and articles at high risk of bias reported PPV or sensitivity estimates on the same type of dataset, we compared the reported estimates to assess the potential effect of bias on accuracy estimates.

Statistical analysis/data synthesis

We tabulated the extracted data, and calculated 95% confidence intervals for the accuracy measures from the raw data using the Clopper-Pearson (exact) method. Due to substantial heterogeneity in study settings and methodologies, we did not perform a meta-analysis, as we considered any summary estimate to be potentially misleading. Instead, we assessed the full range of results in the context of study methodologies, populations and specific data sources. We also reported any within-study comparisons in which a single variable was changed to examine its effect on PPV or sensitivity. We performed analyses using the statistical software StatsDirect3.
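
For readers unfamiliar with the Clopper-Pearson (exact) interval, the sketch below shows one way it could be computed using only the Python standard library. It is an illustration under stated assumptions and not the StatsDirect implementation used in the review: the function names, the bisection approach and the example counts (70 true positives among 100 code-positive cases) are all hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed in log space for stability."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _root(f):
    """Bisection on (0, 1) for a function f that increases through zero."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k successes in n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


# Hypothetical PPV estimate: 70 confirmed cases among 100 code-positive cases.
low, high = clopper_pearson(70, 100)
print(f"PPV 70.0% (95% CI {100 * low:.1f}% to {100 * high:.1f}%)")
```

Because the interval is derived from the binomial distribution rather than a normal approximation, it remains valid for the small studies and extreme proportions that are common in validation work.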


Results

Study characteristics

From an initial 1319 identified articles, we removed 222 duplicates and excluded 994 considered to be irrelevant after screening the titles and abstracts. We therefore examined the full text articles for 103 papers. Of these, we excluded 37 that did not assess the accuracy of a routinely collected, coded dataset, 21 that did not validate the coded data against any reference standard, 12 that were not primary research studies, 11 that combined routine and non-routine data, three where no accuracy measure was reported or calculable, and four that did not assess coding in PD. Eighteen published articles fulfilled our inclusion criteria[23–40]. A flow diagram of the study selection process is shown in Fig 1. We obtained key additional information from the authors of two studies[32,36]. Of the 18 included articles, 13 reported PPV[23,25–36], four reported sensitivity[37–40] and one reported both[24]. Four articles contained more than one study[23–25,29]. One of these consisted of multiple sub-studies, using different methods to evaluate datasets across several countries, so we included these as six separate studies[25]. In total, there were 27 measures of PPV and 14 of sensitivity. Study characteristics are summarised in Tables 1 and 2 respectively.

Table 1. Characteristics of studies reporting positive predictive value, stratified by dataset type.

Table 2. Characteristics of studies reporting sensitivity, stratified by dataset type.

Study size varied considerably, ranging from 39 to 4957. All 18 articles were based in high-income countries. Three were from the UK[32,33,40], six from mainland Europe[24,25,30,37–39], eight from the USA[26–29,31,34–36], and one from Canada[23]. There were 12 PPV estimates and two sensitivity estimates from hospital data[23–31], two PPV and 10 sensitivity estimates from mortality data[24,37–40], two PPV estimates from primary care data[32], four PPV estimates from prescription data[23,29,33] and seven PPV estimates and two sensitivity estimates from combining datasets from different sources[24,25,34–36]. There were no sensitivity estimates from primary care or prescription data.

PD was evaluated in 13 articles, with eight estimating PPV[25,26,28–30,32,33,36], four estimating sensitivity[37–40] and one estimating both[24]. Parkinsonism was evaluated by seven articles, of which six estimated PPV[23,27,31,33–35] and one assessed both PPV and sensitivity[24]. All of the parkinsonism articles combined PD with other causes of parkinsonism.

The reference standard methods used could be broadly divided into two categories: patient history and examination (5/5 articles reporting sensitivity) and medical record review (14/14 articles reporting PPV). Three articles used in-person examination and medical record review in combination[24,33,39]. In addition, where entire populations were under study, some studies incorporated a screening method (e.g., telephone interview) to identify potential cases[24,37].

Where reported, codes used to identify PD cases were consistent and appropriate to the ICD version used. However, the range of codes used to identify other parkinsonian conditions varied considerably, reflecting the broad range of pathologies that can lead to parkinsonism. Seven studies did not specify the exact codes used[29,32,33,37–40]. ICD versions used reflected the time period over which the studies were conducted. Nineteen studies used ICD-9 (or ICD-9-CM, a clinically modified version used in the USA, and identical to ICD-9 with respect to parkinsonian diagnoses)[23–29,31,35–39], 11 used ICD-10[23–25,30,37], three used ICD-8[24,30], and two used ICD-7[24]. One of the primary care studies used Read-coded data[32]. Four studies, including the three that evaluated prescription data, did not specify the coding system used[23,29,33,40].

The diagnostic coding position assessed also varied. Three studies assessed primary diagnoses alone[30,36,37], eight used any diagnostic position[24,31,38–40], while 13 did not specify the coding position[23,25–29,34,35]. Diagnostic position was not applicable in the studies of primary care and prescription data due to the nature of these datasets[23,29,32,33].

Quality assessment

Only two articles were judged to be of low risk of bias or applicability concerns in the QUADAS-2 assessment[23,24] (S1 Table). Across the risk of bias domains, the most common area of concern was inappropriate or unclear code lists to identify disease cases (10/18), followed by: selection bias (8/18), patient flow (i.e. inappropriate inclusions and exclusions or patients being lost to follow-up) (5/18) and insufficiently rigorous or unclear reference standards (4/18).

Positive predictive value

For PD, there were 17 PPV estimates in total (Fig 2)[24–26,28–30,32,33,36]. These comprised seven PPV estimates of hospital data alone[24–26,28–30], one of mortality data alone[24], two for prescription data alone[29,33], one of primary care data alone[32], one of prescription data and primary care data in combination[32], and five of datasets used in combination[25,36]. PPVs ranged from 36–90% across all studies. Nine of the 17 estimates were >75%. The single study of Read coding in primary care data alone reported a PPV of 81%, increasing to 90% with the presence of a relevant medication code in addition to a diagnostic code[32]. The two studies of medication data alone reported PPVs of 53% and 87%[29,33]. The single, small study of mortality data had a PPV of 67%[24].

Fig 2. Positive predictive values (PPVs) of coded diagnoses.

Study size: total number of code-positive cases (true positives + false positives). *Exact sample size unknown, most conservative estimate used. Box sizes reflect Mantel-Haenszel weight of study (inverse variance, fixed effects).

One of the two articles judged to be at low risk of bias investigated the PPV of hospital admissions data to identify PD, reporting a PPV of 70.8%[24]. This value fell within the range reported by other studies (55.5–90.3%), raising the possibility that estimates from studies at the extremes of the range may be influenced by bias.

Several within-study comparisons were available from three studies identifying PD (Table 3)[24,28,29]. Two of these investigated the change in PPV for hospital data to identify PD when algorithms containing additional criteria were used[24,28]. Both showed a moderate increase in PPV if a relevant diagnosis code was recorded more than once, or if a specialist department assigned such a code. One study reported an increase in PPV when only primary position diagnoses were assessed[24]. Another showed that incorporating selected medication codes with diagnosis codes increased the PPV from 76% to 86%, although this was at the expense of reduced case ascertainment[28]. Finally, one study showed that the combination of a diagnostic code in hospital data with a relevant medication code increased the PPV when compared to using either dataset alone (94% versus 87% and 89% respectively)[29].

For parkinsonism, there were 10 PPV estimates in total (Fig 2)[23,24,27,31,33–35]. These comprised five estimates from hospital data alone[23,24,27,31], two from prescription data alone[23,33], one from mortality data alone[24], and two from using datasets in combination[34,35]. PPVs ranged from 40–94% in the single datasets and from 22–28% in the combination datasets. The two studies of parkinsonism in prescription data produced very different PPV estimates of 40% and 74%[23,33]. One of these studies reported that the PPV of medication data to identify any parkinsonian disorder was considerably higher than that for PD (74% and 53% respectively)[33].

The two articles with low risk of bias investigated the use of hospital admissions data to identify parkinsonism cases. These articles reported PPVs of 76%[23] and 88%[24], which is consistent with the values reported by other studies judged to be at risk of bias.


Sensitivity

For PD, there were 11 sensitivity estimates in total (Fig 3)[24,37–40]. Of these, nine were sensitivity estimates for mortality data alone, consistently showing that codes in the primary position only gave low sensitivities of 11–23%, rising to 53–60% when codes from any position were included[24,37–40]. A single study reported the sensitivity of hospital data to be 73%, increasing to 83% when hospital and mortality data were combined. There were no sensitivity estimates for primary care or prescription data.

Fig 3. Sensitivity estimates of coded diagnoses.

Study size: total number of true positives according to reference standard (true positives + false negatives). *Unknown sample size and confidence intervals. Box sizes reflect Mantel-Haenszel weight of study (inverse variance, fixed effects).

Of the two studies with low risk of bias, one investigated the sensitivity of mortality data, reporting a value of 20%. This was similar to the values reported by other studies deemed at risk of bias, suggesting that the potential bias identified did not significantly affect these estimates.

For parkinsonism, there were three sensitivity estimates, all from one study[24]. Hospital admissions and mortality data combined gave higher sensitivity (71%) compared with either mortality or hospital data alone (43% and 63% respectively).
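
The gain in sensitivity from combining data sources can be pictured as a set union of the cases flagged in each dataset. The sketch below is illustrative only: the participant identifiers, dataset contents and function names are hypothetical and do not correspond to any included study.

```python
def sensitivity(true_cases, coded_cases):
    """Proportion of reference standard cases that received a code."""
    return len(true_cases & coded_cases) / len(true_cases)


def ppv(true_cases, coded_cases):
    """Proportion of code-positive participants who are true cases."""
    return len(true_cases & coded_cases) / len(coded_cases)


# Hypothetical participant IDs; 1-10 are true cases per the reference standard.
true_cases = set(range(1, 11))
hospital = {1, 2, 3, 4, 5, 6, 20}   # 6 true cases, 1 false positive
mortality = {5, 6, 7, 8, 21}        # 4 true cases, 1 false positive
combined = hospital | mortality     # flagged in either dataset

for name, coded in [("hospital", hospital),
                    ("mortality", mortality),
                    ("combined", combined)]:
    print(f"{name}: sensitivity={sensitivity(true_cases, coded):.2f}, "
          f"PPV={ppv(true_cases, coded):.2f}")
```

A union can only maintain or raise sensitivity, but it also accumulates the false positives of every source, so PPV may fall. In this made-up example the combined PPV is lower than that of hospital data alone.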


Discussion

We have demonstrated that existing validation studies show a wide variation in the accuracy of routinely collected healthcare data for the identification of PD and parkinsonism cases. Despite this, in some circumstances, achieving high PPVs is possible. Sensitivity (range 15–73% for PD) is generally lower than PPV (range 36–90%) in single datasets, but is increased by combining data sources.

When using routinely collected datasets to identify disease cases, there will inevitably be a trade-off between PPV and sensitivity[16]. The extent to which cohorts seek to maximise one accuracy metric over another will depend on the specific study setting and research question. For example, studies that rely only on routinely collected data to identify disease cases are likely to require a high PPV, provided that sensitivity is sufficient to ensure statistical power in analyses. In contrast, for studies that use routinely collected data to identify potential cases before going on to validate these cases with a more detailed in-person or medical record review, a high sensitivity will be important. In this review, we found that the sensitivity of mortality data to detect PD using codes in the primary position alone was very low (range 11–23%); however, this markedly improved (range 56–60%) when codes were selected from any position on the death certificate[24,37–40]. No studies in this review investigated the effect of coding position on PPV, but previous studies of dementia and motor neurone disease have shown that selecting cases for whom the disease code was in the primary position consistently led to increased PPVs compared to selecting disease codes from any position[41–44]. However, as with PD, this approach led to the identification of fewer cases, thereby reducing sensitivity[17,18].

The pharmacological treatment of PD is largely focussed on improving motor function and patients are treated with a limited number of drugs. This has allowed antiparkinsonian drugs to be used as ‘tracers’ in epidemiological studies[45,46]. There are potential problems with using prescription data as a proxy for PD diagnosis. This approach may disproportionately under-identify patients with early stage disease who do not yet require treatment. Also, a response to a trial of dopaminergic drugs may be used as part of the diagnostic assessment in potential PD cases, meaning some patients prescribed antiparkinsonian medications will not be subsequently diagnosed with PD. Furthermore, antiparkinsonian drugs can be prescribed for indications other than PD (such as dopamine agonists for restless legs syndrome, endocrine disorders and other forms of parkinsonism). The specific drugs licensed for use in parkinsonian conditions vary between countries and may change over time. Therefore, an algorithm incorporating prescription data would need to be continually revised to match prescribing patterns. Results from our review suggest that prescription data alone have a low PPV for PD case ascertainment[33]; however, when drug codes are combined with diagnostic codes, PPV increases but with reduced case ascertainment[28,32]. Furthermore, prescription datasets appear to have a higher PPV when identifying any parkinsonian disorder rather than specifically PD[33].

This study has several strengths and limitations. Our review benefits from prospective protocol publication, comprehensive search criteria, and independent duplication of each stage by two authors. Despite this, relevant studies may still have been missed, especially if a validation study was a subsection of a paper with a wider aim. As all eligible studies were included, the results may have been influenced by studies of lower quality. Only two articles were found to be at low risk of bias or applicability concerns[23,24], and it is likely that biases in study design would have affected the results. For example, one study with the lowest PPV[35] used very broad ICD-9 codes such as 781.0 (abnormal involuntary movements) and 781.3 (lack of coordination).

Since there is no method of diagnosing PD with certainty in life, there is likely to be some misclassification of the reference standards used in the studies. The application of stringent diagnostic criteria to reference standard diagnoses, although often necessary for research purposes, may lead to some patients being misclassified as ‘false positives’ when they do in fact have the condition. This may lead to underestimation of the PPV in some of the studies. When considering the ideal reference standard for validation studies, there is a trade-off between the robustness of the reference standard and validating sufficient cases to produce precise accuracy estimates. For example, in-person neurological examination may have greater diagnostic certainty than medical record review but this becomes difficult as the cohort size increases. Some of the variation in the reported results, therefore, is likely to be due to differences in how stringently different studies applied their reference standards.

Many of the studies reported cases with insufficient information to meet the reference standard and the handling of these varied. Some studies excluded such cases, others classified them as false positives, while some did not specify how they handled such missing data. Excluding such cases may introduce selection bias, whereas counting them as false positives may underestimate PPV.

The effect of possible publication bias on the results is difficult to estimate, but disproportionate publication of studies which report more favourable accuracy measures may lead to over-estimation of the performance of the codes. In addition, estimates of PPV are dependent upon the prevalence of the condition in the study population but it was not possible to assess the prevalence of PD within each study population.

Our review highlights several areas requiring further research. Given that the management of PD is largely delivered in outpatients or the community, primary care data may be an effective method of identifying cases. Whilst studies have suggested that PD diagnoses made in primary care are less accurate than those made in a specialist setting[47,48], primary care records combine notes made by primary care clinicians with prescription records and correspondence from secondary care. Codes from primary care should therefore include diagnoses made by specialists, thus increasing their accuracy. We found only one small study of primary care data, reporting a promising PPV of 81%, improving to 90% with the inclusion of medication codes[32]. No studies investigated the sensitivity of primary care data. Further research into the accuracy of primary care data is needed.

Two studies investigated using algorithmic combinations of codes from different sources to improve PPV[24,28]. These investigated the additional benefit of factors such as only including codes that appeared more than once, selecting codes in the primary position only, combining diagnostic codes with prescription data, and only including diagnoses made in specialist clinics. These methods increased PPV but at a cost to the number of cases identified. The development of algorithms that maximize PPV whilst maintaining a reasonable sensitivity (e.g., by combining multiple complementary datasets) merits further evaluation.
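
The kind of rule-based algorithm described above can be sketched as a set of predicates applied to each participant's linked records. The following Python example is a hypothetical illustration and does not reproduce the algorithms of the cited studies: the record structure, rule names and all ten participant records are invented for demonstration.

```python
from collections import namedtuple

# Hypothetical linked record: number of PD diagnosis codes, whether an
# antiparkinsonian drug was prescribed, and the reference standard status.
Record = namedtuple("Record", "pid n_diagnosis_codes has_medication true_case")

records = [
    Record(1, 3, True, True),
    Record(2, 2, True, True),
    Record(3, 1, True, True),
    Record(4, 1, False, True),
    Record(5, 0, True, True),    # true case with no diagnosis code
    Record(6, 1, False, False),
    Record(7, 2, False, False),
    Record(8, 1, True, False),
    Record(9, 0, False, False),
    Record(10, 0, True, False),
]

# Candidate case definitions, from least to most restrictive (illustrative).
rules = {
    "any diagnosis code": lambda r: r.n_diagnosis_codes >= 1,
    "diagnosis code plus medication": lambda r: (r.n_diagnosis_codes >= 1
                                                 and r.has_medication),
    "two or more diagnosis codes": lambda r: r.n_diagnosis_codes >= 2,
}


def evaluate(rule, data):
    """Return (PPV, sensitivity) of a case definition against the reference."""
    flagged = [r for r in data if rule(r)]
    true_pos = sum(r.true_case for r in flagged)
    total_true = sum(r.true_case for r in data)
    return true_pos / len(flagged), true_pos / total_true


results = {name: evaluate(rule, records) for name, rule in rules.items()}
for name, (ppv_value, sens_value) in results.items():
    print(f"{name}: PPV={ppv_value:.2f}, sensitivity={sens_value:.2f}")
```

In this invented dataset, each stricter rule raises PPV while lowering sensitivity, which mirrors the trade-off that an algorithm developer would need to balance for a given research question.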

To our knowledge, no studies have evaluated the accuracy of routinely collected healthcare data for solely identifying atypical parkinsonian syndromes such as progressive supranuclear palsy (PSP) and multiple system atrophy (MSA). Further work is needed to understand whether these datasets provide a valuable resource for studying these less common diseases.

In conclusion, our review summarises existing knowledge of the accuracy of routinely collected healthcare data for identifying PD and parkinsonism, and highlights approaches to increase accuracy and areas where further research is required. Given the wide range of observed results, prospective cohorts should perform their own validation studies where evidence is lacking for their specific setting.


The UK Biobank Neurodegenerative Outcomes Working Group provided feedback on the manuscript. This work was conducted on behalf of Dementias Platform UK (


  1. Lang AE, Espay AJ. Disease Modification in Parkinson’s Disease: Current Approaches, Challenges, and Future Considerations. Movement Disorders. 2018;33: 660–677. pmid:29644751
  2. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12: e1001779. pmid:25826379
  3. German National Cohort (GNC) Consortium. The German National Cohort: aims, study design and organization. Eur J Epidemiol. 2014;29: 371–382. pmid:24840228
  4. Jaffe S. Planning for US Precision Medicine Initiative underway. The Lancet. 2015;385: 2448–2449.
  5. Benchimol EI, Smeeth L, Guttmann A, Harron K, Moher D, Petersen I, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) Statement. PLOS Med. 2015;12: e1001885. pmid:26440803
  6. Mc Cord KA, Al-Shahi Salman R, Treweek S, Gardner H, Strech D, Whiteley W, et al. Routinely collected data for randomized trials: promises, barriers, and implications. Trials. 2018;19: 29. pmid:29325575
  7. World Health Organization. The ICD-10 classification of mental and behavioural disorders: clinical descriptions and diagnostic guidelines. Geneva: World Health Organization; 1992.
  8. SNOMED International. SNOMED CT [Internet]. [cited 18 May 2017]. Available:
  9. NHS Digital. Read Codes [Internet]. [cited 17 May 2017]. Available:
  10. Adler CH, Beach TG, Hentz JG, Shill HA, Caviness JN, Driver-Dunckley E, et al. Low clinical diagnostic accuracy of early vs advanced Parkinson disease: clinicopathologic study. Neurology. 2014;83: 406–412. pmid:24975862
  11. Hughes AJ, Daniel SE, Kilford L, Lees AJ. Accuracy of clinical diagnosis of idiopathic Parkinson’s disease: a clinico-pathological study of 100 cases. J Neurol Neurosurg Psychiatr. 1992;55: 181–184.
  12. Hughes AJ, Daniel SE, Lees AJ. Improved accuracy of clinical diagnosis of Lewy body Parkinson’s disease. Neurology. 2001;57: 1497–1499. pmid:11673599
  13. Hughes AJ, Daniel SE, Ben-Shlomo Y, Lees AJ. The accuracy of diagnosis of parkinsonian syndromes in a specialist movement disorder service. Brain. 2002;125: 861–870. pmid:11912118
  14. Woodfield R, Grant I, Sudlow CLM. Accuracy of Electronic Health Record Data for Identifying Stroke Cases in Large-Scale Epidemiological Studies: A Systematic Review from the UK Biobank Stroke Outcomes Group. PLoS One. 2015;10. pmid:26496350
  15. Gao L, Calloway R, Zhao E, Brayne C, Matthews FE. Accuracy of death certification of dementia in population-based samples of older people: analysis over time. Age Ageing. 2018;47: 589–594. pmid:29718074
  16. Chubak J, Pocobelli G, Weiss NS. Tradeoffs between accuracy measures for electronic health care data algorithms. J Clin Epidemiol. 2012;65: 343–349.e2. pmid:22197520
  17. Wilkinson T, Ly A, Schnier C, Rannikmäe K, Bush K, Brayne C, et al. Identifying dementia cases with routinely collected health data: A systematic review. Alzheimers Dement. 2018;14: 1038–1051. pmid:29621480
  18. Horrocks S, Wilkinson T, Schnier C, Ly A, Woodfield R, Rannikmäe K, et al. Accuracy of routinely-collected healthcare data for identifying motor neurone disease cases: A systematic review. PLoS ONE. 2017;12: e0172639. pmid:28245254
  19. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ. 2009;339: b2535. pmid:19622551
  20. Moher D, Shamseer L, Clarke M, Ghersi D, Liberati A, Petticrew M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Rev. 2015;4: 1. pmid:25554246
  21. Stevenson A, Wilkinson T, Sudlow CLM, Ly A. The accuracy of electronic health datasets in identifying Parkinson’s disease cases: a systematic review [Internet]. 28 Jan 2016 [cited 17 May 2017]. Available:
  22. Whiting PF, Rutjes AWS, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155: 529–536. pmid:22007046
  23. Butt DA, Tu K, Young J, Green D, Wang M, Ivers N, et al. A validation study of administrative data algorithms to identify patients with Parkinsonism with prevalence and incidence trends. Neuroepidemiology. 2014;43: 28–37. pmid:25323155
  24. Feldman AL, Johansson ALV, Gatz M, Flensburg M, Petzinger GM, Widner H, et al. Accuracy and sensitivity of Parkinsonian disorder diagnoses in two Swedish national health registers. Neuroepidemiology. 2012;38: 186–193. pmid:22472568
  25. Gallo V, Brayne C, Forsgren L, Barker RA, Petersson J, Hansson O, et al. Parkinson’s Disease Case Ascertainment in the EPIC Cohort: The NeuroEPIC4PD Study. Neurodegener Dis. 2015;15: 331–338. pmid:26375921
  26. Kestenbaum M, Ford B, Louis ED. Estimating the Proportion of Essential Tremor and Parkinson’s Disease Patients Undergoing Deep Brain Stimulation Surgery: Five-Year Data From Columbia University Medical Center (2009–2014). Mov Disord Clin Pract. 2015;2: 384–387. pmid:28845438
  27. Swarztrauber K, Anau J, Peters D. Identifying and distinguishing cases of parkinsonism and Parkinson’s disease using ICD-9 CM codes and pharmacy data. Mov Disord. 2005;20: 964–970. pmid:15834854
  28. Szumski NR, Cheng EM. Optimizing algorithms to identify Parkinson’s disease cases within an administrative database. Mov Disord. 2009;24: 51–56. pmid:18816696
  29. Wei W-Q, Teixeira PL, Mo H, Cronin RM, Warner JL, Denny JC. Combining billing codes, clinical notes, and medications from electronic health records provides superior phenotyping performance. J Am Med Inform Assoc. 2016;23: e20–27. pmid:26338219
  30. Wermuth L, Cui X, Greene N, Schernhammer E, Ritz B. Medical Record Review to Differentiate between Idiopathic Parkinson’s Disease and Parkinsonism: A Danish Record Linkage Study with 10 Years of Follow-Up. Parkinsons Dis. 2015;2015: 781479. pmid:26770868
  31. White D, Moore S, Waring S, Cook K, Lai E. Identifying incident cases of parkinsonism among veterans using a tertiary medical center. Mov Disord. 2007;22: 915–923. pmid:17415798
  32. Hernán MA, Logroscino G, Rodríguez LAG. A prospective study of alcoholism and the risk of Parkinson’s disease. J Neurol. 2004;251 Suppl 7: vII14–17. pmid:15505749
  33. Meara J, Bhowmick BK, Hobson P. Accuracy of diagnosis in patients with presumed Parkinson’s disease. Age Ageing. 1999;28: 99–102. pmid:10350403
  34. Bower JH, Maraganore DM, McDonnell SK, Rocca WA. Incidence and distribution of parkinsonism in Olmsted County, Minnesota, 1976–1990. Neurology. 1999;52: 1214–1220. pmid:10214746
  35. Savica R, Grossardt BR, Bower JH, Ahlskog JE, Rocca WA. Incidence and pathology of synucleinopathies and tauopathies related to parkinsonism. JAMA Neurol. 2013;70: 859–866. pmid:23689920
  36. Thacker T, Wegele AR, Pirio Richardson S. Utility of electronic medical record for recruitment in clinical research: from rare to common disease. Mov Disord Clin Pract. 2016;3: 507–509. pmid:27713907
  37. Benito-León J, Louis ED, Villarejo-Galende A, Romero JP, Bermejo-Pareja F. Under-reporting of Parkinson’s disease on death certificates: a population-based study (NEDICES). J Neurol Sci. 2014;347: 188–192. pmid:25292414
  38. Beyer MK, Herlofson K, Arsland D, Larsen JP. Causes of death in a community-based study of Parkinson’s disease. Acta Neurol Scand. 2001;103: 7–11. pmid:11153892
  39. Fall P-A, Saleh A, Fredrickson M, Olsson J-E, Granérus A-K. Survival time, mortality, and cause of death in elderly patients with Parkinson’s disease: a 9-year follow-up. Mov Disord. 2003;18: 1312–1316. pmid:14639673
  40. Williams-Gray CH, Mason SL, Evans JR, Foltynie T, Brayne C, Robbins TW, et al. The CamPaIGN study of Parkinson’s disease: 10-year outlook in an incident population-based cohort. J Neurol Neurosurg Psychiatr. 2013;84: 1258–1264. pmid:23781007
  41. Fisher ES, Whaley FS, Krushat WM, Malenka DJ, Fleming C, Baron JA, et al. The accuracy of Medicare’s hospital claims data: progress has been made, but problems remain. Am J Public Health. 1992;82: 243–248. pmid:1739155
  42. Ostbye T, Hill G, Steenhuis R. Mortality in elderly Canadians with and without dementia: a 5-year follow-up. Neurology. 1999;53: 521–526. pmid:10449114
  43. Chiò A, Ciccone G, Calvo A, Vercellino M, Di Vito N, Ghiglione P, et al. Validity of hospital morbidity records for amyotrophic lateral sclerosis. A population-based study. J Clin Epidemiol. 2002;55: 723–727. pmid:12160921
  44. Stickler DE, Royer JA, Hardin JW. Validity of hospital discharge data for identifying cases of amyotrophic lateral sclerosis. Muscle Nerve. 2011;44: 814–816. pmid:22006696
  45. Brandt-Christensen M, Kvist K, Nilsson FM, Andersen PK, Kessing LV. Use of antiparkinsonian drugs in Denmark: results from a nationwide pharmacoepidemiological study. Mov Disord. 2006;21: 1221–1225. pmid:16671076
  46. Chiò A, Magnani C, Schiffer D. Prevalence of Parkinson’s disease in Northwestern Italy: comparison of tracer methodology and clinical ascertainment of cases. Mov Disord. 1998;13: 400–405. pmid:9613728
  47. Newman EJ, Breen K, Patterson J, Hadley DM, Grosset KA, Grosset DG. Accuracy of Parkinson’s disease diagnosis in 610 general practice patients in the West of Scotland. Mov Disord. 2009;24: 2379–2385. pmid:19890985
  48. Schrag A, Ben-Shlomo Y, Quinn N. How valid is the clinical diagnosis of Parkinson’s disease in the community? J Neurol Neurosurg Psychiatr. 2002;73: 529–534.