Table 1.

Strengths and weaknesses of data classes within EHRs.

Figure 1.

Comparison of natural language processing (NLP) and CPT codes to detect completed colonoscopies in 200 patients.

In this study, NLP found more completed colonoscopies than billing codes alone, and only one colonoscopy found with billing codes was not found with NLP. The NLP-identified examples were reviewed for accuracy.
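
A comparison like the one in this figure comes down to set overlap between the patients flagged by each method. The sketch below is illustrative only: the function name `compare_detection` and the patient identifiers are hypothetical and are not taken from the study.

```python
def compare_detection(nlp_ids, cpt_ids):
    """Split patients by which method detected a completed colonoscopy.

    Both arguments are iterables of patient identifiers (hypothetical here).
    """
    nlp_ids, cpt_ids = set(nlp_ids), set(cpt_ids)
    return {
        "both": nlp_ids & cpt_ids,       # found by NLP and by CPT codes
        "nlp_only": nlp_ids - cpt_ids,   # found by NLP, missed by CPT codes
        "cpt_only": cpt_ids - nlp_ids,   # found by CPT codes, missed by NLP
    }


if __name__ == "__main__":
    # Made-up identifiers, used only to show the shape of the result.
    overlap = compare_detection(["p1", "p2", "p3", "p4"], ["p2", "p3", "p5"])
    for group, ids in overlap.items():
        print(group, sorted(ids))
```

The sizes of the three sets correspond to the regions of a two-circle Venn diagram.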

Figure 2.

Use of Intelligent Character Recognition to codify handwriting.

Figure courtesy of Luke Rasmussen, Northwestern University.

Figure 3.

General figure for identifying cases and controls using EHR data.

Application of electronic selection algorithms leads to the division of a patient population into four groups, the largest of which comprises patients who were excluded because they lack sufficient evidence to be either a case or a control. Definite cases and controls cross some predefined threshold of positive predictive value (e.g., PPV≥95%) and thus do not require manual review. For very rare phenotypes or complicated case definitions, the category of “possible” cases may need to be reviewed manually to increase the sample size.
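
The four-way split described in this caption can be sketched as a simple rule-based assignment, together with the PPV check used to decide whether a group needs manual review. This is a minimal sketch under stated assumptions: the `Patient` fields, the `definite_min` cutoff, and the rules themselves are hypothetical placeholders, not the selection algorithms used in the article.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str
    case_evidence: int      # hypothetical: number of independent sources supporting the phenotype
    control_evidence: bool  # hypothetical: record is sufficient and shows no sign of the phenotype


def assign_group(p, definite_min=2):
    """Assign one of four groups; the cutoff `definite_min` is an assumption."""
    if p.case_evidence >= definite_min:
        return "definite_case"
    if p.case_evidence == 1:
        return "possible_case"   # candidate for manual review
    if p.control_evidence:
        return "control"
    return "excluded"            # insufficient evidence either way


def ppv(true_positives, false_positives):
    """Positive predictive value from a manually reviewed sample."""
    reviewed = true_positives + false_positives
    if reviewed == 0:
        raise ValueError("no reviewed records")
    return true_positives / reviewed


if __name__ == "__main__":
    cohort = [
        Patient("a", 3, False),
        Patient("b", 1, False),
        Patient("c", 0, True),
        Patient("d", 0, False),
    ]
    print(Counter(assign_group(p) for p in cohort))
    # A group is accepted without further review only if its PPV meets the threshold.
    print(ppv(true_positives=19, false_positives=1) >= 0.95)
```

In practice the rules would be tuned until a reviewed sample of the definite groups reaches the chosen PPV threshold.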

Table 2.

Methods of finding cases and controls for genetic analysis of five common diseases.

Table 3.

eMERGE network participants.

Figure 4.

Use of NLP to identify patients without heart disease for a genome-wide analysis of normal cardiac conduction.

Using simple text searching, 1564 patients would have been eliminated unnecessarily because of negated terms, a family medical history of heart disease, or low-dose medication use that would not affect measurements on the electrocardiogram. Use of NLP improves recall of these cases without sacrificing positive predictive value. The final case cohort represented the patients used for GWAS in [71].
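
The difference between simple text searching and context-aware matching can be illustrated with a toy rule-based filter. This is only a sketch of the general idea (negation and family-history cues around a target term); it is not the NLP system used in the study, and the target term, cue lists, and function names are assumptions chosen for illustration.

```python
import re

# Hypothetical target term and cue lists; real systems use far richer lexicons.
TERM = re.compile(r"\bheart failure\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(no|denies|without|negative for)\b", re.IGNORECASE)
FAMILY = re.compile(r"\b(family history|mother|father|brother|sister)\b", re.IGNORECASE)


def simple_match(note):
    """Simple text search: any mention of the term counts."""
    return bool(TERM.search(note))


def context_aware_match(note):
    """Count a mention only if it is not negated and not about a relative."""
    for sentence in re.split(r"[.;\n]+", note):
        m = TERM.search(sentence)
        if not m:
            continue
        negated = NEGATION.search(sentence[:m.start()])  # cue before the term
        familial = FAMILY.search(sentence)               # cue anywhere in the sentence
        if not negated and not familial:
            return True
    return False


if __name__ == "__main__":
    notes = [
        "Patient denies heart failure.",
        "Family history of heart failure in father.",
        "Admitted with heart failure.",
    ]
    for note in notes:
        print(simple_match(note), context_aware_match(note), note)
```

With simple matching, all three example notes would exclude the patient from a disease-free cohort; the context-aware version excludes only the third.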

Figure 5.

A PheWAS plot for rs3135388 in HLA-DRA.

This region has known associations with multiple sclerosis. The red line indicates the Bonferroni-corrected threshold for statistical significance. The blue line represents p<0.05. This plot was generated using updated data from [78] and the updated PheWAS methods described in [73].
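
The two reference lines on a plot of this kind can be computed as follows: the Bonferroni threshold divides the nominal significance level by the number of tests, and both lines are usually drawn on a -log10(p) axis. The number of phenotypes tested below is a hypothetical placeholder, not the value used for this figure.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def neg_log10(p):
    """Transform a p-value to the -log10 scale used on the plot's y-axis."""
    return -math.log10(p)


if __name__ == "__main__":
    alpha = 0.05
    n_phenotypes = 1000  # hypothetical count of phenotypes tested
    corrected = bonferroni_threshold(alpha, n_phenotypes)
    print("blue line (p<0.05) at y =", neg_log10(alpha))
    print("red line (Bonferroni) at y =", neg_log10(corrected))
```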
