Towards phenotyping stroke: Leveraging data from a large-scale epidemiological study to detect stroke diagnosis

doi:10.1371/journal.pone.0192586

Fig 1.

The overall processes of the study.

Table 1.

Summary of the variables used in the study.

Fig 2.

The ICD-9 coded baseline.

Fig 3.

The event distribution of stroke subtypes among the four categories.

Fig 4.

The performance curves when adding the variable sets (Table 1).

Table 2.

Performance of different classification algorithms for stroke case identification.

Table 3.

Statistical significance tests (paired t-test) of the performance difference between the machine learning algorithms and the baselines on stroke case identification.

Fig 5.

Precision-recall curves generated by the algorithms.

Table 4.

Performance of different classification algorithms for stroke type identification.

Table 5.

Statistical significance tests (paired t-test) of the performance difference between the machine learning algorithms and the baselines on stroke type identification.

Fig 6.

Confusion matrices generated by ICD9, CLIN, and RF on the test set.

Table 6.

Misclassification errors made by the RF algorithm on the test set.
