Table 1.

The ten different phenotypes used for this study.

The first column shows the name of the phenotype, the second column shows the number of positive examples out of the total 1,610 notes, and the third shows the κ coefficient as a measure of inter-rater agreement. The last column gives the definition that was used to identify and annotate each phenotype.
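
As an illustration of how an agreement coefficient of this kind can be computed, the sketch below implements Cohen's κ for two annotators. The caption does not say which κ variant was used, so the two-rater form is an assumption, and the function name `cohens_kappa` and the toy labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items.

    Assumption: two raters and categorical labels; the source caption
    does not specify the kappa variant.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels (hypothetical): 1 = phenotype present, 0 = absent.
kappa = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

In the toy example the raters agree on three of four notes (observed agreement 0.75) while chance agreement is 0.5, which gives κ = 0.5.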

Fig 1.

Overview of the basic CNN architecture.

(A) Each word within a discharge note is represented by its word embedding. In this example, both instances of the word “and” have the same embedding. (B) Convolutions of different widths are used to learn filters that are applied to word sequences of the corresponding length. The convolution K2 with width 2 in the example looks at all 10 pairs of neighboring words and outputs one value for each. There can be multiple feature maps for each convolution width. (C) Each of the resulting vectors is reduced to its highest value (the one with the most signaling power), giving one pooled value per convolution. (D) The final prediction (“Does the phenotype apply to the patient?”) is made by computing a weighted combination of the pooled values and applying a sigmoid function, similar to a logistic regression. This figure is adapted with permission from Kim [33].
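
The four steps (A) to (D) can be sketched as a forward pass in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights are random rather than learned, no nonlinearity is applied after the convolution, and the helper names (`embed`, `conv_feature_map`, `predict`, `make_filter`), the embedding size, and the example sentence are all hypothetical.

```python
import math
import random


def embed(tokens, table):
    # (A) Look up one embedding vector per word; repeated words share a vector.
    return [table[t] for t in tokens]


def conv_feature_map(embedded, filt, bias=0.0):
    # (B) Slide a filter of width len(filt) over the word sequence and
    # emit one value per window (no nonlinearity; a simplifying assumption).
    width = len(filt)
    values = []
    for start in range(len(embedded) - width + 1):
        total = bias
        for offset in range(width):
            word_vec = embedded[start + offset]
            total += sum(w * x for w, x in zip(filt[offset], word_vec))
        values.append(total)
    return values


def predict(tokens, table, filters, out_weights, out_bias=0.0):
    # Assumes the note has at least as many tokens as the widest filter.
    embedded = embed(tokens, table)
    # (C) Max-over-time pooling: keep only the highest value per feature map.
    pooled = [max(conv_feature_map(embedded, f)) for f in filters]
    # (D) Weighted combination of pooled values followed by a sigmoid.
    logit = out_bias + sum(w * p for w, p in zip(out_weights, pooled))
    return 1.0 / (1.0 + math.exp(-logit))


random.seed(0)
dim = 4  # embedding size (hypothetical)
# Hypothetical 11-word note; "and" occurs twice.
tokens = "patient has chest pain and fever and was admitted for observation".split()
table = {t: [random.uniform(-1, 1) for _ in range(dim)] for t in sorted(set(tokens))}


def make_filter(width):
    # Random stand-in for a learned filter: one weight vector per window position.
    return [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]


filters = [make_filter(2), make_filter(3)]  # one feature map each, widths 2 and 3
out_weights = [random.uniform(-1, 1) for _ in filters]
probability = predict(tokens, table, filters, out_weights)
```

With 11 tokens, the width-2 filter yields 10 window values, matching the count given for K2 in the caption. The returned `probability` is the model's score for "the phenotype applies"; in a trained model the embeddings, filters, and output weights would be learned from annotated notes.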

Fig 2.

Comparison of achieved F1-scores across all tested phenotypes.

The left three models classify directly from text; the right two models are based on concept extraction. The CNN outperforms the other models on most tasks.

Table 2.

This table shows the best-performing model for each approach and phenotype.

We show precision, recall, F1-Score, and AUC.
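
For reference, the four reported metrics can be computed from binary labels and model scores as in the sketch below. It uses the standard definitions; the function names and toy data are hypothetical, and the AUC is computed by pairwise comparison of positive and negative scores, which is only practical for small examples.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true, scores):
    """Area under the ROC curve: the probability that a random positive
    is scored above a random negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true labels, model scores, and labels at a 0.5 threshold.
y_true = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.2, 0.6, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
roc_auc = auc(y_true, scores)
```

Precision, recall, and F1 depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds.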

Fig 3.

Impact of phrase length on model performance.

The figure shows the change in F1-score between a model that considers only single words and a model that considers phrases up to a length of 5.

Table 3.

The most salient phrases for advanced heart failure and alcohol abuse.

The salient cTAKES CUIs are extracted from the filtered RF model.
