Fig 1.
Illustration of topic modeling on EHRs using NMF.
Fig 2.
The size of each word (phecode) in a cloud indicates the weight of that phenotype on the topic. Phenotypes shown in larger type have greater influence on the topic than those shown in smaller type. Each word cloud lists the top 60 words.
Fig 3.
Topic distribution in the cohort.
To visualize the prevalence of each topic in the cohort, we assigned each individual to the topic on which that individual had the maximum score.
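The assignment rule in this caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `assign_dominant_topic`, the score values, and the tie-breaking rule (first topic wins) are all assumptions made for the example.

```python
from collections import Counter


def assign_dominant_topic(scores):
    """Return, for each individual, the index of the topic with the maximum score.

    Ties are broken in favor of the lowest topic index (an assumption;
    the caption does not say how ties are handled).
    """
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Hypothetical individual-by-topic scores (rows: individuals, columns: topics).
example_scores = [
    [0.10, 0.70, 0.20],
    [0.55, 0.05, 0.40],
    [0.30, 0.60, 0.10],
    [0.00, 0.20, 0.80],
]

assignments = assign_dominant_topic(example_scores)

# Number of individuals assigned to each topic, i.e. the topic distribution.
topic_counts = Counter(assignments)
```

Counting the assignments per topic gives the kind of distribution that a bar chart of topic prevalence would display.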
Fig 4.
t-SNE plot visualizing the patient clusters in a projected 2D map (perplexity was set to 30).
Table 1.
Pearson correlation coefficient testing between the LPA variant and each topic.
Table 2.
Logistic regression analysis between the LPA variant and each topic.
Fig 5.
PheWAS results for rs10455872 in 12,759 individuals, adjusted for sex and age.