A Dynamic Network Approach for the Study of Human Phenotypes

doi:10.1371/journal.pcbi.1000353

Figure 1.

Data characteristics and basic comorbidity statistics.

A. Age distribution for the study population. B. Demographic breakdown of the study population. C. Prevalence distribution for all diseases measured using ICD9 codes at the 5-digit level. D. Distribution of the relative risk (RR) between all disease pairs. E. Distribution of the φ-correlation between all disease pairs. F. Scatter plot of the φ-correlation against the relative risk for all disease pairs.
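To make the two comorbidity measures in panels D to F concrete, the following minimal sketch computes both for a single disease pair from raw counts. It assumes the standard definitions: RR as observed co-occurrence divided by the co-occurrence expected under independence, and φ as the Pearson correlation of two binary variables. The function names, argument names, and example counts are hypothetical and are not taken from the study data.

```python
import math

def relative_risk(c_ij, p_i, p_j, n):
    """RR: observed co-occurrence over the count expected if the diseases were independent.

    c_ij: patients diagnosed with both diseases (hypothetical input)
    p_i, p_j: patients diagnosed with disease i and disease j
    n: total number of patients
    """
    return (c_ij * n) / (p_i * p_j)

def phi_correlation(c_ij, p_i, p_j, n):
    """Phi: Pearson correlation between two binary diagnosis indicators."""
    numerator = c_ij * n - p_i * p_j
    denominator = math.sqrt(p_i * p_j * (n - p_i) * (n - p_j))
    return numerator / denominator

# Made-up counts for illustration only.
rr = relative_risk(c_ij=20, p_i=100, p_j=50, n=1000)
phi = phi_correlation(c_ij=20, p_i=100, p_j=50, n=1000)
print(f"RR = {rr:.2f}, phi = {phi:.4f}")
```

RR is unbounded above and grows large when both diseases are rare, whereas φ is bounded between -1 and 1, which is one reason a network built from each measure can look different.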

Figure 2.

Phenotypic Disease Networks (PDNs).

Nodes are diseases; links are correlations. Node color identifies the ICD9 category; node size is proportional to disease prevalence. Link color indicates correlation strength. A. PDN constructed using RR. Only statistically significant links with RR_ij>20 are shown. B. PDN built using φ-correlation. All statistically significant links with φ>0.06 are shown.
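The thresholding described in panels A and B can be sketched as a simple filter over disease pairs. This is an illustrative sketch only: the record layout, the `significant` flag (assumed to be computed beforehand by whatever significance test is used), and the example values are hypothetical.

```python
def build_pdn(pairs, measure, threshold):
    """Keep statistically significant pairs whose measure exceeds the threshold.

    pairs: list of dicts with hypothetical keys "i", "j", "rr", "phi", "significant"
    measure: "rr" or "phi"
    threshold: e.g. 20 for RR or 0.06 for phi, as in the caption
    """
    edges = []
    for pair in pairs:
        if pair["significant"] and pair[measure] > threshold:
            edges.append((pair["i"], pair["j"], pair[measure]))
    return edges

# Made-up pair statistics; disease labels are placeholders.
pairs = [
    {"i": "A", "j": "B", "rr": 35.0, "phi": 0.02, "significant": True},
    {"i": "A", "j": "C", "rr": 3.0, "phi": 0.10, "significant": True},
    {"i": "B", "j": "C", "rr": 50.0, "phi": 0.20, "significant": False},
]

rr_pdn = build_pdn(pairs, "rr", 20)       # links shown in an RR-based network
phi_pdn = build_pdn(pairs, "phi", 0.06)   # links shown in a phi-based network
print(rr_pdn, phi_pdn)
```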

Figure 3.

The Phenotypic Disease Network and disease dynamics.

A. Schematic representation of the three dynamical questions explored here. B. Average φ-correlation between diseases diagnosed in the first two and last two visits for the 946,580 patients with 4 visits (green) and when we consider a randomized set of diseases for the first two visits (red). C. Same as B but for the RR-PDN. D. Ratio between the average φ-correlation among diagnoses received by a patient in their first two and last two visits relative to the control case. E. Same as D but for the RR-PDN. F. Gender and race differences. The subset of Fig. 2B containing all diseases connected to hypertension and ischemic heart disease is shown. Blue links indicate comorbidities that are strongest among black males, whereas red links indicate comorbidities that are strongest among white males (see legend).

Figure 4.

Disease connectivity and lethality.

A. Scatter plot of the connectivity of a disease measured in the φ-PDN against the percentage of patients who died within 8 years after this disease was first observed in our data set. B. Same as A for the RR-PDN. C. Percentage of patients who died within 8 years after this disease was first observed in our data set as a function of disease prevalence. D. Same as A showing only neoplasms. E. Same as B showing only neoplasms. F. Same as A showing only mental disorders. G. Same as B showing only mental disorders.

Figure 5.

Connectivity lethality control.

A. Histogram of the number of visits for each patient for whom the year of death is known. B. Histogram of the number of diagnoses assigned to each patient for whom the year of death is known. C. Correlation between the average connectivity of the diagnoses assigned to a patient and the number of years survived after the last diagnosis was recorded, for groups of patients with the same number of hospital visits. D. Correlation between the average connectivity of the diagnoses assigned to a patient and the number of years survived after the last diagnosis was recorded, for groups of patients with the same total number of diagnoses assigned. Error margins in C and D represent 95% confidence intervals.

Figure 6.

Directionality of disease progression.

A. Distribution of λ_1→2. B. Disease precedence Λ_i as a function of disease prevalence P_i. The inset shows the same plot after removing the trend from disease precedence (Λ_i* = Λ_i + 496.08 log₁₀(P_i) - 2446.2). C. Disease connectivity calculated from the φ-PDN as a function of Λ_i*. The green line shows the best fit for the 518 diseases with a prevalence larger than 1/500 (green circles), while the red line shows the best fit for the 463 diseases at the center of the cloud (red points). The correlation coefficient is represented by r and its associated p-value by p. D. Percentage of patients who died 2 and 8 years after being diagnosed with a disease with a given detrended precedence Λ_i*. The green lines show the best fit for all 518 diseases (green circles), while the red lines show the fit for the 434 (top panel) and 465 (bottom panel) diseases in the bulk of the cloud.
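The detrending step stated in panel B can be written out directly. The sketch below simply applies the caption's formula with its fitted constants; the function name is hypothetical, and the units of P_i are assumed to be whatever units were used when the constants 496.08 and 2446.2 were fitted, which the caption does not specify.

```python
import math

def detrended_precedence(precedence, prevalence):
    """Apply the caption's trend removal: precedence + 496.08 * log10(prevalence) - 2446.2.

    precedence: raw disease precedence for one disease
    prevalence: disease prevalence, in the (unspecified) units used for the fit
    """
    return precedence + 496.08 * math.log10(prevalence) - 2446.2

# Made-up inputs for illustration only.
example = detrended_precedence(precedence=1000.0, prevalence=1000.0)
print(round(example, 2))
```

Because the correction depends only on prevalence, two diseases with equal prevalence keep the same difference in precedence after detrending.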
