Functional principal component analysis for identifying multivariate patterns and archetypes of growth, and their association with long-term cognitive development

doi:10.1371/journal.pone.0207073

Fig 1.

Irregular and sparse longitudinal observations from the PROBIT data.

Head circumference (HC, left), body length (LN, middle) and weight (WT, right) are illustrated for a random selection of 30 subjects (gray) out of about 12,800 total children, along with estimated mean curves for each longitudinal trait (black solid lines).

Fig 2.

Example visualization of simulated observations.

The proposed method identifies clusters associated with the response outcome Y, each characterized by archetypal levels of the covariates Z = (Z₁, Z₂), for example (high Z₁, high Z₂), (high Z₁, low Z₂), (low Z₁, high Z₂) and (low Z₁, low Z₂), shown as red, purple, green and blue points, respectively. Crosses mark the cluster centers. The surface shows the conditional mean response obtained by regressing Y on Z = (Z₁, Z₂) for n = 200 data points.

Table 1.

Simulation results for identifying at-risk clusters of multiple trajectories associated with response outcomes.

Fig 3.

Estimated auto-covariance functions and eigenfunctions.

Estimated auto-covariance functions G(s, t) (top) and the corresponding first two eigenfunctions ϕ₁(t) and ϕ₂(t) (bottom), as in Eq (3), for head circumference (HC, left), body length (LN, middle) and weight (WT, right). The eigenfunctions represent the qualitative factors “General growth” (red) and “Growth acceleration” (blue). The cumulative fractions of variation explained (FVE) by the first two components are 97.70%, 96.92% and 98.14% for HC, LN and WT, respectively.
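
The relation between an auto-covariance surface, its eigenfunctions and the FVE can be illustrated with a minimal sketch. It assumes simulated curves observed on a common dense grid, which is not the setting of the sparse and irregular PROBIT data in Fig 1; data of that kind would require a smoothing-based covariance estimator instead of the plain sample covariance used here. All names and simulation settings (`phi1`, `phi2`, the score variances, the noise level) are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dense design: n curves observed on a common grid of m time points.
n, m = 500, 50
t = np.linspace(0.0, 1.0, m)
dt = t[1] - t[0]

# Two orthogonal illustrative modes on [0, 1]: a level shift and a linear trend.
phi1 = np.ones(m)
phi2 = np.sqrt(3.0) * (2.0 * t - 1.0)

# Simulated curves: mean function + random scores times modes + small noise.
scores = rng.normal(size=(n, 2)) * np.array([2.0, 0.7])
X = 5.0 * t + scores @ np.vstack([phi1, phi2]) + 0.1 * rng.normal(size=(n, m))

# Sample mean curve and sample auto-covariance surface G(s, t) on the grid.
mu = X.mean(axis=0)
Xc = X - mu
G = Xc.T @ Xc / (n - 1)

# Eigen-decomposition of the discretized covariance operator (G weighted by dt).
eigvals, eigvecs = np.linalg.eigh(G * dt)
order = np.argsort(eigvals)[::-1]          # sort eigenvalues in decreasing order
eigvals = eigvals[order]
eigfuns = eigvecs[:, order] / np.sqrt(dt)  # rescale so that sum(phi**2) * dt == 1

# Cumulative fraction of variation explained by the leading components.
fve = np.cumsum(eigvals) / eigvals.sum()

# FPC scores by numerical integration of the centered curves against phi_k.
fpc_scores = Xc @ eigfuns[:, :2] * dt

print("FVE of first two components:", fve[:2])
```

In this toy setting the first two columns of `eigfuns` play the role of ϕ₁(t) and ϕ₂(t), and `fve[1]` corresponds to the cumulative FVE reported in the caption.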

Fig 4.

Extreme functional patterns from marginal analysis.

Extreme functional patterns for head circumference (HC, left), body length (LN, middle) and weight (WT, right). (Top panels) Four outlying clusters, consisting of subjects that fall into the lowest 5% of the bivariate density, are shown in different colors together with the estimated mean curve of each outlying subgroup. They represent the four qualitative longitudinal growth patterns “Generally Large” (red), “Catch-up” (purple), “Stunting” (green) and “Faltering” (blue). (Bottom panels) The corresponding scatterplots of the first two functional principal component scores.
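
The idea of flagging subjects in the lowest 5% of a bivariate density can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the scores are simulated, the density is a plain Gaussian kernel estimate with a rule-of-thumb bandwidth, and the split of outlying points into four groups by the signs of the two standardized scores is a hypothetical stand-in for whatever clustering step the paper uses. The mapping of groups to named growth patterns is deliberately left out.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical first two FPC scores for n subjects (simulated, not PROBIT data).
n = 1000
xi = rng.normal(size=(n, 2)) * np.array([2.0, 0.7])

# Standardize each score so that one bandwidth suits both coordinates.
z = (xi - xi.mean(axis=0)) / xi.std(axis=0)

# Gaussian kernel density estimate evaluated at every observed point,
# using the rule-of-thumb bandwidth h = n ** (-1 / (d + 4)) with d = 2.
h = n ** (-1.0 / 6.0)
sq_dist = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=2)
density = np.exp(-0.5 * sq_dist / h**2).sum(axis=1) / (n * 2.0 * np.pi * h**2)

# Subjects whose estimated density falls in the lowest 5% are flagged as outlying.
cutoff = np.quantile(density, 0.05)
outlying = density <= cutoff

# Assumption: split outlying subjects into four groups by the signs of the scores.
quadrant = 2 * (z[:, 0] > 0).astype(int) + (z[:, 1] > 0).astype(int)
labels = np.where(outlying, quadrant, -1)  # -1 marks the non-outlying group

print("number of outlying subjects:", int(outlying.sum()))
print("group sizes:", {int(k): int((labels == k).sum()) for k in np.unique(labels)})
```

The pairwise distance matrix makes this sketch quadratic in n, which is acceptable for a thousand points but would need a gridded or tree-based density estimate for a cohort the size of the one described in Fig 1.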

Table 2.

Principal component analysis of functional principal components.

Table 3.

One-way ANOVA for subgroup detection.

Fig 5.

Extreme subgroup identification from the proposed method.

(Left panel) Scatterplot of the 95% high-density-region clustering obtained from principal component analysis of head circumference, body length and weight. (Middle panel) The corresponding conditional kernel density estimates of long-term intelligence for each subgroup. The qualitative growth patterns are “Generally Large” (red), “Catch-up” (purple), “Stunting” (green) and “Faltering” (blue); the black dashed line represents the “normal” subgroup, which consists of subjects who do not belong to any of the four outlying subgroups. (Right panel) Tukey’s multiple comparisons of mean differences in standardized IQ across the outlying subgroups, shown as family-wise 95% confidence intervals. The four subgroups are labeled R (red, “Generally Large”), P (purple, “Catch-up”), G (green, “Stunting”) and B (blue, “Faltering”).

Fig 6.

Correlation plot.

Correlations between the first two functional principal component (FPC) scores of head circumference (HC), body length (LN) and weight (WT). The first FPC scores are positively correlated across the three traits, as are the second FPC scores, which suggests combining the three growth features linearly using the PC loadings in Table 2.
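
How positively correlated scores lead to a linear combination with PC loadings can be shown with a small sketch. It assumes simulated first FPC scores for three traits that share a common factor, and it assumes the PCA is carried out on the correlation matrix of standardized scores; whether the paper uses the correlation or the covariance matrix is not stated in this caption. The variable names and noise levels are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical first FPC scores for HC, LN and WT driven by a shared factor.
n = 1000
common = rng.normal(size=n)
first_scores = np.column_stack([
    common + 0.8 * rng.normal(size=n),  # stand-in for HC
    common + 0.6 * rng.normal(size=n),  # stand-in for LN
    common + 0.5 * rng.normal(size=n),  # stand-in for WT
])

# Correlation matrix across the three traits (3 x 3).
corr = np.corrcoef(first_scores, rowvar=False)

# PCA on the correlation matrix; eigh returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(corr)
loadings = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue
loadings = loadings * np.sign(loadings.sum())  # fix the arbitrary sign

# Combined growth score: standardized scores weighted by the first PC loadings.
std_scores = (first_scores - first_scores.mean(axis=0)) / first_scores.std(axis=0)
combined = std_scores @ loadings
explained = eigvals[-1] / eigvals.sum()

print("correlation matrix:\n", corr.round(2))
print("first PC loadings:", loadings.round(2))
print("share of variance in first PC:", round(float(explained), 2))
```

When all pairwise correlations are positive, the leading loadings share one sign, so the combined score behaves like a weighted average of the three traits.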

Fig 7.

Marginal subgroup identification.

(Top panels) Conditional kernel density estimates and (bottom panels) Tukey’s multiple comparisons for standardized IQ across the outlying subgroups. As in Fig 5, the first two functional principal component (FPC) scores are used to construct the subgroups for head circumference (left), body length (middle) and weight (right), labeled R (red, “Generally Large”), P (purple, “Catch-up”), G (green, “Stunting”) and B (blue, “Faltering”).
