Functional analysis within latent states: A novel framework for analysing functional time series data

  • Owen Forbes,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Visualization, Writing – original draft

    owen.forbes@hdr.qut.edu.au

    Affiliation QUT Centre for Data Science, School of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD, Australia

  • Edgar Santos-Fernandez,

    Roles Conceptualization, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliation QUT Centre for Data Science, School of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD, Australia

  • Paul Pao-Yen Wu,

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliation QUT Centre for Data Science, School of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD, Australia

  • Kerrie Mengersen

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliation QUT Centre for Data Science, School of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD, Australia

Abstract

Functional data analysis (FDA) enables modelling and interpretation of data represented as functions over a continuum such as time, space, or frequency. This paper introduces the flawless analysis framework (FunctionaL Analysis Within LatEnt StateS), a nested FDA framework for analysing functional time series data. It provides comprehensive insights into the interplay between latent state characteristics, state occupancy dynamics, and functional attributes within states, while maintaining interpretability at each level. Applying flawless to functional time series of power spectral densities from electroencephalography (EEG) data from the Healthy Brain Network, we explore functional characteristics of resting state brain activity in n = 503 early adolescents aged 9 - 15 (SD = 1.7). We identify four functional latent states associated with variations in psychopathology and cognitive function. Bayesian regression models reveal important associations between the dynamics of latent state occupancy, functional traits within states, and relevant health measures. The integration of multiple FDA tools offers rich insights into functional and time-frequency characteristics of longitudinal data. For neuroscientific data this requires fewer assumptions about oscillatory peak frequencies, and captures more detailed frequency domain characteristics. flawless offers novel and detailed insights into functional time series data across a range of areas for research and practice.

Introduction

Functional data analysis (FDA) is a growing field of statistical research regarding analysis of data that can be represented as functions, including curves or shapes which vary over a continuum such as time or frequency [1–3]. It is a rapidly expanding area of research and application within medical statistics, and has broad utility across applications from genetics to neuroscience and prediction of clinical outcome time series for health services [4–6]. FDA involves analysing data that are intrinsically infinite dimensional, which poses challenges for both theoretical development and computational methods. However, this inherent high dimensionality also provides a rich source of information and many opportunities for research and data analysis. Sequences of functional data observed over time are an increasingly common subject of applied statistical analysis across a variety of fields. This flavour of functional data occurs commonly across domains such as ecology, climatology and biology [7]. Examples of functional data observed over time in these domains include animal movement data on orca dives [8], biomechanical data from athletes [9, 10], and power spectral density curves measuring frequency content of brain activity over time [11].

In neuroscience, functional data are common objects of investigation, with several studies demonstrating FDA’s utility for analysing brain activity data varying over the functional domains of time, space and frequency [11–13]. For example, Hasenstab et al. developed a multi-dimensional FPCA approach for EEG analysis, while Scheffler et al. extended this to handle region-referenced longitudinal EEG data. Xie & Lawniczak demonstrated FDA’s value for spectral analysis of epileptic EEG signals. These studies show how FDA methods can effectively handle the high dimensional, complex temporal dependence structure inherent in brain activity data [1].

Brain activity data are typically high dimensional and complex, and exhibit temporal dependence structure [1]; functional analysis methods offer effective and novel avenues for their statistical analysis. Developing functional analysis frameworks for neuroscientific data offers the potential for new insights into complex patterns of variation in functional time series, with less reliance on canonical summary features, and into influential sources of variation among individuals which fall outside the scope of traditional multivariate analyses. Our goal in this work is to develop and test a novel methodological framework combining different FDA approaches, demonstrating utility for applied researchers in neuroscience and other fields to analyse functional data and gain insights across multiple nested levels of analysis.

When applying FDA to time series data like EEG recordings, the data must first be transformed into a functional form and appropriately smoothed. In this work, we convert EEG time series data into power spectral density (PSD) functions through frequency decomposition, producing curves that represent the distribution of signal power across frequencies at each time point. These PSDs must then be smoothed to create functional data objects - while various approaches like B-spline smoothing can be used for this purpose [28], we employ the ‘fitting oscillations and one over f’ (FOOOF) algorithm [16] which is specifically designed to separate periodic and aperiodic components in neural power spectra. This transformation assumes that the underlying signal can be meaningfully decomposed into frequency components, and that these components vary smoothly over the frequency domain. An important constraint of our approach is that we analyse data from a single EEG channel, which means we cannot capture spatio-temporal dynamics or conduct analyses of functional connectivity between brain regions. While multi-channel FDA approaches exist [12], we focus on single-channel analysis to maximise clinical utility and interpretability while demonstrating the core principles of our nested functional analysis framework.

FDA encompasses a set of methods to characterise the key modes of variation and identify influential characteristics over the functional domain of observed curves or trajectories. Functional data are typically considered as finite, high dimensional realisations of underlying smooth functions [1, 2]. Relative to traditional multivariate approaches, FDA tools are better able to handle the very high dimensional nature of functional data, and offer richer insights on functional data compared to multivariate analyses which are more limited by the ‘curse of dimensionality’ [2, 17]. For data observed over time, the dimensionality is the product of the number of features and the number of time points [10], resulting in issues with the number of dimensions being large relative to the number of observations. To manage challenges around high dimensionality, traditional multivariate statistical methods for analysis of functional neuroscience data observed over time would typically require selecting a small number of a priori features of interest to summarise observed curves, and/or averaging over the temporal dimension. Functional data analysis offers statistical approaches that are more robust to the ‘curse of dimensionality’ compared to classical multivariate statistics, using tools such as smoothing, regularisation and dimensionality reduction to enable novel insights into influential characteristics across the whole functional domain of interest without restricting focus to a priori features of interest [3].

While various approaches exist for analysing functional time series (FTS) data, including direct dimension reduction methods like Functional Singular Spectrum Analysis (FSSA) [18], our goal is to develop a framework that preserves interpretability across multiple levels of analysis while capturing both temporal dynamics and functional characteristics. Recent advances in FTS methodology have provided sophisticated tools for dimensionality reduction and forecasting in both univariate and multivariate contexts [19]. However, for neuroscientific applications, maintaining clear interpretability of outputs at each analytical level is crucial for clinical relevance and practical utility.

One method for functional analysis that offers insights into temporal dynamics within time series of functional data observations is the functional hidden Markov model (FHMM) [22]. Considering the dynamics of a system observed longitudinally, latent state models such as Hidden Markov Models (HMM) [23] can offer useful insights into the latent states that it shifts between over time. HMMs have been extended for use with high dimensional and functional data, which enables analysing curves (functional data) as realisations from unobserved latent states [22].

Recent work has demonstrated the value of HMM approaches specifically for analysing brain state dynamics in neurodevelopmental conditions. Kember et al. applied HMM to resting-state EEG data from the Healthy Brain Network dataset to study network properties in ADHD, identifying distinct electrophysiological states characterised by oscillatory power patterns and finding that dwelling in states with high alpha/beta power supported better response control [20]. Similarly, Shappell et al. used Hidden semi-Markov Models to show that children with ADHD spend less time in anticorrelated network states and more time in hyperconnected states compared to typically developing children [21]. These studies highlight how latent state modeling can reveal important differences in brain dynamics associated with neurodevelopmental conditions.

Another core method for FDA is functional principal component analysis (FPCA). FPCA is able to summarise infinite-dimensional functional data into a finite set of uncorrelated random variables which represent variation over the whole functional domain in a parsimonious way, allowing for the identification of dominant modes of functional variation [2, 3]. Functional principal components are analogous to the principal components of standard PCA, which maximise variance explained in vector space: functional principal components, comprising eigenfunctions with their corresponding eigenvalues and scores, maximise functional variance explained in function space [2]. As with traditional PCA, typically a small finite number C of functional principal components and their associated scores are retained for further analysis, which explain a majority of functional variation in the data [3]. Dimensionality reduction with FPCA is a key method for functional analysis to provide insight into dominant attributes of functional variation, and to enable further analysis based on the reduced set of variables which capture much of the information in the original data.
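As a rough illustration of how FPCA reduces curves to a few scores, the following minimal sketch applies a singular value decomposition to curves sampled on a common, evenly spaced grid. It is a discretised approximation under stated assumptions, not the implementation used in this paper: the function name `fpca`, the toy curves and the frequency grid are all hypothetical, and a full FPCA would normally work with a basis expansion and quadrature weights.

```python
import numpy as np

def fpca(curves, n_components):
    """Discretised FPCA sketch. curves has shape (n_curves, n_grid_points)."""
    curves = np.asarray(curves, dtype=float)
    mean_curve = curves.mean(axis=0)
    centred = curves - mean_curve
    # Rows of vt approximate the eigenfunctions evaluated on the grid.
    _, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
    eigenfunctions = vt[:n_components]
    # Scores are projections of each centred curve onto the eigenfunctions.
    scores = centred @ eigenfunctions.T
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return mean_curve, eigenfunctions, scores, explained[:n_components]

# Hypothetical toy data: one spectral bump whose amplitude varies by curve.
rng = np.random.default_rng(0)
grid = np.linspace(1.5, 30.0, 58)
bump = np.exp(-0.5 * ((grid - 10.0) / 2.0) ** 2)
amplitudes = rng.normal(1.0, 0.3, size=40)
noise = rng.normal(0.0, 0.01, size=(40, grid.size))
curves = amplitudes[:, None] * bump + noise

mean_curve, eigenfunctions, scores, explained = fpca(curves, n_components=2)
```

In this toy setting almost all variation comes from the bump amplitude, so the first component carries nearly all of the explained variance; real PSD data would typically need several components.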

Applying these approaches to neuroscientific data, FHMM outputs can provide rich insight into temporal and frequency characteristics of functional time series data measuring brain activity across individuals. However, it is often of interest to understand in more detail how individuals differ in their functional characteristics when they are allocated to matched states. For instance in the context of sleep studies, a common goal would be to first identify periods in which individuals are allocated to a common sleep state (e.g. rapid eye movement sleep), before analysing and comparing detailed attributes across individuals within that matched sleep stage type [24, 25]. Using Viterbi estimated states from a FHMM model, by selecting subsets of time series data during which individuals are occupying the same latent state, we are able to compare ‘like with like’ and gain a more detailed understanding of how individuals are similar or different within matched latent states.
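The idea of comparing ‘like with like’ can be sketched as a simple grouping step. The example below is a minimal illustration that assumes the estimated state sequences are already available; the function name `subset_by_state`, the dictionary layout and the toy values are hypothetical and are not taken from the authors' code.

```python
from collections import defaultdict

def subset_by_state(curves, states):
    """Group (individual, time index, curve) records by estimated latent state.

    curves: dict mapping individual id -> list of curves (one per time point)
    states: dict mapping individual id -> list of state labels (same length)
    """
    subsets = defaultdict(list)
    for person_id, person_curves in curves.items():
        person_states = states[person_id]
        for k, (curve, state) in enumerate(zip(person_curves, person_states)):
            subsets[state].append((person_id, k, curve))
    return dict(subsets)

# Hypothetical toy input: two individuals, very short curves.
curves = {
    "p1": [[0.1, 0.2], [0.3, 0.1], [0.0, 0.4]],
    "p2": [[0.2, 0.2], [0.5, 0.0]],
}
states = {"p1": [1, 1, 2], "p2": [2, 1]}
subsets = subset_by_state(curves, states)
```

Each entry of `subsets` pools curves from all individuals occupying the same state, which is the input a within-state analysis would need.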

In other work developing FDA methods for neuroscientific data, incorporating functional domain information in the same modelling framework as temporal dynamics often comes at the cost of substantially increased model complexity and decreased interpretability. There are several examples in the literature of modelling approaches that simultaneously model brain activity in the frequency, temporal and even spatial domains [1, 12, 26]. Instead by modelling temporal dynamics of PSDs separately using FHMM and nesting frequency analysis within latent states using FPCA, we are able to generate insights into frequency and temporal dynamics in a way that better preserves interpretability for clinical relevance, while also offering insights into the connections between these levels of analysis.

Building on previous work to analyse characteristics of functional data observed for multiple individuals over time, in this paper we introduce a methodological framework called flawless (FunctionaL Analysis Within LatEnt StateS), a nested FDA framework for analysis of time-varying functional data that incorporates latent state structure, temporal dynamics and functional characteristics in the frequency domain. By integrating latent state modelling of functional data using FHMM and FPCA of functional characteristics stratified by allocation to matched latent states, this method offers more detailed insights and richer comparisons beyond those available through its component methods. FHMM outputs provide insight into the attributes of distinct latent states that individuals occupy, and the temporal dynamics of their movement between states including the number of states they visit and the number of transitions between them. Subsetting the data based on allocation to latent states, FPCA models for each state provide insight into the dominant characteristics of functional variation in the frequency domain that differentiate individuals within each latent state. While the present analysis has been developed for resting state brain activity data in young people, this framework can readily be applied to other instances of functional data observed over time.

We use flawless analysis to understand characteristics of resting state brain activity in early adolescents, using EEG data from the Child Mind Institute’s Healthy Brain Network study (HBN) [27]. HBN is a cross-sectional study covering a large number of data types relating to brain activity, cognitive, physical, and mental health in young people from the New York area. The aim of HBN is to create a large-scale biobank of data to facilitate the discovery of biomarkers and the exploration of prevalent illness phenotypes linked to psychopathology and cognitive function. Using this distinctive data source, we demonstrate the value of flawless analysis for generating unique and novel insights regarding characteristics of resting state brain activity in young people, and demonstrate substantial associations between functional analysis outputs and measures of psychopathology and cognitive function.

For resting state brain activity in the frequency domain measured over time, analysis of latent states and trajectories from FHMM and dominant modes of functional variation from FPCA provides valuable insights. Alternative approaches, including traditional multivariate analyses that cannot accommodate functional data, and complex functional methods that simultaneously model multiple functional domains, have different analytical goals and produce outputs which are not readily comparable to flawless analysis. Given the limited ability in this context to directly compare quantitative performance or model fit metrics against related methods, we focus instead on addressing qualitative differences and highlighting unique features that are available through the nested approach of the flawless framework.

We address the following applied research questions regarding the dynamics of resting state brain activity in young people, and their associations with psychopathology and cognitive function:

  1. Is functional analysis within latent states an effective approach to implement for characterising resting state brain activity measured by EEG?
  2. What are the characteristics of functional latent states that young people occupy during eyes closed, resting state brain activity? How do measures of psychopathology and cognitive function vary between groups of individuals who spend a majority of time in each one of these states?
  3. What are the temporal dynamics of movement between functional latent states? How many states do individuals visit, what proportion of time do they spend in each state, and how frequently do they move between them?
  4. Examining subsets of functional data by allocation to latent states, what frequency characteristics differentiate brain activity among individuals in each state?
  5. Considering outputs of these functional analyses in terms of latent states, temporal dynamics and frequency content within latent states, how are these traits associated with measures of psychopathology and cognitive function?

The rest of the paper is organised as follows. In the Methods we provide an overview of the methodological pipeline for flawless analysis, providing information on the background and implementation of the component methods, and provide details on neuroscientific and health outcome data collected in the HBN study. In the Results we present findings from the different stages of our analysis of resting state EEG characteristics in adolescents, including functional latent states and temporal dynamics of state occupancy from the FHMM, health measures and frequency characteristics across groups of individuals with 80% or more of their time allocated to one latent state. We then present results of Bayesian regression models demonstrating associations between flawless analysis outputs and health measures relating to psychopathology and cognitive function. In the Discussion we discuss the benefits and implications of this method and our findings, and consider limitations and future directions for this work.

Methods

In this section we provide an overview of the methodological steps involved in the flawless analysis framework. We then introduce the study protocols and data collection details for neuroscientific and health measures assessed in the HBN study, before describing pre-processing, frequency decomposition and extraction of smoothed representations of periodic content from EEG data. Background and implementation details are provided for FHMMs, functional principal component analysis, and Bayesian regression models. The Healthy Brain Network study was approved by the Chesapeake Institutional Review Board. Prior to conducting the research, written informed consent was obtained from participants aged 18 or older. For participants younger than 18, written consent was obtained from their legal guardians and written assent was obtained from the participant.

Flawless overview

We begin with a high level overview of the steps involved in flawless analysis. The intention is to provide a ‘road map’ for the reader, making the detailed explanations of each individual step easier to understand in the broader context of this framework. Details on notation are provided in the subsequent sections, and a reference Table 5 describing key notation is provided at the end of this manuscript. The method implemented in the flawless analysis framework consists of the following steps:

  1. Taking a time series of functional data observations indexed over time k from one or more individuals/units p, perform some initial smoothing of the data over the functional domain. In general applications, an approach such as B-spline smoothing may be appropriate for preparation of data prior to fitting the FHMM [28]. In the present work we generate smoothed representations of the periodic content in the frequency domain for EEG power spectral densities using the FOOOF algorithm [16].
  2. Fit a FHMM to the time series of smoothed functional data, finding a set of N functional latent states across individuals to characterise the different states occupied over time [22]. An initialisation strategy based on multiple subsampled models may be used to identify the number of latent states N and improve model stability through selection of well-separated initial centroids, described in the Methods and the Supplementary Materials.
  3. Use the Viterbi algorithm to calculate the maximum a posteriori estimated series of latent states associated with the series of observed functional data observations [29]. Based on this estimated series of states, calculate summary statistics for each individual including the number of states occupied, percentage of time spent in each state, and number of transitions between states.
  4. Create subsets of the input data based on allocation to each state $s_i$, $i = 1, \dots, N$, in the vector of state allocations estimated for each individual in (3).
  5. For each subset of functional data observations allocated to each latent state, apply functional principal component analysis. For each FPCA model, retain an appropriate number of functional principal components C based on inspection of scree plots and cumulative percentage of variance explained.
  6. Based on the within-state functional principal components from (5) and the FHMM outputs from (3), these outputs may be used for subsequent analyses such as regression or clustering to examine relationships between functional characteristics and associated outcome variables.
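The per-individual summary statistics in step 3 can be sketched as follows. This is a minimal illustration assuming an estimated state sequence is already available as a list of integer labels; the function name `state_summary`, the output keys and the toy sequence are hypothetical.

```python
from collections import Counter

def state_summary(path, n_states):
    """Summarise one individual's estimated state sequence.

    path: list of state labels in 1..n_states, one per time point
    """
    if not path:
        raise ValueError("path must contain at least one time point")
    counts = Counter(path)
    total = len(path)
    # Percentage of time points allocated to each state (0 if never visited).
    pct_time = {s: 100.0 * counts.get(s, 0) / total for s in range(1, n_states + 1)}
    # A transition is any change of state between consecutive time points.
    n_transitions = sum(1 for a, b in zip(path, path[1:]) if a != b)
    return {
        "n_states_visited": len(counts),
        "pct_time": pct_time,
        "n_transitions": n_transitions,
    }

# Hypothetical state sequence for one individual, with N = 4 states.
path = [1, 1, 2, 2, 2, 1, 3, 3]
summary = state_summary(path, n_states=4)
```

For this toy sequence the individual visits three of the four states, spends 37.5% of time points in each of states 1 and 2 and 25% in state 3, and makes three transitions.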

For clarity and ease of interpretation, this is also presented in Algorithm 1 as pseudocode. Fig 1 also presents a methods diagram indicating an overview of the components of the flawless analysis framework and their application to functional data calculated from resting state EEG recordings in the present application.

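Algorithm 1. flawless (FunctionaL Analysis Within LatEnt StateS)

Notation:
  $X_{pk}(t)$ : Time series of functional data observations
  p : individuals/units
  k : time points
  t : functional domain
  a.b : Access field/attribute b of object a (object-oriented notation)

procedure Flawless($X_{pk}(t)$)
  Outputs:
    $\hat{S}$ : Vector of state allocations
    $FPC_i$ : Functional principal components for each state i
    $Stats_p$ : Summary statistics for each individual p

  // Smooth functional data over domain t
  $\tilde{X}_{pk}(t) \leftarrow \mathrm{Smooth}(X_{pk}(t))$, e.g., B-spline smoothing or FOOOF for EEG PSDs

  // Fit FHMM to identify latent states
  $(N, \text{centroids}) \leftarrow \mathrm{Initialise}(\tilde{X}_{pk}(t))$ via subsampling strategy to choose initial centroids
  $FHMM \leftarrow \mathrm{FitFHMM}(\tilde{X}_{pk}(t), N, \text{centroids})$

  // Calculate state allocations and statistics
  $\hat{S} \leftarrow \mathrm{Viterbi}(FHMM, \tilde{X}_{pk}(t))$
  for each p do
    $Stats_p.nStates \leftarrow$ number of distinct states in $\hat{S}_p$
    $Stats_p.pctTime \leftarrow$ percentage of time spent in each state in $\hat{S}_p$
    $Stats_p.nTransitions \leftarrow$ number of transitions between states in $\hat{S}_p$
  end for

  // Create state-specific data subsets
  for i = 1 to N do
    $\tilde{X}^{(i)} \leftarrow \{\tilde{X}_{pk}(t) : \hat{S}_{pk} = s_i\}$
  end for

  // Perform FPCA within each state
  for i = 1 to N do
    $C_i \leftarrow$ number of components chosen via scree plots
    $FPC_i \leftarrow \mathrm{FPCA}(\tilde{X}^{(i)}, C_i)$
  end for

  return $\hat{S}$, FPC, Stats
end procedure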

Fig 1. Flawless analysis framework overview, with application to EEG power spectral densities as functional data observed over time.

Xpk indicates a functional data array with p = individuals, k = time, and t = functional domain. FOOOF = fitting oscillations and one over f algorithm - see Methods for details.

https://doi.org/10.1371/journal.pone.0326598.g001

The Child Mind Institute’s Healthy Brain Network study

The Child Mind Institute’s Healthy Brain Network study is a large scale initiative collecting data on a variety of measures relating to brain activity, physical health, mental health and cognitive development in young people aged 5 - 21 years in New York City and surrounding areas [27]. The study uses a community referral-based recruitment model to encourage families concerned about psychiatric symptoms in their child to participate. The goal for HBN is to generate a large-scale biobank of data for biomarker discovery and investigations of commonly occurring illness phenotypes relating to psychopathology and cognitive function. Relative to a population sample, the strategy of recruiting on the basis of perceived clinical concern means that the HBN sample includes a high proportion of individuals with elevated psychopathology and/or cognitive difficulties. As the present study focuses on brain activity, psychopathology and cognitive function in the period of early adolescence, we use data from participants aged 9 to 15 years in the HBN biobank. A number of other studies have used this age range for early adolescence [30–32], and we used this range to maximise the number of participants included, while confining our focus to this developmental period of interest.

Data were downloaded from the HBN portal on July 4, 2022. Data used for this study were contained in Release Numbers 1.1 - 10 of HBN, with Release Dates between January 31, 2018 and April 13, 2022. The authors did not have access to information that could identify individual participants during or after data collection.

EEG data acquisition.

For this paper we use resting state, eyes closed EEG data, recorded at the vertex electrode Cz. We chose to focus on single electrode, eyes closed, resting state data for the sake of generalisability and clinical utility. Compared to task-based EEG data, resting state EEG data offers more opportunities for clinical applications as it is easier to replicate data collection across different laboratory and clinical settings without requiring specific software and equipment for task-based paradigms [33, 34]. It is also simpler for data cleaning and preparation, with potential for straightforward implementation on a wider scale. We focused specifically on the Cz electrode, which is often used as a reference electrode due to its central location and typically high-quality signal [76]. Cz was used as the reference electrode in the HBN EEG acquisition protocol. This placement provides a good balance between signal quality and reduced susceptibility to artifacts, and the consistent placement of Cz could support implementation across various clinical and research settings [77]. These factors enhance the potential clinical utility of our findings, as data-driven phenotypes uncovered from these measures will be straightforward to measure and implement across a variety of scenarios for EEG acquisition in clinical and research settings.

During resting state recording sessions, participants viewed a fixation cross in the center of a computer screen. Throughout the paradigm, participants were instructed to open or close their eyes at various points, alternating between 40 second periods of eyes closed and 20 second periods of eyes open for a total of 5 minutes (300 seconds total, with 200 seconds of eyes closed recording). The paradigm was designed to measure endogenous brain activity during rest [27]. High-density EEG data were recorded in a sound-shielded room at a sampling rate of 500 Hz with a bandpass of 0.1 to 100 Hz, using a 128-channel EEG geodesic hydrocel system by Electrical Geodesics Inc. Full details of EEG data acquisition including cap fitting, impedance checking and preparation are available in the HBN data descriptor [27].

Psychopathology and cognitive function measures.

A wide variety of measures regarding physical, mental and cognitive health have been included at various time points in the evolving HBN study protocol. We selected four measures of psychopathology and four measures of cognitive function, motivated by choosing measures which captured a broad range of outcomes in each domain and were available for a majority of participants since early timepoints in the HBN study.

The psychopathology measures we selected were: Mood and Feelings Questionnaire, Self Report (MFQ SR) which measures depression symptoms in children and young people [35]; Screen for Child Anxiety Related Emotional Disorders, Self Report (SCARED SR), which measures anxiety symptoms in young people [36]; and the internalising and externalising scales from the Youth Self Report inventory (YSR Int and YSR Ext), which measure broad dimensions of internalising and externalising psychopathology [37, 38]. The four measures of cognitive function were from the National Institutes of Health (NIH) Toolbox Cognition Battery [39]. These measures included the Card Sorting task (NIH Card) measuring executive function, the Flanker task (NIH Flanker) measuring executive function and attention, the List Sorting task (NIH List) measuring working memory, and the Pattern Comparison task (NIH Pattern) measuring processing speed.

EEG processing, frequency decomposition and smoothing

EEG data used in this paper were pre-processed internally by HBN investigators, using steps including identification and replacement of electrodes with poor data quality, high pass filtering at 0.1 Hz, notch filtering at 59-61 Hz to remove background electrical line noise, and removal of eye movement artifacts. Full details of EEG cleaning and pre-processing steps are available in a separate publication describing methodology for the Multimodal Resource for Studying Information Processing in the Developing Brain (MIPDB), another study run by the Child Mind Institute [40]. Following standardised EEG processing methodology used in MIPDB, EEG data were filtered between 1.5 and 30 Hz prior to frequency decomposition, covering the canonical frequency bands between delta (1.5 - 4 Hz) and beta (14 - 30 Hz).

To study frequency content of EEG data we used multitaper analysis, a frequency decomposition method that is suitable for non-stationary signals and offers good frequency specificity [41]. This is a popular technique for frequency analysis that has been used widely in recent EEG research literature [42, 43], and offers an improved signal-to-noise ratio for detecting rhythmic activity in a signal relative to other standard frequency decomposition methods [44, 45]. Multitaper analysis was conducted using the Chronux toolbox in MATLAB [46]. This method enabled us to calculate power spectral densities with good frequency resolution and minimal ‘bleeding’ of power across adjacent frequency bands [47]. This analysis used a 2 second time segment (T) and a 2 Hz frequency bandwidth (W), resulting in a total of 3 tapers (2TW - 1). These parameters were chosen to provide adequate frequency resolution, giving a detailed characterisation of the frequency distribution in each individual’s power spectral density [43].
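The taper count can be checked with a short calculation. The sketch below assumes that the quoted 2 Hz bandwidth is the full spectral bandwidth 2W, so that the half-bandwidth is W = 1 Hz. This reading is an assumption, chosen because it is the one consistent with the 3 tapers reported for a 2 second segment; K denotes the number of tapers.

```latex
T = 2\,\mathrm{s}, \qquad
W = 1\,\mathrm{Hz}\ \text{(assumed half-bandwidth)}, \qquad
TW = 2, \qquad
K = 2TW - 1 = 2 \cdot 2 - 1 = 3 .
```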

Multitaper analysis initially generated a time series of 300 power spectral densities for each individual, representing the frequency content of their EEG signal over the 300 seconds of resting state recording time, recorded on the vertex electrode Cz as described above. As we were interested in activity recorded when participants had their eyes closed, we trimmed out each eyes closed segment with a 2 second leading and trailing buffer, resulting in 180 seconds total of eyes closed PSDs per individual across 5 segments each with 36 PSDs.
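The trimming arithmetic can be sketched as follows, using the one-PSD-per-second spacing described above. The ordering of eyes open and eyes closed periods within each 60 second block is not stated here, so the `closed_first` flag and the function name `eyes_closed_windows` are assumptions made only for illustration.

```python
def eyes_closed_windows(n_blocks=5, open_s=20, closed_s=40, buffer_s=2,
                        closed_first=False):
    """Return half-open (start, end) second indices of trimmed eyes closed segments.

    Assumes each block is one eyes open period plus one eyes closed period;
    closed_first=False places the eyes open period first (an assumption).
    """
    block_len = open_s + closed_s
    windows = []
    for b in range(n_blocks):
        start = b * block_len + (0 if closed_first else open_s)
        # Drop buffer_s seconds at the start and end of each eyes closed period.
        windows.append((start + buffer_s, start + closed_s - buffer_s))
    return windows

windows = eyes_closed_windows()
psds_per_segment = [end - start for start, end in windows]
total_psds = sum(psds_per_segment)
```

With the default settings this gives 5 segments of 36 PSDs each, or 180 eyes closed PSDs per individual, matching the counts given in the text.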

Finally we applied the FOOOF algorithm (‘fitting oscillations and one over f’) in order to identify periodic components of EEG activity, removing aperiodic power and generating smoothed estimates of the periodic content in PSDs which could be used as inputs for the FHMM. The FOOOF library in Python (version 1.0.0) was used to parameterise neural power spectra. Settings for the algorithm were set as: peak width limits: 1 - 12 Hz; max number of peaks: no limit; minimum peak height: 0.05; peak threshold: 2.0; and aperiodic mode: fixed, without a knee. Power spectra were parameterised across the frequency range 1.5 to 30 Hz. Performance was assessed based on visual comparison of FOOOF model fits against original PSDs, and on metrics for model error and goodness of fit. The outputs retained from the FOOOF algorithm were the peak fit objects, representing the peaks in each PSD with aperiodic power subtracted, reconstructed as a sum of Gaussians fit to the centre frequency, amplitude and width of each detected peak. This produced smoothed representations of the periodic oscillatory content in each PSD, which were suitable as inputs for subsequent functional analyses.
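The reconstruction of a smoothed periodic curve as a sum of Gaussians can be illustrated without calling the FOOOF library itself. In this minimal sketch the peak parameters, the 0.5 Hz frequency grid and the function name `periodic_fit` are hypothetical, and the width parameter is treated as a Gaussian standard deviation, which is an assumption and may differ from the width convention reported by FOOOF.

```python
import math

def periodic_fit(freqs, peaks):
    """Evaluate a sum of Gaussian peaks on a frequency grid.

    peaks: list of (centre_hz, amplitude, sd_hz) tuples; sd_hz is assumed
    to be the Gaussian standard deviation.
    """
    return [
        sum(amp * math.exp(-((f - centre) ** 2) / (2.0 * sd ** 2))
            for centre, amp, sd in peaks)
        for f in freqs
    ]

# Hypothetical grid from 1.5 to 30 Hz in 0.5 Hz steps (58 points).
freqs = [1.5 + 0.5 * i for i in range(58)]
# Hypothetical peaks: an alpha-range peak and a smaller beta-range peak.
peaks = [(10.0, 0.8, 1.5), (20.0, 0.3, 2.0)]
curve = periodic_fit(freqs, peaks)
```

The resulting curve is zero away from the detected peaks and smooth over frequency, which is the property that makes such reconstructions convenient inputs for functional analyses.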

Functional hidden Markov model

Background.

Hidden Markov Models (HMMs) are a type of statistical model that use a Markov process with hidden states to make probabilistic models of linear sequence labeling problems, and can be used to characterise the modes or states that a system occupies and moves between over time [48]. These models are designed to handle time series data by using emitted symbols that are observable realisations from latent states, and random transitions from one latent state to another that remain unobserved. The memory-less property of the Markov chain, where the transition from one state to another depends only on the present state, is a key concept in the HMM framework.

A Hidden Markov Model is a bivariate process $\{(S_k, X_k)\}_{k \geq 1}$ defined on a given probability space $(\Omega, \mathcal{F}, P)$ such that:

  • $\{S_k\}_{k \geq 1}$ is a Markov chain with a discrete and finite state space $\mathcal{S} = \{s_1, \dots, s_N\}$, with $N \in \mathbb{N}$, transition matrix $A = (a_{ij})$ and initial distribution $\pi = (\pi_1, \dots, \pi_N)$, where $a_{ij} = P(S_{k+1} = s_j \mid S_k = s_i)$ and $\pi_i = P(S_1 = s_i)$;
  • For each time k, the observation $X_k$ is a d-dimensional random array. In particular, given the state process $\{S_k\}_{k \geq 1}$, $\{X_k\}_{k \geq 1}$ is a sequence of conditionally independent random arrays (vectors or matrices, depending on the type of data) [22, 48].

In the general case, the objective function for a HMM can be written as:

$$Q = \sum_{i=1}^{N} \gamma_1(i) \log \pi_i + \sum_{k=1}^{K-1} \sum_{i=1}^{N} \sum_{j=1}^{N} \xi_k(i,j) \log a_{ij} + \sum_{k=1}^{K} \sum_{i=1}^{N} \gamma_k(i) \log f_i(x_k) \tag{1}$$

where $K$ is the number of time points, $\gamma_k(i)$ is the probability of being in the state $s_i$ at time k, $\xi_k(i,j)$ is the probability of being in state $s_i$ at time k and state $s_j$ at time k + 1, given the model and the observations, and $f_i(\cdot)$ is the emission function of $X_k$ conditionally on the event $\{S_k = s_i\}$ for any $i = 1, \dots, N$. The Baum-Welch algorithm can then be used to compute the value in (1) and iteratively perform expectation maximisation and the ‘forward-backward’ procedure to calculate the objective function and the best estimates for model parameters [49, 50].

In a standard HMM, the emission function is a probability distribution that describes the likelihood of observing a particular output symbol (or emission) given the current hidden state. It describes the relationship between the unobservable internal state of the system and the observable data, enabling inference about the underlying hidden states from the observed data. For a functional observation $X_k$, the emission function of $X_k$ conditionally on the event $\{S_k = s_i\}$ is represented as $f_i(\cdot\,; \mu_i)$, for any $i = 1, \dots, N$, where $\mu_i$ is a functional parameter representing the mean of the curves emitted by state $s_i$. In a FHMM, the emission functions are constructed based on distances between curves. Specifically, this method assumes that for each state $s_i$, the emission function can be written as

$$f_i(x; \mu_i) \propto h\big(d(x, \mu_i)\big) \tag{2}$$

where $h$ is a function that transforms the distance $d$ into a similarity measure. In particular, the implementation by Martino et al. (2020) uses the function $h(y) = 1/y^2$ and the $L^2$ distance for $d$. For a full discussion of the development and methodology of the functional HMM, please refer to [22].
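The distance-based emission idea can be sketched in a few lines of Python. This is an illustration rather than the hmmhdd implementation: it assumes curves sampled on a common, evenly spaced grid, an L2 distance approximated by a Riemann sum, and a small constant `eps` (hypothetical) to avoid division by zero. Normalising the similarities across states is done here only to make the output easy to read.

```python
import math

def l2_distance(x, mu, step):
    """Approximate L2 distance between two curves on a common grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mu)) * step)

def emission_weights(x, centroids, step, eps=1e-12):
    """Relative similarity of curve x to each state centroid, with h(y) = 1 / y**2."""
    sims = [1.0 / (l2_distance(x, mu, step) ** 2 + eps) for mu in centroids]
    total = sum(sims)
    return [s / total for s in sims]

# Hypothetical centroids for two states and one observed curve
centroids = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
x = [0.1, 0.1, 0.1]
weights = emission_weights(x, centroids, step=1.0)
```

Because h decays with the square of the distance, a curve close to one centroid receives almost all of the weight for that state.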

Following identification of functional latent states from a FHMM, a number of insights may be gained from interpretation of the fitted FHMM model characteristics and estimation of the most likely sequence of latent states. One common approach is to use the Viterbi algorithm [51] to calculate the most likely sequence of states to have generated the observed data. From the generated sequence of most likely latent states, further analysis may identify attributes including dominant states which were visited by a higher number of individuals, and inter-individual comparisons including the time spent in each state and the number of states visited.
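For readers unfamiliar with the Viterbi algorithm, a compact log-space version is sketched below. It is a generic textbook implementation, not the routine in hmmhdd, and the two-state probabilities in the example are hypothetical.

```python
import math

def viterbi(log_init, log_trans, log_emit):
    """Most likely state path.

    log_init[i]     : log initial probability of state i
    log_trans[i][j] : log P(state j at k + 1 | state i at k)
    log_emit[k][i]  : log emission score of observation k under state i
    """
    n_states = len(log_init)
    score = [log_init[i] + log_emit[0][i] for i in range(n_states)]
    backptr = []
    for k in range(1, len(log_emit)):
        prev, score, ptr = score, [], []
        for j in range(n_states):
            # Best predecessor of state j at time k
            best_i = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            ptr.append(best_i)
            score.append(prev[best_i] + log_trans[best_i][j] + log_emit[k][j])
        backptr.append(ptr)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    return path[::-1]

def logs(values):
    return [math.log(v) for v in values]

# Hypothetical two-state example with 'sticky' transitions
log_init = logs([0.6, 0.4])
log_trans = [logs([0.9, 0.1]), logs([0.1, 0.9])]
log_emit = [logs(row) for row in
            [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.1, 0.9], [0.2, 0.8]]]
path = viterbi(log_init, log_trans, log_emit)
```

Working in log space avoids numerical underflow for long sequences such as the 180 PSDs per individual considered here.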

Implementation.

For this analysis, we implement a FHMM using the hmmhdd R package, version 1.0 [52]. As described above, the functional data time series input to the model are peak fit outputs from the FOOOF algorithm, representing a smoothed estimate of the periodic oscillatory content in each power spectral density.

To manage instability issues arising from initialisation of functional latent state centroids based on a functional k-means algorithm, we used a subsampling strategy described below to identify an appropriate number of stable and robust centroids present in the data to initialise the model.

Initialisation strategy for FHMM latent state centroids.

During the development of this work, we identified instability issues in the performance of FHMMs, as implemented in the R package hmmhdd [22, 52]. Through testing, we discovered an issue where the FHMM algorithm tends to find multiple latent states with identical centroids. Our understanding is that this instability likely occurs due to the use of a functional k-means algorithm to initialise the latent state centroids for the FHMM [53, 54]. k-means with random initialisation has known performance issues where the clustering results can have a high degree of instability and are very sensitive to the initial conditions [55]. This is particularly challenging for clustering functional data, as they are typically very high dimensional, so the k-means algorithm with random initialisation appears to have a tendency for multiple clusters to collapse towards the same local optimum in the high-dimensional space, often resulting in near-identical centroids across multiple functional latent states.

In response to this issue, we developed a subsampling based initialisation strategy to improve the stability and performance of the FHMM algorithm. To do this, we fit 10 FHMM models based on 80% randomly subsampled sets of the full data in order to identify stable centroids. 80% was used as a subsampling proportion based on common practice in the literature [56, 57], removing a small proportion (20%) of the data at random in order to identify stable and robust results across subsamples. For this application, based on testing with HBN data these subsampled models were each set to identify 8 latent states, allowing for redundant overlapping states to occur while enabling identification of stable consistent states. As a result of this instability in FHMM performance, model selection based on comparing information criteria between candidate models was not a feasible strategy to choose the number of latent states N. Instead, taking all centroids identified across the subsampled models, we plotted them together to allow visual identification of consistent and well-separated state centroids which appear across models. These heuristically grouped centroids across subsampled models were then averaged and used to generate a set of N initial centroids for the FHMM, to prevent generation of redundant states with identical centroids. More details on this process are provided in S1 and S2 Figs.
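The two mechanical parts of this strategy, drawing the subsamples and averaging centroids once they have been grouped, can be sketched as below. The model fitting and the visual grouping are not reproduced. Subsampling at the level of individuals is an assumption, and both function names are hypothetical.

```python
import random

def subsample_indices(n_items, proportion=0.8, n_repeats=10, seed=0):
    """Draw `n_repeats` random subsamples (without replacement) of item indices."""
    rng = random.Random(seed)
    size = int(round(proportion * n_items))
    return [sorted(rng.sample(range(n_items), size)) for _ in range(n_repeats)]

def average_centroids(grouped_centroids):
    """Pointwise mean of the curves in each hand-assigned group."""
    averaged = {}
    for label, curves in grouped_centroids.items():
        n = len(curves)
        averaged[label] = [sum(vals) / n for vals in zip(*curves)]
    return averaged

# ASSUMPTION: subsampling is over the n = 503 individuals
subsamples = subsample_indices(n_items=503)

# Hypothetical group of two similar centroids sampled at two frequencies
initial = average_centroids({"A": [[1.0, 2.0], [3.0, 4.0]]})
```

Averaging within groups gives one initial centroid per visually identified state, so the final model starts from distinct curves.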

Functional principal component analysis

Background.

Functional principal component analysis is an extension of principal component analysis for dimensionality reduction, and it can be used to analyse dominant modes of variation across the functional domain for data sets consisting of functions or curves. FPCA enables representation of the infinite-dimensional functional data as a finite-dimensional vector of random scores, which can subsequently be modeled using the tools of multivariate data analysis. This method is based on an expansion of the underlying random trajectories in a functional basis consisting of the eigenfunctions of the covariance operator of the process [2]. The resulting FPCs or scores capture the dominant modes of variation in the data and can be truncated to a finite vector, achieving the goal of dimensionality reduction. Like a traditional PCA explaining maximum variance in principal components consisting of eigenvectors and eigenvalues, FPCA decomposes functional data into eigenfunctions and eigenvalues, capturing variation in curves across the whole functional domain.

For a set of functional data $X_p(t)$ where t is the functional domain, the FPCA expansion is as follows:

$$X_p(t) = \mu(t) + \sum_{c=1}^{\infty} A_{pc}\, \phi_c(t) \tag{3}$$

where $\mu(t)$ is the mean function of $X_p(t)$, $\phi_c(t)$ are the orthogonal eigenfunctions, and $A_{pc}$ are the functional principal components (scores) of $X_p$. The expansion in (3) enables dimensionality reduction as the first C terms that explain a substantial amount of overall functional variance provide a good approximation to the infinite sum, so that the information contained in $X_p$ is largely contained in the C-dimensional vector of scores $A_p = (A_{p1}, \dots, A_{pC})$ and the approximated processes

$$X_p^{(C)}(t) = \mu(t) + \sum_{c=1}^{C} A_{pc}\, \phi_c(t) \tag{4}$$

Based on assessment of scree plots and the cumulative percentage of functional variance explained, a small finite number C of functional principal components can be retained which explain the majority of variation and represent a parsimonious dimension-reduced summary of the original functional data.
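The cumulative variance check that accompanies a scree plot is simple to express in code. The sketch below uses the per-component percentages reported for State 1 in the Results. The 70% threshold is purely illustrative and is not a value taken from this paper.

```python
def cumulative_variance(pct_explained):
    """Running total of the percentage of variance explained."""
    total, out = 0.0, []
    for p in pct_explained:
        total += p
        out.append(total)
    return out

def n_components_for(pct_explained, threshold_pct):
    """Smallest number of components whose cumulative percentage reaches the threshold."""
    for c, cum in enumerate(cumulative_variance(pct_explained), start=1):
        if cum >= threshold_pct:
            return c
    return None

state1_pct = [22.6, 17.1, 13.6, 11.5, 7.8]  # State 1 values from the Results
cum = cumulative_variance(state1_pct)
n_keep = n_components_for(state1_pct, threshold_pct=70.0)  # ASSUMPTION: illustrative threshold
```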

Implementation.

FPCA models were fit to subsets of the time series of PSDs in this dataset, stratified by allocation to functional latent states generated from the Viterbi algorithm. This was implemented in the R package fda, version 6.0.5 [58]. For each subset of functional data by state, FPCA was performed using penalised smoothing to fit a series of B-spline basis functions to the FOOOF PSD curves for each individual and each time point [7]. To select functional PCs to retain, we assessed scree plots by looking for the ‘elbow’ point where the rate of decrease in explained variance begins to level off. On inspection of these plots, we looked for the number of components that cumulatively explained a substantial proportion () of the total variance while balancing parsimony.

Bayesian regression models for mental health and cognitive function

To examine patterns of association between outputs from functional analyses and health measures relating to psychopathology and cognitive function, we implemented Bayesian regression models using the brms package in R, version 2.15.0 [59, 60]. We fit separate multivariate response regression models for four subsets of participants, based on the dominant latent state in which participants spent the most time. The outcome variables for these models were scores on the four measures of psychopathology and four measures of cognitive function, as described above. Independent variables in these regression models included the number of functional latent states visited by each individual, the number of transitions between states over the resting state recording period, the percentage of time spent in the dominant state, and scores on the first five functional principal components for the dominant state. Age, sex, and handedness (measured by the Edinburgh Handedness Questionnaire) [61] were included as control covariates. Default settings in brms were used for ‘flat’ uniform prior distributions on regression coefficients, covering the expected range of the parameter values.
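The three state-dynamics predictors can be derived directly from an individual's estimated state sequence. The following sketch shows one way to compute them; the function name and the example sequence are hypothetical, and ties for the dominant state are broken by first appearance, which is an assumption.

```python
from collections import Counter

def occupancy_summary(path):
    """Summarise one individual's sequence of estimated latent states."""
    counts = Counter(path)
    dominant, dominant_count = counts.most_common(1)[0]
    transitions = sum(1 for a, b in zip(path, path[1:]) if a != b)
    return {
        "n_states_visited": len(counts),
        "n_transitions": transitions,
        "dominant_state": dominant,
        "pct_time_dominant": 100.0 * dominant_count / len(path),
    }

# Hypothetical sequence of 10 time points
path = [1, 1, 1, 2, 2, 1, 1, 1, 1, 1]
summary = occupancy_summary(path)
```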

Canonical EEG frequency bands

As we have indicated in the Introduction, canonical EEG frequency bands do not correspond to functionally distinct categories of brain activity and have substantial flaws for interpretation of differences in oscillatory content across individuals, especially in childhood and adolescence when oscillatory content in brain activity is rapidly changing [16, 62]. For ease of description below, we refer to the following labels for frequency ranges: delta (1.5-4 Hz), theta (4-7 Hz), alpha (7-14 Hz), beta-1 (14-22 Hz), and beta-2 (22-30 Hz). There is substantive variation in the definitions of these bandwidth ranges in the literature [34, 63, 64]. We have based these specific bandwidth labels on evidence that the alpha oscillatory rhythm appears across a wider frequency range than typically used (such as 8-12 Hz) [64], and we used the beta-1 and beta-2 bands implemented by Rogala et al. [34]. However, it is important to note that we are not calculating power within these bands for our principal analyses. The functional analysis methods used here are attuned to the shape of each PSD curve across the whole frequency range considered, and these labels are used only as a heuristic to support understanding and interpretation of power distribution across this frequency range, due to their widespread use and familiarity in EEG research.
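For descriptive purposes only, the labels above can be written as a lookup. In this sketch the treatment of band edges (lower edge inclusive, upper edge exclusive, with 30 Hz assigned to beta-2) is an assumption, since the text does not specify it.

```python
BANDS = [
    ("delta", 1.5, 4.0),
    ("theta", 4.0, 7.0),
    ("alpha", 7.0, 14.0),
    ("beta-1", 14.0, 22.0),
    ("beta-2", 22.0, 30.0),
]

def band_label(freq_hz):
    """Descriptive band label for a frequency, or None outside 1.5-30 Hz."""
    for name, low, high in BANDS:
        if low <= freq_hz < high:  # ASSUMPTION: lower edge inclusive
            return name
    if freq_hz == BANDS[-1][2]:    # include the upper limit of 30 Hz
        return BANDS[-1][0]
    return None
```

For example, `band_label(10.0)` gives the label for a 10 Hz alpha peak, while frequencies below 1.5 Hz return `None`.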

Results

We present the results of this analysis in three sections: First we present the results of the FHMM including frequency characteristics of centroids for functional latent states, and differences in psychopathology and cognitive function between individuals who spent 80% or more of their time in each state. The second section presents results of FPCA analyses stratified by latent states, identifying the dominant modes of functional variation that distinguish frequency content among individuals in matched latent states. The third section presents results of Bayesian regression models investigating associations between outputs from FHMM and FPCA models with outcome variables relating to psychopathology and cognitive function.

In this work we used pre-processed resting state EEG data, which were available for n = 503 early adolescents between the ages of 9 and 15 years (M = 11.5, SD = 1.7) in the HBN study. Descriptive statistics for demographics, psychopathology and cognitive function are presented in Table 1. As the HBN study protocol has evolved over time, some measures introduced later in the study (including the Youth Self Report scale) are available for fewer participants. As noted in the Methods, the HBN study recruits participants using a targeted recruitment strategy for children with mental health and cognitive difficulties, and so this dataset exhibits higher levels of psychopathology and lower cognitive function than would be expected in a population sample. ‘Fitting oscillations and one over f’ (FOOOF) performance metrics indicated good performance for models fit to estimate aperiodic content in PSDs, with a mean R² of 0.964, and an average of 3.6 peaks identified per PSD [16].

Table 1. Demographics, cognitive function and psychopathology measures for the overall sample. Last column gives the Mean (SD).

https://doi.org/10.1371/journal.pone.0326598.t001

Functional hidden Markov model

Based on centroids that appeared consistent and well-separated across 10 FHMMs fit using 80% random subsamples of the full data, we identified 4 functional latent states. Further details on FHMM initialisation based on stable centroids across subsampled models are provided in S1 and S2 Figs.

Frequency characteristics of functional latent states.

Fig 2 displays centroids for the 4 functional latent states, labelled in order of decreasing frequency of allocation. State 1 (red) is the most commonly visited state, being the most occupied state for n = 213 individuals, and is characterised by high relative power in the delta range (1.5-4 Hz), very low alpha power (7-14 Hz), and high beta-2 power (22-30 Hz). State 2 (orange) is the dominant state for n = 139 individuals, and has high relative theta power (4-7 Hz), a low frequency, moderate power alpha peak at 9 Hz, and low beta-1 power (14-22 Hz). State 3 (green) is the dominant state for n = 79 individuals, and has low relative theta power, a high frequency, moderate power alpha peak at 11 Hz, and high beta-1 power (14-22 Hz). State 4 (blue) is the dominant state for n = 72 individuals, and has low relative delta and theta power, very high alpha power with a moderate frequency alpha peak at 10 Hz, and low beta-2 power.

Fig 2. Centroids for 4 functional latent states from functional hidden Markov model.

https://doi.org/10.1371/journal.pone.0326598.g002

Latent state occupancy patterns.

Fig 3 presents a bar chart of the number of individuals for whom each state is dominant, and histograms of the number of states occupied and the number of transitions between states. State labels have been ordered by decreasing frequency of allocation. A majority of individuals visit only 1 (n = 234; 46.5%) or 2 (n = 205; 40.8%) latent states, and a majority (n = 479; 95.2%) make between 0 and 4 transitions between states over the recording period.

Fig 3. Bar charts of the number of individuals for whom each state is dominant, the number of states occupied and the number of transitions between states.

https://doi.org/10.1371/journal.pone.0326598.g003

Fig 4 presents a Venn diagram representing the combinations of states that individuals visit during resting state recordings, as captured in the sequence of estimated states generated by the Viterbi algorithm. The most common combinations of states visited were: State 1 only (25.6%); States 1 and 2 (21.5%); State 2 only (9.3%); States 3 and 4 (6.4%); State 3 only (6.0%); and State 4 only (5.6%).

Fig 4. Venn diagram of FHMM states visited by individuals.

Numbers in cells represent the number of individuals who visited that combination of functional latent states.

https://doi.org/10.1371/journal.pone.0326598.g004

Comparing psychopathology and cognitive function between dominant states.

Given the variation in the proportion of time which individuals spend in their most common state, it is of interest to contrast relevant health measures between individuals who spend a substantial majority of their time in each state, in order to understand the dominant patterns of variation in psychopathology and cognitive function between states. This descriptive analysis aims to characterise patterns in psychopathology and cognitive function between individuals who spent substantial time in each state, providing context for subsequent detailed analyses, rather than make inferential claims about statistical significance or broader generalisability of these differences. For Fig 5 and Table 2, data is included for individuals who spent 80% or more of their time in one state during resting state recordings (n = 353; 70.2%). Fig 5 presents bar charts indicating Z-scores (scaled and centered values) for 4 cognitive testing measures from the NIH Toolbox, and 4 measures of psychopathology. These plots are based on data presented in Table 2. Frequentist analyses of variance (ANOVAs) revealed no significant differences in sex or handedness between these groups, but there was a statistically significant difference (p = 0.001) in age detected between these groups. Full ANOVA results are available in S1 Table. Differences in age, sex and handedness between groups were accounted for by including these demographic factors as covariates in Bayesian regression models below.
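The Z-score comparison can be sketched as follows. The scores and group labels are hypothetical, and using the sample standard deviation across the whole sample for scaling is an assumption about how 'scaled and centered values' were obtained.

```python
import statistics

def z_scores(values):
    """Centre and scale values using the overall mean and sample SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # ASSUMPTION: sample (n - 1) standard deviation
    return [(v - mean) / sd for v in values]

def group_mean_z(values, groups):
    """Mean Z-score within each group, standardised across the whole sample."""
    z = z_scores(values)
    out = {}
    for g in sorted(set(groups)):
        members = [zi for zi, gi in zip(z, groups) if gi == g]
        out[g] = sum(members) / len(members)
    return out

# Hypothetical task scores and dominant states for six individuals
scores = [90.0, 100.0, 110.0, 95.0, 105.0, 100.0]
states = [1, 1, 3, 1, 3, 3]
mean_z = group_mean_z(scores, states)
```

A group mean below zero indicates scores below the overall sample average, which is how the bars in Fig 5 are read.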

Fig 5. Bar charts comparing Z-scores (scaled and centered values) for 4 cognitive function measures and 4 psychopathology measures, between individuals who spent 80% or more of their time in each state.

NIH = National Institutes of Health Toolbox Cognitive function tasks; NIH Card = Card Sorting task measuring executive function; NIH Flanker = Flanker task measuring executive function and attention; NIH List = List sorting task measuring working memory; NIH Pattern = Pattern comparison task measuring processing speed; MFQ SR = Mood and Feelings Questionnaire, Self Report; SCARED SR = Screen for Child Anxiety Related Disorders, Self Report; YSR Ext = Youth Self Report, Externalising Scale; YSR Int = Youth Self Report, Internalising Scale.

https://doi.org/10.1371/journal.pone.0326598.g005

Table 2. Demographics, cognitive function and psychopathology measures for individuals who spent 80% or more of their time in one state. Mean (SD).

https://doi.org/10.1371/journal.pone.0326598.t002

Fig 5 and Table 2 show that individuals who spent 80% or more of their time in State 1 (n = 167) had poorer cognitive function and higher psychopathology than the overall average, based on multiple measures. These included below average scores for the NIH Card sorting and Flanker tasks, indicating poorer executive function and attention. They also had above average scores for the MFQ SR, YSR Externalising and YSR Internalising scales, indicating higher depressive symptoms and psychopathology. This suggests that spending a majority of resting state time in State 1 may be a risk marker for elevated cognitive difficulties and psychological distress.

Individuals spending 80% or more of their time in State 2 (n = 90) had higher than average cognitive scores on the NIH Card and Flanker tasks, as well as the Pattern task which measures processing speed. However, they also had elevated scores for all four psychopathology measures, including anxiety symptoms measured with the SCARED SR scale. This suggests that extended time in State 2 may be a risk indicator for increased psychopathology, alongside higher cognitive function.

Individuals spending 80% or more of their time in State 3 (n = 52) exhibited better cognitive function and lower psychopathology, including higher than average scores on the NIH Card, Flanker and Pattern tasks, and low scores on all four psychopathology measures. This suggests that spending a majority of time in State 3 may be a marker of better mental health and cognition.

State 4 (n = 44) had the smallest number of individuals spending 80% or more of their time in it, and had a more mixed profile with 3 cognitive function scores above average and 1 below, as well as 3 psychopathology scores above average and 1 below. Given the smaller group size and mixed direction of these effects, the interpretation of health measures associated with this group is less clear.

Functional principal component analyses within latent states

Based on the latent states identified using the Viterbi estimated state sequence from the FHMM, we split the data into subsets of PSDs allocated to each latent state. For each subset, we ran a separate FPCA model in order to identify the dominant modes of functional variation in the frequency domain that differentiate individuals occupying matched latent states.

For the sake of space, in the body of the text below we present FPCA and regression results for functional latent states 1 and 3. Based on the results above, State 1 is the most commonly occupied, and individuals who spent 80% or more of their time in this state had higher psychopathology and lower cognitive function scores, indicating that this is a state of concern which warrants further detailed investigation. State 3 has a notable pattern of lower psychopathology and higher cognitive function scores relative to the sample average, indicating that time spent in this state may indicate lower risk of psychological distress, and improved cognitive health outcomes. Full results for states 2 and 4 are available in the Supplementary Materials.

State 1 – FPCA.

For State 1, we retained 5 functional principal components which cumulatively explained 72.6% of the functional variance for all observations allocated to this state, based on investigation of the scree plot (S3 Fig). Eigenfunctions for State 1 are plotted in Fig 6.

Fig 6. Functional principal components (eigenfunctions) for observations allocated to State 1.

https://doi.org/10.1371/journal.pone.0326598.g006

The first functional principal component (FPC; red) explained 22.6% of the variance, and represents higher power concentrated in the delta (1.5-4 Hz) and beta-2 (22-30 Hz) ranges, contrasted with alpha (7-14 Hz) power and to a lesser extent theta (4-7 Hz) and beta-1 (14-22 Hz). The second FPC (orange) explained 17.1% of the variance and represents high relative beta power and moderate power for high-frequency alpha (10-14 Hz), contrasted with theta and low-frequency alpha (7-10 Hz), indicating that individuals with higher scores on this functional principal component tend to have a higher frequency alpha peak, low theta power, and a lower frequency beta peak. The third FPC (green) explained 13.6% of the variance and represents higher theta, alpha and beta-2 power contrasted with beta-1 power, with higher scores indicating greater alpha activity and higher frequency beta peaks relative to other individuals in this state. The fourth FPC (light blue) explained 11.5% of the variance and represents higher power in the theta, low-frequency alpha and beta-1 ranges contrasted with delta, high-frequency alpha and beta-2 power, indicating lower frequency alpha and beta peaks and low delta power. The fifth FPC (light blue) explained 7.8% of the variance and represents higher delta, theta and high-frequency alpha power. Similar to the second FPC, higher scores on the fifth FPC indicated higher frequency alpha peaks, but with a high amplitude delta peak and lower beta power.

State 3 – FPCA.

For State 3, we retained 5 functional principal components which cumulatively explained 77.4% of the functional variance for all observations allocated to this state, based on investigation of the scree plot (S5 Fig). Eigenfunctions for State 3 are plotted in Fig 7.

Fig 7. Functional principal components (eigenfunctions) for observations allocated to State 3.

https://doi.org/10.1371/journal.pone.0326598.g007

The first FPC explained 25.2% of the variance, and represents high alpha and beta-1 power, contrasted with low delta and beta-2 power. The second FPC explained 24.3% of the variance and represents high frequency beta-1 power with a peak at 20 Hz, contrasted with low alpha power. The third FPC explained 13.9% of the variance and represents power at low-frequency alpha and beta-2, contrasted with theta, high-frequency alpha and beta-1 power. Among differences in other frequency ranges, scores on FPC3 indicate low frequency alpha peaks. The fourth FPC explained 10.9% of the variance and represents high power in the delta, very high frequency alpha and beta-2 ranges, contrasted with low power for theta, low frequency alpha and beta-1. Higher scores on FPC4 here indicate higher frequency alpha oscillations. The fifth FPC explained 6.7% of the variance and represents high power for theta, upper alpha and beta-2, contrasted with power for delta, low-frequency alpha and beta-1.

Bayesian regression models – Relating functional analysis characteristics and health measures

The Bayesian regression models revealed several substantial associations between functional analysis outputs and health measures. For State 1, FHMM dynamics showed the number of transitions was positively associated with NIH Flanker performance. FPCA weightings in State 1 showed negative associations between FPC3 and NIH Pattern performance, and between FPC4 and performance on NIH Card, Flanker and List tasks. For State 3, FHMM dynamics showed the number of states was positively associated with YSR Externalising, while number of transitions was negatively associated with YSR Internalising. Time spent in State 3 was negatively associated with SCARED SR and positively associated with NIH Pattern performance. FPCA weightings in State 3 showed positive associations between FPC3 and both YSR scales (Externalising and Internalising), and between FPC4 and both SCARED SR and NIH Flanker performance. We encourage interested readers to compare these Bayesian regression results across Tables 3 and 4.

Table 3. Regression coefficients and 95% credible intervals from Bayesian regression model for State 1. Entries in bold indicate coefficients with 95% CI excluding zero.

https://doi.org/10.1371/journal.pone.0326598.t003

Table 4. Regression coefficients and 95% credible intervals from Bayesian regression model for State 3. Entries in bold indicate coefficients with 95% CI excluding zero.

https://doi.org/10.1371/journal.pone.0326598.t004

State 1 – Bayesian regression model.

Table 3 presents regression coefficients and 95% credible intervals for the Bayesian regression model for State 1. This model revealed multiple substantial associations between functional analysis outputs from resting state brain activity and measures of psychopathology and cognitive function, while controlling for the effect of age, sex and handedness as covariates. Full regression results tables for all four states, including intercepts and regression coefficients for control covariates, are provided in the Supplementary Materials.

Among individuals for whom State 1 was dominant, the number of transitions between latent states was positively associated with performance on the NIH Flanker task measuring executive function and attention, with each additional transition associated with a score that was on average 6 points higher on this task (95% CI [2.0, 9.9]). Scores on FPC3, indicating higher theta, alpha and beta-2 power contrasted with beta-1 power, were associated with poorer processing speed as measured by the NIH Pattern task (β = -59 [-113, -4.6]). Weaker associations (with credible intervals spanning zero) were also found between FPC3 scores and the NIH Card (β = -32 [-76, 11]) and List (β = -33 [-71, 5.9]) tasks, measuring executive function and working memory. Scores on FPC4, indicating lower frequency alpha and beta peaks and low delta power, had substantial negative associations with all four cognitive function measures. Taken together these associations with FPC3 and FPC4 suggest that among individuals for whom State 1 is dominant, higher alpha and beta-2 power and lower delta power are broad features revealed by FPCA that are associated with poorer cognitive function.

Several weaker effects were also identified for higher anxiety levels, measured by the SCARED SR scale, being associated with number of states visited (β = 8 [-3.3, 19]), percentage of time spent in dominant State 1 (β = 35 [-5.1, 75]), and scores on FPC2 (β = 16 [-1.7, 34]) indicating high power in the beta and high-frequency alpha ranges contrasted with theta power.
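The distinction drawn here between substantial and weaker associations follows the rule used for bold entries in Tables 3 and 4, namely whether the 95% credible interval excludes zero. A minimal sketch of that rule, applied to a handful of State 1 estimates quoted in the text, is given below; the labels are shorthand introduced for this example.

```python
def excludes_zero(lower, upper):
    """True if a credible interval lies entirely above or below zero."""
    return lower > 0 or upper < 0

# (label, estimate, lower, upper) for State 1, as quoted in the text
state1 = [
    ("Transitions / NIH Flanker", 6.0, 2.0, 9.9),
    ("FPC3 / NIH Pattern", -59.0, -113.0, -4.6),
    ("FPC3 / NIH Card", -32.0, -76.0, 11.0),
    ("FPC3 / NIH List", -33.0, -71.0, 5.9),
    ("States visited / SCARED SR", 8.0, -3.3, 19.0),
]
flagged = [label for label, est, lo, hi in state1 if excludes_zero(lo, hi)]
```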

State 3 – Bayesian regression model.

Table 4 presents regression coefficients and 95% credible intervals for the Bayesian regression model for State 3. In terms of temporal dynamics of latent state occupancy, the number of states visited was associated with higher externalising psychopathology on the YSR Externalising dimension (β = 7.3 [0.11, 15]). Among individuals for whom State 3 was dominant, the number of transitions between states was associated with lower scores on YSR Internalising (β = -4.5 [-7.7, -1.3]), YSR Externalising (β = -2.5 [-5.5, 0.45]), anxiety symptoms on SCARED SR (β = -3.9 [-8.1, 0.34]), and depressive symptoms on MFQ SR (β = -2.8 [-5.8, 0.27]). The proportion of time spent in State 3 was associated with lower scores on SCARED SR (β = -45 [-89, -1.1]), YSR Internalising (β = -33 [-66, 0.03]), and higher scores on the NIH Pattern task (β = 101 [15, 185]). Beyond the broad effect that individuals spending a majority of their time in State 3 had better cognitive function and lower psychopathology, an additional effect appears to be associated with moving frequently between State 3 and other states, as the number of transitions was associated with lower scores on all psychopathology measures.

Considering functional characteristics in the frequency domain, substantial associations with health measures were identified for scores on FPC3 and FPC4 in State 3. Scores on FPC3 indicate high power for low-frequency alpha and beta-2 contrasted with low theta and beta-1 power. Scores on FPC3 were associated with higher scores on YSR Internalising (β = 24 [9.5, 38]), YSR Externalising (β = 17 [3.9, 31]), and depressive symptoms on MFQ SR (β = 13 [-0.76, 27]). Scores on FPC4 represent high power for high-frequency alpha and beta-2 contrasted with low delta power. Scores on FPC4 were associated with higher scores on SCARED SR (β = 25 [0.22, 50]), YSR Externalising (β = 17 [-1.1, 35]), and MFQ SR (β = 16 [-1.3, 34]), as well as higher scores on the NIH Flanker task (β = 24 [1.8, 48]). Taken together, these results indicate that among individuals for whom State 3 is dominant, higher alpha power, lower theta power, and higher frequency beta oscillations may be indicative of greater psychopathology.

Discussion

In this paper we have introduced flawless analysis, a novel functional analysis framework with a nested model structure that incorporates functional latent states and temporal dynamics using a FHMM, and frequency characteristics stratified by those states using functional principal component analysis. Applying flawless analysis to time series of power spectral densities calculated from EEG data, we have made novel discoveries that build on previous work regarding data-driven phenotypes of resting state brain activity in young people.

In the case of resting state brain activity measured using EEG data, modelling each PSD as a curve or functional data observation offers deeper insights compared to extracting features of relative power in specific canonical frequency bands. It remains common practice in EEG research to characterise brain activity across individuals in terms of relative power contained within these legacy frequency bands [14, 15]. However, the use of fixed EEG power bands to characterise brain activity across individuals has received substantial criticism [16, 65]. Traditional frequency bands for EEG (e.g. delta, theta, alpha, beta) do not correspond to consistent functional groupings of brain activity across individuals, especially in the age range of early adolescence when the frequency content of brain activity is known to be rapidly evolving [66]. Making inferences about oscillatory activity based on relative power within fixed canonical bands is liable to conflate true differences in oscillatory power with a number of other physiological processes including shifts in oscillation centre frequency within or between individuals [64], or changes in the aperiodic exponent of the frequency distribution [16]. In contrast to the traditional approach, a functional analysis approach based on periodic spectral features extracted from EEG data allows us to cater for inter-individual differences in oscillatory peak frequencies, make fewer assumptions about frequency content falling within traditional power bands, and capture more nuance and novel characteristics of interest in the frequency domain.
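To illustrate the point about shifts in centre frequency, the following minimal sketch uses a toy spectrum: a single Gaussian peak on a flat floor. This is an assumption made purely for illustration and is not the periodic spectra estimated in this study. The function names, the 8 to 13 Hz alpha band edges, and the peak parameters are all hypothetical choices. Two spectra with identical peak height but different centre frequencies yield different relative alpha power, even though the oscillation is equally strong in both.

```python
import math

def gaussian_peak_psd(freqs, centre, height=1.0, width=1.0, floor=0.1):
    """Toy spectrum (illustrative assumption): one Gaussian peak on a flat floor."""
    return [floor + height * math.exp(-0.5 * ((f - centre) / width) ** 2)
            for f in freqs]

def relative_band_power(freqs, psd, lo, hi):
    """Fraction of total power falling in the fixed band [lo, hi) on a uniform grid."""
    total = sum(psd)
    band = sum(p for f, p in zip(freqs, psd) if lo <= f < hi)
    return band / total

# Uniform grid over the 1.5 to 30 Hz range considered in the paper
freqs = [1.5 + 0.25 * i for i in range(115)]

# Same peak height and width; only the centre frequency differs
psd_a = gaussian_peak_psd(freqs, centre=10.0)
psd_b = gaussian_peak_psd(freqs, centre=7.5)

# Hypothetical canonical alpha band edges (8 to 13 Hz)
rel_a = relative_band_power(freqs, psd_a, 8.0, 13.0)
rel_b = relative_band_power(freqs, psd_b, 8.0, 13.0)

print(f"relative alpha power, peak at 10.0 Hz: {rel_a:.3f}")
print(f"relative alpha power, peak at  7.5 Hz: {rel_b:.3f}")
```

Under fixed bands, the second spectrum would be read as having weaker alpha activity, whereas treating the whole curve as a functional observation keeps the peak location and the peak height as separate pieces of information.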

Notable findings include identifying patterns of difference in psychopathology and cognitive function between groups of individuals who spent 80% or more of their time in each latent state. From the perspective of applied neuroscience and mental health research, we found similar patterns to, and extended, earlier findings regarding data-driven phenotypes of resting state brain activity in young people [67]. We have identified four functional latent states which young people in this sample occupied during eyes closed, resting state brain activity. At a high level, these states were associated with substantial variation in psychopathology and cognitive function between individuals for whom each latent state was dominant. Notably, individuals mainly allocated to State 1 had elevated risk for psychopathology and poorer cognitive function, and broadly this state was characterised by having high delta power, very low alpha power, and high relative beta-2 power. Individuals mainly allocated to State 3 exhibited a profile of lower psychopathology and better cognitive function, and this state broadly featured low theta power, a high frequency alpha peak, and moderate beta-1 power with a beta peak around 21 Hz.

Using FPCA stratified by latent state, and subsequent Bayesian regression models, we have also identified associations between temporal dynamics of latent state occupancy, frequency characteristics within states, and a variety of important health outcomes. Findings that stand out from these analyses include that scores on FPC3 and FPC4 in State 1, with prominent features including power in the theta, low-frequency alpha and beta-2 ranges, were associated with poorer cognitive function relative to other individuals for whom State 1 was dominant (Table 2). For State 3, scores on FPC3 and FPC4 were associated with higher scores on several psychopathology measures, indicating that lower theta power, higher alpha power, and higher frequency beta peaks may be indicative of greater psychopathology among individuals for whom State 3 was dominant.

Our regression models also revealed a number of contrasting effects of state occupancy dynamics between State 1 and State 3. In State 1, number of transitions between states was associated with improved executive function, and weakly associated with improved working memory. In State 3, number of transitions was associated with reduced psychopathology across all four measures. In State 1, proportion of time spent in dominant state was weakly associated with higher psychopathology and poorer cognitive function across several measures. In State 3, proportion of time spent in the dominant state was associated with lower anxiety symptoms and internalising psychopathology, and better processing speed. These results are preliminary, and the mechanisms behind the observed differences in how health measures relate to the temporal dynamics of latent state occupancy are currently unclear. Nonetheless, the contrasting effects between states illustrate the detailed characteristics from functional analyses that differentiate individuals within and between dominant latent states.
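The occupancy measures discussed above can be computed directly from a decoded state sequence. The sketch below assumes each individual's sequence is a list of integer state labels (for example, from Viterbi decoding). The function name, the returned field names, and the example sequence are hypothetical, and the exact definitions used for the regression models may differ from this illustration.

```python
from collections import Counter

def occupancy_summary(states, dominance_threshold=0.8):
    """Summarise one individual's decoded latent state sequence.

    `states` is assumed to be a non-empty list of state labels, one per
    time window. The 0.8 default mirrors the "80% or more" grouping
    described in the text; all field names are illustrative.
    """
    if not states:
        raise ValueError("state sequence must not be empty")
    counts = Counter(states)
    dominant, n_dominant = counts.most_common(1)[0]
    prop_dominant = n_dominant / len(states)
    # A transition is any pair of consecutive windows with different labels
    n_transitions = sum(1 for a, b in zip(states, states[1:]) if a != b)
    return {
        "n_states_visited": len(counts),
        "n_transitions": n_transitions,
        "dominant_state": dominant,
        "prop_time_dominant": prop_dominant,
        "meets_threshold": prop_dominant >= dominance_threshold,
    }

# Hypothetical sequence: mostly State 3 with brief visits to States 1 and 2
example_sequence = [3, 3, 3, 1, 3, 3, 3, 3, 2, 3]
summary = occupancy_summary(example_sequence)
print(summary)
```

Keeping the number of transitions and the proportion of time in the dominant state as separate quantities matters here, because the regression results suggest they can relate to health measures in different ways depending on the dominant state.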

Our findings extend previous research on EEG characteristics in neurodevelopmental conditions and psychopathology. For instance, the high delta and low alpha power observed in our State 1, associated with poorer cognitive function and high psychopathology, aligns with previous studies linking higher delta-alpha ratios to increased incidence of social anxiety in adolescents [75] and autism spectrum disorder in children [72], as well as poorer cognitive function following brain injury or stroke [73, 74]. However, our functional approach reveals more detailed patterns of relationships between frequency characteristics and cognitive measures. In particular, the association we found between scores on FPC4 in State 1 and poorer cognitive function suggests that the interplay between theta, low-frequency alpha and beta power, indicating the presence of lower peak frequencies for alpha and beta activity, may be more informative than examining these bands in isolation, as is common in traditional EEG analyses. This highlights the potential of our functional approach to uncover more complex spectral signatures of cognitive function and psychopathology, going beyond simple ratios to consider the entire spectral profile and its dynamics over time.

Overall these results demonstrate unique insights that are available through the flawless analysis framework relative to other analytical approaches. They may be indicative of interaction effects of state occupancy dynamics which vary according to dominant state, and suggest the presence of further subgroups or phenotypes which could be identified on the basis of latent state dynamics and frequency characteristics within states. The centroids of functional latent states identified here bear a close resemblance to the average power spectral densities identified across 5 clusters in our previous work [67]. Extension of this work could pursue formal replication and comparison of flawless with the previous methodology used to identify resting state EEG phenotypes. These findings warrant further investigation, for which a possible approach could be clustering individuals based on flawless outputs to develop more detailed data-driven phenotypes of resting state brain activity.

Functional data analysis (FDA) methods can offer different analytical perspectives compared to multivariate parametric approaches with manually extracted features, particularly in their ability to consider the complete functional form of the data [3]. FDA is a recent area of growth in statistical research, and our work contributes a novel approach for nesting multiple FDA tools together to analyse temporal, latent state and frequency characteristics in time-varying functional data. By integrating these approaches in a nested structure, we are able to generate richer insights into functional and time-frequency characteristics of time-varying functional data than those available through using the individual methods in isolation. At the first stage, FHMM analysis provides insight into latent state characteristics and temporal dynamics of latent state occupancy. This stage also importantly enables the analysis to compare ‘like with like’: by stratifying resting state brain activity data by allocation to latent states, we can apply FPCA to understand the functional harmonics in the frequency domain that distinguish individuals while occupying matched latent states. Without this stratification, applying FPCA alone to time series of functional data over individuals would conflate variation across multiple distinct latent states and disregard temporal dynamics.
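The value of comparing ‘like with like’ can be sketched with a deliberately simplified example. The code below is not the FPCA procedure used in this study: it assumes curves are already discretised on a shared grid, skips basis smoothing, and extracts only the leading principal direction by power iteration. All function names and the toy curves are hypothetical. Two toy states differ strongly in their mean curve, and each has its own, much smaller, mode of within-state variation.

```python
def center(curves):
    """Subtract the pointwise mean curve from every curve."""
    n, m = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(m)]
    return [[c[j] - mean[j] for j in range(m)] for c in curves]

def leading_component(curves, n_iter=200):
    """Leading principal direction of discretised curves via power iteration."""
    x = center(curves)
    m = len(x[0])
    v = [float(j + 1) for j in range(m)]  # arbitrary non-zero starting vector
    for _ in range(n_iter):
        # Apply the (unnormalised) covariance operator: w = X^T (X v)
        scores = [sum(row[j] * v[j] for j in range(m)) for row in x]
        w = [sum(s * row[j] for s, row in zip(scores, x)) for j in range(m)]
        norm = sum(wj * wj for wj in w) ** 0.5
        if norm == 0.0:
            raise ValueError("curves show no variation around their mean")
        v = [wj / norm for wj in w]
    return v

def shifted(base, mode, score):
    """Toy curve: a state mean plus a scaled mode of variation."""
    return [b + score * d for b, d in zip(base, mode)]

scores = (-1.0, -0.5, 0.5, 1.0)
# Toy State 1: mean curve at zero, varies along the first two grid points
state1 = [shifted([0, 0, 0, 0, 0], [1, -1, 0, 0, 0], s) for s in scores]
# Toy State 2: mean curve raised at the last two grid points, different mode
state2 = [shifted([0, 0, 0, 5, 5], [0, 1, -1, 0, 0], s) for s in scores]

pooled_pc = leading_component(state1 + state2)  # ignores state allocation
state1_pc = leading_component(state1)           # stratified by state
state2_pc = leading_component(state2)

print("pooled :", [round(abs(p), 3) for p in pooled_pc])
print("state 1:", [round(abs(p), 3) for p in state1_pc])
print("state 2:", [round(abs(p), 3) for p in state2_pc])
```

In this toy setting the pooled component loads on the grid points where the two state means differ, so it mainly re-describes the difference between states. The stratified components instead recover each state's own mode of variation, which is the kind of within-state structure the nested design is intended to expose.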

Our nested approach differs from direct FTS dimension reduction methods like FSSA [18] in its analytical goals and structure. While FSSA provides an elegant mathematical framework for decomposing functional time series using trajectory operators and functional singular value decomposition, flawless prioritises interpretability through its nested structure of state-based decomposition followed by within-state functional analysis. This approach allows us to first identify distinct functional states and their temporal dynamics, then examine detailed functional characteristics within matched states. The separation of these analytical levels helps preserve interpretability for clinical applications, while still capturing complex patterns in both the temporal and frequency domains. While direct FTS methods may be more mathematically elegant or computationally efficient, the nested structure of flawless offers unique benefits for applied researchers seeking to understand both broad state-based patterns and nuanced functional variation within states.

Other functional analysis methods that have been developed to simultaneously model temporal and frequency domains together for neuroscientific data (e.g. [5, 12, 13]) tend to do so at the cost of substantially increased model complexity. These approaches simultaneously analyse neuroscientific data in multiple functional domains including spatial, temporal and frequency, and are able to capture complex inter-related patterns among these combined functional domains. However, the results of these models tend to be challenging to interpret in terms of influential characteristics in each of the domains of interest, which makes it harder for applied researchers or practitioners to integrate novel findings of functional analysis with their existing understanding of the applied subject matter. By combining functional methods that analyse these individual domains in a nested structure, our method enables analysis of inter-related patterns of features between the levels of latent state characteristics, temporal patterns of state occupancy, and functional frequency characteristics within states, while also generating specific insights at each of these levels of analysis. As we have shown, outputs at these levels can be combined as inputs for multivariate statistical tools to understand their shared relationships to external outcomes, which in our application were psychopathology and cognitive function in young people.

The present study had several limitations. It used a restricted frequency range of 1.5 - 30 Hz for the sake of interpretability, and for computational performance of the FOOOF algorithm. This approach may have excluded potentially relevant features of resting state EEG phenotypes that exist outside of this frequency range. While our focus on the Cz electrode provided valuable insights, we acknowledge that this single-electrode approach limits our ability to capture spatial patterns of brain activity. Related to these limitations, future research could also expand the frequency range under investigation to incorporate low delta (0 - 1.5 Hz) and gamma (30+ Hz) ranges, which may reveal additional features and provide a more comprehensive understanding of resting state EEG phenotypes. Another avenue for future extension could include analysis of data from multiple EEG channels, capturing activity from different brain regions such as prefrontal and frontal activity, which may be more directly related to some of the cognitive and psychopathological measures we studied. This approach could enable the investigation of spatial patterns and functional connectivity in resting state EEG phenotypes, providing a more holistic understanding of brain function and the potential relationships with psychopathology and cognitive function.

Given the focus on FHMM and FPCA in this paper, extensions to the flawless framework could investigate sensitivity analysis and comparison with other functional analysis methods for studying latent state dynamics and functional data reduction. For this work we chose to use FPCA, as it is a widely used and well-established method for functional data reduction [2, 3], making it an appropriate choice for this initial effort to combine functional latent state and data reduction methods in a nested framework. However, other approaches to functional dimensionality reduction are available, including functional factor analysis and other variations on the orthogonal rotation between components for FPCA [68, 69]. For functional HMMs, to our knowledge there are no existing implementations of alternative latent state analyses such as Hidden Semi-Markov Models or Conditionally Autoregressive HMMs, which can account for more complex structures of temporal autocorrelation in time series data [70, 71]. Future work could pursue implementation of alternative latent state analysis methods for use with functional data, and compare performance in the flawless framework with the FHMM approach we have used.

While our current approach for initialising the FHMM algorithm relies on visual identification and averaging of stable centroids, we acknowledge that an algorithmic method for distinguishing centroids could enhance the robustness of our approach. The implications of averaging centroids for initialisation include potential loss of some fine-grained distinctions between states, but in this case it helps to identify stable, consistent patterns across subsamples and improve the stability of algorithm performance. Future work could explore algorithmic methods for initial centroid identification, including a functional implementation of k-means++ [55], to further improve the reliability and reproducibility of our results.
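As a rough indication of what such an algorithmic initialisation might look like, the sketch below applies k-means++ seeding [55] to curves sampled on a shared uniform grid, using an approximate squared L2 distance between curves. This is an assumption-laden illustration, not an implementation used in this study: the function names, the `dx` grid spacing parameter, the fixed random seed, and the toy curves are all hypothetical.

```python
import random

def l2_sq(f, g, dx=1.0):
    """Approximate squared L2 distance between two curves on a shared uniform grid."""
    return dx * sum((a - b) ** 2 for a, b in zip(f, g))

def kmeanspp_seeds(curves, k, dx=1.0, seed=0):
    """Return indices of k initial centroid curves chosen by k-means++ seeding."""
    if not 1 <= k <= len(curves):
        raise ValueError("k must be between 1 and the number of curves")
    rng = random.Random(seed)
    chosen = [rng.randrange(len(curves))]  # first seed: uniform at random
    # d2[i] holds the squared distance from curve i to its nearest chosen seed
    d2 = [l2_sq(c, curves[chosen[0]], dx) for c in curves]
    while len(chosen) < k:
        if sum(d2) == 0.0:
            # Every curve coincides with a seed already: fall back to uniform sampling
            idx = rng.randrange(len(curves))
        else:
            # Sample the next seed with probability proportional to d2
            idx = rng.choices(range(len(curves)), weights=d2, k=1)[0]
        chosen.append(idx)
        d2 = [min(old, l2_sq(c, curves[idx], dx)) for old, c in zip(d2, curves)]
    return chosen

# Toy data: three groups of identical flat curves at different power levels
curves = [[level] * 4 for level in (0.0, 5.0, 10.0) for _ in range(3)]
seeds = kmeanspp_seeds(curves, k=3, dx=0.25, seed=1)
print("seed indices:", seeds)
print("seed levels :", sorted(curves[i][0] for i in seeds))
```

Because each new seed is drawn with probability proportional to its squared distance from the existing seeds, curves far from any current centroid are favoured. In the toy data, curves within a group are identical, so the three seeds necessarily fall in three different groups; with real, noisy curves this separation would be likely but not guaranteed.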

The applied findings presented in this paper provide a foundation for developing sophisticated data-driven phenotypes based on resting state EEG data. Using the outputs of flawless analysis, future work could explore clustering analyses to develop more detailed phenotypes, capturing not only broad differences in the characteristics of latent states, but also the temporal dynamics of individual trajectories among states, and frequency characteristics within states. This approach, which embraces the complexity and richness of functional data from resting state EEG, has the potential to improve our understanding of brain activity patterns and to support a more comprehensive characterisation of individual differences in EEG phenotypes and their associations with psychopathology and cognitive function.

The flawless analysis framework, initially developed in this context of modelling resting state EEG data, has the potential for application across a wide range of fields that involve functional data. In disciplines such as ecology and biomedical research, functional data often captures complex temporal and spatial patterns that are difficult to analyse with traditional statistical methods. By adapting flawless analysis to accommodate the specific requirements of different domains, researchers can leverage its capabilities to explore previously unidentified relationships, identify novel patterns, and ultimately contribute to the advancement of knowledge in their respective areas of study.

Appendix – Notation reference table

Supporting information

**S1 File. Supplementary Materials**

Supplementary materials for manuscript: “Functional analysis within latent states: A novel framework for analysing functional time series data”

https://doi.org/10.1371/journal.pone.0326598.s001

(PDF)

Acknowledgments

We extend our heartfelt gratitude to the participants and their caregivers in the Healthy Brain Network study. We are grateful to the Child Mind Institute and the leaders of the Healthy Brain Network for their pioneering research and generous provision of such a valuable open-source dataset to the community.

References

  1. Margaritella N, Inácio V, King R. Parameter clustering in Bayesian functional principal component analysis of neuroscientific data. Stat Med. 2021;40(1):167–84. pmid:33040367
  2. Wang JL, Chiou JM, Müller HG. Functional data analysis. Annu Rev Stat Appl. 2016;3:257–95.
  3. Shang HL. A survey of functional principal component analysis. AStA Adv Stat Anal. 2014;98:121–42.
  4. Lin X, Li R, Yan F, Lu T, Huang X. Quantile residual lifetime regression with functional principal component analysis of longitudinal data for dynamic prediction. Stat Methods Med Res. 2019;28(4):1216–29. pmid:29402190
  5. Li K, Luo S. Dynamic predictions in Bayesian functional joint models for longitudinal and time-to-event data: An application to Alzheimer’s disease. Stat Methods Med Res. 2019;28(2):327–42. pmid:28750578
  6. Wang S, Nie Y, Sutherland JM, Wang L. Pattern discovery of health curves using an ordered probit model with Bayesian smoothing and functional principal component analysis. Stat Methods Med Res. 2021;30(2):458–72. pmid:32976070
  7. Ramsay J, Hooker G, Graves S. Introduction to functional data analysis. In Functional data analysis with R and MATLAB. Springer; 2009. p. 1–19.
  8. Sidrow E, Heckman N, Fortune SM. Modelling multi-scale, state-switching functional data with hidden Markov models. Can J Stat. 2022;50(1):327–56.
  9. Warmenhoven J, Cobley S, Draper C, Harrison A, Bargary N, Smith R. Considerations for the use of functional principal components analysis in sports biomechanics: Examples from on-water rowing. Sports Biomech. 2019;18(3):317–41. pmid:29141500
  10. Wu PP-Y, Sterkenburg N, Everett K, Chapman DW, White N, Mengersen K. Predicting fatigue using countermovement jump force-time signatures: PCA can distinguish neuromuscular versus metabolic fatigue. PLoS One. 2019;14(7):e0219295. pmid:31291303
  11. Xie S, Lawniczak AT. Feature extraction of epileptic EEG in spectral domain via functional data analysis. In ICPRAM; 2019. p. 118–27.
  12. Hasenstab K, Scheffler A, Telesca D, Sugar CA, Jeste S, DiStefano C, et al. A multi-dimensional functional principal components analysis of EEG data. Biometrics. 2017;73(3):999–1009. pmid:28072468
  13. Scheffler A, Telesca D, Li Q, Sugar CA, Distefano C, Jeste S, et al. Hybrid principal components analysis for region-referenced longitudinal functional EEG data. Biostatistics. 2020;21(1):139–57. pmid:30084925
  14. Anijärv T, Can A, Gallay C. Spectral changes of EEG following a 6-week low-dose oral ketamine treatment in adults with major depressive disorder and chronic suicidality. Int J Neuropsychopharmacol. 2023;pyad006.
  15. Newson JJ, Thiagarajan TC. EEG Frequency Bands in Psychiatric Disorders: A Review of Resting State Studies. Front Hum Neurosci. 2019;12:521. pmid:30687041
  16. Donoghue T, Haller M, Peterson EJ, Varma P, Sebastian P, Gao R, et al. Parameterizing neural power spectra into periodic and aperiodic components. Nat Neurosci. 2020;23(12):1655–65. pmid:33230329
  17. Bellman R. Curse of dimensionality. Adaptive control processes: A guided tour. Princeton, NJ: Princeton University Press, vol. 3(2); 1961.
  18. Haghbin H, Najibi SM, Mahmoudvand R, Trinka J, Maadooliat M. Functional singular spectrum analysis. Statistics. 2021;10(1):e330.
  19. Haghbin H, Maadooliat M. A journey from univariate to multivariate functional time series: A comprehensive review. Wiley Interdiscipl Rev: Comput Stat. 2024;16(1):e1640.
  20. Kember J, Stepien L, Panda E, Tekok-Kilic A. Resting-state EEG dynamics help explain differences in response control in ADHD: Insight into electrophysiological mechanisms and sex differences. PLoS One. 2023;18(10):e0277382. pmid:37796795
  21. Shappell HM, Duffy KA, Rosch KS, Pekar JJ, Mostofsky SH, Lindquist MA, et al. Children with attention-deficit/hyperactivity disorder spend more time in hyperconnected network states and less time in segregated network states as revealed by dynamic connectivity analysis. Neuroimage. 2021;229:117753. pmid:33454408
  22. Martino A, Guatteri G, Paganoni AM. Hidden Markov models for multivariate functional data. Stat Prob Lett. 2020;167:108917.
  23. Rabiner L, Juang B. An introduction to hidden Markov models. IEEE ASSP Mag. 1986;3(1):4–16.
  24. Wieczorek T, Wieckiewicz M, Smardz J, Wojakowska A, Michalek-Zrabkowska M, Mazur G, et al. Sleep structure in sleep bruxism: A polysomnographic study including bruxism activity phenotypes across sleep stages. J Sleep Res. 2020;29(6):e13028. pmid:32160378
  25. Dijkstra F, Viaene M, De Volder I, Fransen E, Cras P, Crosiers D. Polysomnographic phenotype of isolated REM sleep without atonia. Clin Neurophysiol. 2020;131(10):2508–15. pmid:32773210
  26. Shamshoian J, Şentürk D, Jeste S, Telesca D. Bayesian analysis of longitudinal and multidimensional functional data. Biostatistics. 2022;23(2):558–73. pmid:33017019
  27. Alexander LM, Escalera J, Ai L, Andreotti C, Febre K, Mangone A, et al. An open resource for transdiagnostic research in pediatric mental health and learning disorders. Sci Data. 2017;4:170181. pmid:29257126
  28. Ramsay J, Hooker G, Graves S. Smoothing: computing curves from noisy data. In Functional data analysis with R and MATLAB. Springer; 2009. p. 59–82.
  29. Lou HL. Implementing the Viterbi algorithm. IEEE Signal Process Mag. 1995;12(5):42–52.
  30. Shinkawa H, Takahashi M, Adachi M. Psychometric validation of the Japanese version of the child and adolescent social support scale (CASSS) in early adolescents. Jpn Psychol Res. 2021.
  31. Li Y, Kwok SYCL. A longitudinal network analysis of the interactions of risk and protective factors for suicidal potential in early adolescents. J Youth Adolesc. 2023;52(2):306–18. pmid:36334177
  32. Latham MD, Dudgeon P, Yap MBH, Simmons JG, Byrne ML, Schwartz OS, et al. Factor structure of the early adolescent temperament questionnaire-revised. Assessment. 2020;27(7):1547–61. pmid:30788984
  33. Karamacoska D, Barry RJ, Steiner GZ, Coleman EP, Wilson EJ. Intrinsic EEG and task-related changes in EEG affect Go/NoGo task performance. Int J Psychophysiol. 2018;125:17–28. pmid:29409782
  34. Rogala J, Kublik E, Krauz R, Wróbel A. Resting-state EEG activity predicts frontoparietal network reconfiguration and improved attentional performance. Sci Rep. 2020;10(1):5064. pmid:32193502
  35. Angold A, Costello EJ, Messer SC. Development of a short questionnaire for use in epidemiological studies of depression in children and adolescents. Int J Methods Psychiatr Res. 1995.
  36. Behrens B, Swetlitz C, Pine DS, Pagliaccio D. The screen for child anxiety related emotional disorders (SCARED): Informant discrepancy, measurement invariance, and test-retest reliability. Child Psychiatry Hum Dev. 2019;50(3):473–82. pmid:30460424
  37. Achenbach TM. Manual for the youth self-report and 1991 profile. University of Vermont Department of Psychiatry; 1991.
  38. Petot D, Petot JM, Chahed M. Is the youth self-report total score a reliable measure of both a general factor of psychopathology and Achenbach’s eight syndromes? A cross-cultural study. J Psychopathol Behav Assess. 2023;45(1):58–74.
  39. Weintraub S, Dikmen SS, Heaton RK, Tulsky DS, Zelazo PD, Bauer PJ, et al. Cognition assessment using the NIH toolbox. Neurology. 2013;80(11 Suppl 3):S54–S64. pmid:23479546
  40. Langer N, Ho EJ, Alexander LM, Xu HY, Jozanovic RK, Henin S, et al. A resource for assessing information processing in the developing brain using EEG and eye tracking. Sci Data. 2017;4:170040. pmid:28398357
  41. Walden AT, Percival DB, McCoy EJ. Spectrum estimation by wavelet thresholding of multitaper estimators. IEEE Trans Signal Process. 1998;46(12):3153–65.
  42. Babadi B, Brown EN. A review of multitaper spectral analysis. IEEE Trans Biomed Eng. 2014;61(5):1555–64. pmid:24759284
  43. Prerau MJ, Brown RE, Bianchi MT. Sleep neurophysiological dynamics through the lens of multitaper spectral analysis. Physiology. 2017;32(1):60–92.
  44. Delorme A, Sejnowski T, Makeig S. Enhanced detection of artifacts in EEG data using higher-order statistics and independent component analysis. Neuroimage. 2007;34(4):1443–9. pmid:17188898
  45. Thomson DJ. Spectrum estimation and harmonic analysis. Proc IEEE. 1982;70(9):1055–96.
  46. Bokil H, Andrews P, Kulkarni JE, Mehta S, Mitra PP. Chronux: A platform for analyzing neural signals. J Neurosci Methods. 2010;192(1):146–51. pmid:20637804
  47. Cohen MX. Analyzing neural time series data: Theory and practice. MIT Press; 2014.
  48. Cappé O, Moulines E, Rydén T. Inference in hidden Markov models. In: Proceedings of EUSFLAT conference; 2009. p. 14–6.
  49. Baum LE, Petrie T, Soules G. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Ann Math Stat. 1970;41(1):164–71.
  50. Welch LR. Hidden Markov models and the Baum-Welch algorithm. IEEE Inform Theory Soc Newslett. 2003;53(4):10–3.
  51. Viterbi A. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans Inform Theory. 1967;13(2):260–9.
  52. Martino A, Guatteri G, Paganoni AM. Hmmhdd: Hidden Markov models for high dimensional data. R package version 1.0; 2022. http://github.com/martinoandrea92/hmmhdd
  53. Tarpey T, Kinateder KK. Clustering functional data. J Class. 2003;20(1).
  54. Martino A, Ghiglietti A, Ieva F. A k-means procedure based on a Mahalanobis type distance for clustering multivariate functional data. Stat Methods Appl. 2019;28:301–22.
  55. Arthur D, Vassilvitskii S. K-means: The advantages of careful seeding. In: Proceedings of the eighteenth annual ACM-SIAM symposium on discrete algorithms; 2007. p. 1027–35.
  56. Yu B. Stability. Bernoulli. 2013;19(4).
  57. Joseph VR. Optimal ratio for data splitting. Stat Anal Data Mining: ASA Data Sci J. 2022;15(4):531–8.
  58. Ramsay JO, Graves S, Hooker G. Functional data analysis. R package version 6.0.5; 2022. https://CRAN.R-project.org/package=fda
  59. Bürkner P-C. Advanced Bayesian multilevel modeling with the R package brms. R J. 2018;10(1):395.
  60. Bürkner P-C. Bayesian item response modeling in R with brms and Stan. J Stat Soft. 2021;100(5).
  61. Oldfield RC. The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia. 1971;9(1):97–113. pmid:5146491
  62. Freschl J, Azizi LA, Balboa L, Kaldy Z, Blaser E. The development of peak alpha frequency from infancy to adolescence and its role in visual temporal processing: A meta-analysis. Dev Cogn Neurosci. 2022;57:101146. pmid:35973361
  63. Villafaina S, Collado-Mateo D, Fuentes-García JP, Cano-Plasencia R, Gusi N. Impact of fibromyalgia on alpha-2 EEG power spectrum in the resting condition: A descriptive correlational study. Biomed Res Int. 2019;2019:7851047. pmid:31058192
  64. Haegens S, Cousijn H, Wallis G, Harrison PJ, Nobre AC. Inter- and intra-individual variability in alpha peak frequency. Neuroimage. 2014;92(100):46–55. pmid:24508648
  65. Saad JF, Kohn MR, Clarke S, Lagopoulos J, Hermens DF. Is the theta/beta EEG marker for ADHD inherently flawed? J Atten Disord. 2018;22(9):815–26. pmid:25823742
  66. Saby JN, Marshall PJ. The utility of EEG band power analysis in the study of infancy and early childhood. Dev Neuropsychol. 2012;37(3):253–73. pmid:22545661
  67. Forbes O, Schwenn PE, Wu PP-Y, Santos-Fernandez E, Xie H-B, Lagopoulos J, et al. EEG-based clusters differentiate psychological distress, sleep quality and cognitive function in adolescents. Biol Psychol. 2022;173:108403. pmid:35908602
  68. Chen R, Yang D, Zhang CH. Factor models for high-dimensional tensor time series. J Am Stat Assoc. 2022;117(537):94–116.
  69. Acal C, Aguilera AM, Escabias M. New modeling approaches based on Varimax rotation of functional principal components. Mathematics. 2020;8(11):2085.
  70. Yu SZ. Hidden semi-Markov models. Artificial Intelligence. 2010;174(2):215–43.
  71. Lawler E, Whoriskey K, Aeberhard WH, Field C, Mills Flemming J. The conditionally autoregressive hidden Markov model (CarHMM): Inferring behavioural states from animal tracking data exhibiting conditional autocorrelation. JABES. 2019;24(4):651–68.
  72. Chan AS, Sze SL, Cheung M-C. Quantitative electroencephalographic profiles for children with autistic spectrum disorder. Neuropsychology. 2007;21(1):74–81. pmid:17201531
  73. Aminov A, Rogers JM, Johnstone SJ, Middleton S, Wilson PH. Acute single channel EEG predictors of cognitive function after stroke. PLoS One. 2017;12(10):e0185841. pmid:28968458
  74. Leon-Carrion J, Martin-Rodriguez JF, Damas-Lopez J, Barroso y Martin JM, Dominguez-Morales MR. Delta-alpha ratio correlates with level of recovery after neurorehabilitation in patients with acquired brain injury. Clin Neurophysiol. 2009;120(6):1039–45. pmid:19398371
  75. Schmidt LA, Poole KL, Hassan R, Willoughby T. Frontal EEG alpha-delta ratio and social anxiety across early adolescence. Int J Psychophysiol. 2022;175:1–7. pmid:35192865
  76. Yao D, Wang L, Oostenveld R, Nielsen KD, Arendt-Nielsen L, Chen ACN. A comparative study of different references for EEG spectral mapping: the issue of the neutral reference and the use of the infinity reference. Physiol Meas. 2005;26(3):173–84. pmid:15798293
  77. Scrivener CL, Reader AT. Variability of EEG electrode positions and their underlying brain regions: Visualizing gel artifacts from a simultaneous EEG-fMRI dataset. Brain Behav. 2022;12(2):e2476. pmid:35040596