A New Approach to Investigate the Association between Brain Functional Connectivity and Disease Characteristics of Attention-Deficit/Hyperactivity Disorder: Topological Neuroimaging Data Analysis

Background Attention-deficit/hyperactivity disorder (ADHD) is currently diagnosed by a diagnostic interview, mainly based on subjective reports from parents or teachers. It is necessary to develop methods that rely on objectively measureable neurobiological data to assess brain-behavior relationship in patients with ADHD. We investigated the application of a topological data analysis tool, Mapper, to analyze the brain functional connectivity data from ADHD patients. Methods To quantify the disease severity using the neuroimaging data, the decomposition of individual functional networks into normal and disease components by the healthy state model (HSM) was performed, and the magnitude of the disease component (MDC) was computed. Topological data analysis using Mapper was performed to distinguish children with ADHD (n = 196) from typically developing controls (TDC) (n = 214). Results In the topological data analysis, the partial clustering results of patients with ADHD and normal subjects were shown in a chain-like graph. In the correlation analysis, the MDC showed a significant increase with lower intelligence scores in TDC. We also found that the rates of comorbidity in ADHD significantly increased when the deviation of the functional connectivity from HSM was large. In addition, a significant correlation between ADHD symptom severity and MDC was found in part of the dataset. Conclusions The application of HSM and topological data analysis methods in assessing the brain functional connectivity seem to be promising tools to quantify ADHD symptom severity and to reveal the hidden relationship between clinical phenotypic variables and brain connectivity.


Methods
To quantify the disease severity using the neuroimaging data, the decomposition of individual functional networks into normal and disease components by the healthy state model (HSM) was performed, and the magnitude of the disease component (MDC) was computed. Topological data analysis using Mapper was performed to distinguish children with ADHD (n = 196) from typically developing controls (TDC) (n = 214).

Results
In the topological data analysis, the partial clustering results of patients with ADHD and normal subjects were shown in a chain-like graph. In the correlation analysis, the MDC showed a significant increase with lower intelligence scores in TDC. We also found that the rates of comorbidity in ADHD significantly increased when the deviation of the functional connectivity from HSM was large. In addition, a significant correlation between ADHD symptom severity and MDC was found in part of the dataset.
Introduction characteristics of datasets based on the distance between data points along a pre-assigned filter function [34]. Usually, filter functions for the disease-specific data analysis are provided by the healthy state model (HSM), which was first introduced in microarray data analysis [32]. The HSM essentially unravels the data according to the extent of overall deviation from a healthy (or normal) state, and provides a means to define the guiding filter functions. Mapper, guided by the filter and distance functions, approximately collapses the data into a simple and low dimensional shape. Mapper was successfully applied to genomic expression data from diseased tissue, and classifying breast cancer [31] and diabetes subtypes [34].
In this study, we present the topological analysis tool, Mapper, in combination with HSM and their application to functional neuroimaging data. We investigated the association between the disease components analyzed by Mapper and HSM with clinical phenotypes such as IQ, symptom severity and the comorbidity rate of ADHD to test whether brain functional connectivity patterns are related to differences in these phenotypic variables of interest.

Methods and Materials Datasets
The preprocessed resting state fMRI data was obtained from the ADHD-200 Consortium website (http://fcon_1000.projects.nitrc.org/indi/adhd200). We selected datasets from New York University (NYU) and Peking University (PU) for our study because these two institutes have the largest data samples among the ADHD-200 database and these datasets include equal amount of patients with ADHD and TDC. The NYU dataset includes 98 TDC and 118 children with ADHD. The PU dataset includes 116 TDC and 78 children with ADHD. Psychiatric diagnoses, including comorbidity information, were established through psychiatric interviews with experienced child psychiatrists using the Schedule of Affective Disorders and Schizophrenia for Children-Present and Lifetime Version administered to parents and children (NYU and PU) and the Conners' Parent Rating Scale-Revised, Long Version (NYU) or the ADHD Rating Scale IV (PU). Symptom severity such as inattention and hyperactivity/impulsivity and the ADHD index, which is an overall measure of ADHD symptom severity, were rated by parents. Intelligence was evaluated with the Wechsler Abbreviated Scale of Intelligence (NYU) or the Wechsler Intelligence Scale for Chinese Children-Revised (PU). The details for the phenotypic and clinical variables are described elsewhere [35].

Preprocessing
Briefly, for the construction of the functional network, we used the extracted time courses from the Athena preprocessed data. A detailed description of the preprocessing steps can be found in the Supporting Information as well as on the Athena website. The filtered time course files, ADHD200_AAL_TCs_filtfix.tar.gz, can be downloaded from the ADHD-200 Preprocessed Data website. The functional network of each subject (R ij ) was then computed by Pearson's correlation coefficients between the time courses of i-th and j-th regions of interest (ROIs). The upper triangular part of the functional connectivity matrix for each subject was extracted and vectorized as following: where i is the subject index, n is the number of ROIs, and the dimension of T i is m = n(n − 1)/ 2. Here, we called the vectorized functional connectivity data as functional connectivity vector T i . Finally, the functional network dataset, D = [D ij ], can be obtained as illustrated in Fig 1, where i represents the subject index, j represents the j-th elements of T i , and the vector T i is the i-th row vector of D = [D ij ].

Healthy state model (HSM)
The functional connectivity vector T i of each subject as described in Eq (1) can be decomposed into the normal component and the disease component by HSM [32] as follows: where the normal component (Nc) of data mimics HSM. The detailed description about HSM can be found in the SI. Then, the residual of the fit to HSM becomes the disease component as follows: Finally, the magnitude of the disease component for each subject can be obtained using L 2norm as follows: where R Dc uv is a residual of the disease component in the individual functional network. The L 2norm of the disease component measures the overall amount of deviation from the HSM.

Topological data analysis
In this study, the topological data analysis was used to extract a geometric shape from the relationships among subjects by using a new technology named "partial clustering". Initially, Mapper, a tool for topological data analysis [31,34] was introduced in the neuroimaging society. The first step for analyzing neuroimaging data using Mapper is to define distance and filter functions. The use of distance function is to measure dissimilarity between disease components of the individual functional connectivity vector. Usually, the correlation distance is used as a distance function: where r(X, Y) measures correlation coefficients between two vectors: X and Y.
The essential role of the filter function is to collapse high-dimensional functional network data to a single data point and to capture a neurobiologically meaningful characteristic of the data [31]. In the current study, the filter function measured the magnitude of the disease component in the functional network data. In general, the value of the filter function becomes larger when a large number of functional connections deviate by a large extent from the HSM either in a positive or negative direction. The second step for the topological data analysis is to define the clustering method. We chose the single-linkage method that was widely used in the topological data analysis. A detailed description of these particular clustering procedures can be found in the references [33,36]. The last step is to visualize the resulting topology using a graph ( Fig C in S1 File). In the resulting graph, each node is a subset of subjects, and edges connect similar nodes. The color of each node encodes the value of the filter function averaged across all the data points to the node, with blue representing a low value and red denoting a large value.

Statistical analysis
First, group differences in the values of the filter function, which represent the magnitude of the disease component, were examined by a one-way analysis of variance (ANOVA). Second, Pearson's correlation coefficients between the values of the filter function and clinical phenotypic variables were evaluated to find the significant relationships between these measures. Third, for the value of the filter function, we conducted analysis of the receiver operating characteristics including the estimation of sensitivity and specificity. Fourth, the correlation analysis was conducted to reveal a relationship between a psychiatric comorbidity and resulting topology. For this analysis, we calculated Pearson's correlation coefficients between the ratios of the subject with psychiatric comorbidity in each node in the resulting topology and the node index, where the node index represents the node number with lower (higher) index indicates a subset of subjects having a lower (higher) value of the filter function.

Demographic variables and clinical measures
Some differences existed in the clinical characteristics between participants from NYU and PU ( Table 1). There was no significant difference in age between the TDC and ADHD group, but the proportion of males was higher in the ADHD sample compared to the TDC sample. In the dataset, several subjects did not have scores from clinical measures and were excluded from the correlation analysis ( Table 2). Scores of the ADHD index were significantly lower for PU than for NYU participants (p < 0.0005, Table A in S1 File), which likely reflect differences between the ADHD Rating Scale IV (PU) and the Conners' Parent Rating Scale-Revised, Long Version (NYU). In addition, we computed the ratio of comorbidity in patients with ADHD for the NYU and PU datasets. In the NYU dataset, 36% (42 of 118) of patients with ADHD had the following comorbid psychiatric symptoms: anxiety disorder (15 patients), depressive disorder (8 patients), learning disorder (LD, 6 patients), ODD (6 patients), and other disorders (7 patients). In the PU dataset, 53% (41 of 78) of patients with ADHD had the following comorbid psychiatric symptoms: ODD (25 patients), LD (7 patients), tics (6 patients), conduct disorder (3 patients). Due to this substantial difference in clinical characteristics of each dataset, analyses were conducted separately for the NYU and PU datasets.

Distribution of disease component
The filter function successfully measured a magnitude of the disease component; the subjects with smaller values of the filter function, which represented the smaller magnitude of the disease component, were mostly in the TDC group while the subjects with larger values of the filter function were mostly patients with ADHD. Fig 2 shows the distributions of the value of filter function for the two groups, distinguishing ADHD patients from normal subjects. The values of the filter function are almost the same for with and without scrubbing the time points that showed large head motions (i.e., the framewise displacement > 1mm) when evaluating the functional connectivity (r = 0.9940 for the NYU and r = 0.9909 for the PU dataset). A one-way ANOVA found significant group differences in the values of the filter function (p < 0.0005, Table 3). We have not found any significant confounding effects of the phenotypic information, such as age, gender or medication status, to the group differences in the magnitude of the disease component ( Table C in S1 File). Also, the value of the filter function, which measures the magnitude of the disease component, has excellent sensitivities and specificities (>96%) for the diagnosis of the children with ADHD at a cut-off score of 12 (11) for the NYU (PU) dataset (Table D in S1 File).

Topological data analysis
Topological data analysis using Mapper was applied to the functional neuroimaging data and the chain-like graph was obtained as a result (Fig 3 and Table B in S1 File for NYU data set).
The blue-colored nodes contained mostly normal subjects, whereas red-colored nodes contained patients with ADHD who generally had large deviation from the functional network of the healthy subjects. The illustrations of the number of subjects and the occupation ratio of group members are presented in Fig 3B and 3C.  The statistically significant thresholds are labeled as *P < 0.05 **P < 0.005.

Relationships between resulting topology and clinical phenotypic measures
Since one important goal of topological data analysis is to obtain knowledge about the data followed by quantitative analysis, qualitative graphical (Fig 4) and quantitative correlation analyses ( Table 2) were performed to find the hidden relationship between the Mapper results and clinical phenotypic variables. We inspected each node and computed the average value in each node. First, the visualization of the symptom severity as a function of node index revealed that the blue color nodes, whose index numbers ranged from 1 to 10, had significantly lower symptom severity than those of the red color nodes, whose index numbers ranged from 40 to 49 ( Fig 4A). Significant statistical differences in the ADHD index, hyperactivity/impulsivity, and inattention scores between the blue and red color nodes were confirmed (p < 0.0005).
Especially the symptom severity, represented by the ADHD index score from the PU dataset, showed positive correlations with the values of the filter function (r = 0.23, p = 0.0488; Table 2). However, for the NYU sample, the ADHD index did not show significant correlations with the value of the filter function. Second, the visualization of the intelligence scale as function of node index showed the decreasing trend across blue color nodes, which index number ranges from 1 to 19, and the constant trend across red color nodes, which index number ranges from 20 to 49 ( Fig 5B). In the TDC group, we revealed that the significant negative correlations between the intelligence scales and the values of the filter function as described in Table 2. We compared 10 TDC subjects with the highest value of the filter function with the lowest values regarding demographic factors and ADHD symptom severity. The result shows that TDC subjects with higher values of the filter function consist of a younger age group (NYU dataset), contain more females (NYU dataset), have significantly lower IQ (NYU and PU datasets), and more severe hyperactivity/impulsivity (NYU dataset) compared to those with lower values of the filter function (Table 4). Finally, the correlation analyses found a  significant positive relationship between the ratio of subjects with psychiatric comorbidity in each node and the node index for the NYU (r = 0.85, p<0.0005) and PU (r = 0.87, p<0.0005) datasets, respectively.

Discussion
To the best of our knowledge, this is the first attempt to incorporate the HSM and topological data analysis for unveiling hidden relationships between clinical phenotypic variables of interest, such as IQ, symptom severity, and comorbidity rate of ADHD, and data on brain functional connectivity. The two methodologies were applied for the first time to create an objective measure of disease severity and a partial clustering model of patients with ADHD based on neuroimaging data. These two methods, when used in combination, might be promising tools to quantify disease components within brain networks.

Interpretation of topological data analysis
The partial clustering methods through topological data analysis produced an easily recognizable chain-like graph (Fig 3). Originally, we expected that Mapper would yield a branch-shaped graph with two progressive arms; each arm differentiating the inattentive subtype of ADHD from the combined subtype, similar to two subtypes of breast cancer that were detected in a previous study [31]. However, unlike our hypothesis, our analysis could not identify discrete subgroups of ADHD to validate the DSM-IV subtype model of ADHD. This result is in line with previous studies questioning the discriminant validity of the DSM-IV ADHD subtypes. Studies of etiology, neuropsychological functioning, and treatment response do not provide enough evidence for the distinction between ADHD-I and ADHD-C subtypes, even though they provide support for the validity of the DSM-IV inattention and hyperactivity-impulsivity symptom dimensions [37]. In addition, the strongest argument against the ADHD subtype model is the instability of the subtype classification over time (only 35% meet criteria for the same subtype after 5 years) [38]. Recently, the nosology of ADHD subtypes has been updated in the DSM-V [39]: categorical subtypes of ADHD have been retained, but they are now referred to as combined presentation, predominantly inattentive presentation, and predominantly hyperactive/impulsive presentation. This wording change from "subtype" to "presentation" reflects the fluidity in how the symptoms of disorder may present in the same individual over time [38]. The model proposed by Lahey and Willcut (2010) defines ADHD as "a single disorder without subtypes, with dimensional modifiers that reflect the number of inattention and hyperactivity-impulsivity symptoms" [40]. Our results are consistent with this model of ADHD, in which the Mapper result is presented as a long gradual progression (Fig 3), showing ADHD symptoms as dimensional (or quantitative) traits that form a continuum with the normal state.

Intelligence and the magnitude of disease component
According to our results (Fig 5C and 5D and Table 2), significant negative correlations between the magnitude of the disease component (represented by the filter value) and the IQ score were found in the TDC group but not in the ADHD group. Our results agree with recent studies showing that brain structure (e.g., cerebral gray matter volume and white matter microstructure) or small-world network parameters were associated with IQ for controls, but not for ADHD [41,42]. One possible explanation for these findings is that the relationship between clinical measures (e.g., IQ score) and brain functional measures may vary depending on the presence or absence of pathological processes. Therefore, certain relationships demonstrated in healthy individuals may not be observed in clinical populations. However, our result is rather unexpected, since it is well known that ADHD symptoms can interfere with performance in intelligence tests and adversely affect IQ scores [43]. In clinical practice, it is often observed that performance IQ in ADHD children is on average 7-10 points lower than that of comparisons [6]. A previous study also suggested a negative association between ADHD symptoms and IQ scores [43]. Taken together, further studies need to be done to clarify the relationship between IQ score and ADHD symptoms in this patient population. In other applications, the identification of TDC subjects who present with sub-threshold ADHD symptoms is one of the key areas of biomarker research, because in mild forms of the disease it is difficult to distinguish between disorder and normal groups in clinical practice [21,44]. In these diagnostically ambiguous situations, Mapper presents its strength over standard clustering methods in that long gradual drifts in the graphs are visible, as shown in Fig 3, implying the continuity from normal to progressively advanced disease symptoms [31]. Thus, TDC subjects with higher values of the node index might be different from those with lower values, even though they are clustered within the same TDC group. Our analysis of TDC subjects with the highest value of the filter function showed that this subgroup differed significantly from the TDC subjects with the lowest value; they were of younger age, had significantly lower performance IQ, and more severe hyperactivity/impulsivity symptoms. A longitudinal follow up of this subgroup of TDC will be needed to investigate whether these individuals constitute a high-risk group who might eventually develop ADHD.

Comorbidity, ADHD symptom severity and the magnitude of disease component
The presence of comorbidity has important implications for understanding assessment and treatment of patients with ADHD; thus, we examined the association between the ratio of subjects with comorbidity and the magnitude of disease component represented by the node index in each node. Although several studies reported that neuropsychological deficits are similar in patients with ADHD only and patients with ADHD plus comorbidity [45,46], other studies reported the opposite result proclaiming that patients with ADHD plus comorbidity have greater neuropsychological deficits than those with ADHD alone [47][48][49]. Our study supports the later view showing that the rates of comorbidity in ADHD increase when the disease component of functional connectivity (represented by the value of the filter function or node index) is large; the greater the node index is, the higher the rates of comorbidity in ADHD patients.
In addition, a significant positive correlation between the magnitude of the disease component represented by the value of the filter function and ADHD index score was found in the PU dataset ( Fig 5B). However, significant correlations were not found in the NYU dataset ( Fig  5A). Although the reasons for the different results from each institution are unclear at present, one possible explanation would be the difference in phenotypic characterization of the study samples at both sites. For example, ADHD symptoms were measured using different rating scales in each institution (PU used the ADHD Rating Scale IV; NYU used the Conners' Parent Rating Scale-Revised, Long Version). Therefore, further studies to replicate the PU findings in different datasets are needed to reach a definitive conclusion.

Limitations
The results mentioned above should be interpreted with caution with the following limitations in mind. First, the ADHD-200 data have site-specific differences in behavioral measurement, imaging data acquisition, scanner quality and protocols, and subject populations from the site contributing to the data. To overcome these problems, we analyzed the NYU and PU dataset separately, and mainly presented the results from the NYU site for this study (Fig 2). Nonetheless, the comparison of the NYU dataset with the independent PU dataset produced very similar results, except for the one result regarding the association between the value of the filter function and ADHD symptoms. Thus, the methodologies we propose may be considered as relatively unaffected by these sources of variation and bias caused by differences between study samples. Second, two groups were successfully identified on the basis of brain imaging data, such as the magnitude of the disease component, but phenotypic information such as gender, age, and medication status were not included in the topological data analysis. The demographic information may also contribute greatly to the disease classification, as shown in the competition results from the ADHD-200 dataset, which shows that diagnosis based on demographic variables outperforms imaging-based diagnostic prediction. Third, additional interaction effects of disease duration and medication may further impact the neurobiological substrate in specific ways [21] and future studies need to be consider these factors. Fourth, our findings may not demonstrate specificity to ADHD. Given our finding that subjects who are most different from controls (larger value of the filter function) have more comorbid disorders, the value of the filter function may reflect a brain of individuals who have more impairments. Fifth, The magnitude of disease component obtained from the healthy state model might not fully capture the complexity of the functional network, such as strength and direction of correlations among different brain regions. Yet, it could tell us the amount of deviation in the functional network as a single value, which is more convenient for further correlation analyses with clinical variables.

Conclusions
Despite the limitations, we introduced the HSM and a topological clustering tool in the analysis of neuroimaging data for identifying a brain-phenotypic relationship. We found that the magnitude of the disease component obtained from HSM is significantly correlated with IQ scores in the TDC group, and the resulting topology contained the information of symptom severity or comorbidity rates of ADHD as function of node index. The application of HSM and topological data analysis methods to brain connectivity data might be a promising tool to quantify the disease component of ADHD and reveal the hidden relationship between clinical phenotypic variables and brain connectivity.
Supporting Information S1 File. Supplementary methods and results. Mean and standard deviations (SD) of the symptom severity and intelligence scale in NYU dataset and PU dataset ( Table A). Number of subjects in each group (TDC and ADHD) for each bin of the output graph of Mapper for NYU dataset (Table B). The effects of the phenotypic information to the magnitude of the disease component (Table C). The summary of receiver operating characteristic (ROC) analysis using the value of the filter function (Table D). The decomposition of the original functional connectivity vector T i into the Normal component, which is the linear models fit T Nc i onto the Healthy State Model, and the disease component T Dc i vector of residuals. For example, decompositions of T i with (A) small and (B) large disease component were visualized (Fig A). The areas under the receiver operating characteristics (ROC) curves for the value of the filter function were illustrated for (A) the NYU and (B) PU dataset, respectively (Fig B). Schematic diagram of topological data analysis using Mapper. (A) The data is sampled from a noisy Y-shape point cloud in the two-dimensional space, and the filter function is f(x,y) = y. We divided the range of the filter into 5 intervals and a 50% overlap. (B) For each interval, we compute the clustering of the points lying within the domain of the filter restricted to the interval.
Distributions of the distances from single linkage dendrogram in each filter bin. For example, distance distributions for 1 st and 9 th filter bin were presented. The summation of frequencies appeared after zero bins is the number of clusters, (C) Finally, we have the simplicial complex by connecting the clusters whenever they have non-empty intersection. The color of vertices represents the average filter value (Fig C). Sample application of Mapper to the Y-shape noisy point cloud. In this example illustration, 5 intervals with 20-80% overlaps and 10 intervals with 80% overlap are the appropriate choose of the input parameters of Mapper (Fig D). Sample application of Mapper to O-shape noisy point cloud data. In this example illustration, 5 intervals with 50-80% overlaps, 10 intervals with 50-80% overlaps, and 15 intervals with 80% overlap are the appropriate choose of the input parameters of Mapper (Fig E). Visualization of the clinical phenotype data as a function of the node index in the PU data: (A) The average symptom severity in each bin of graph; (B) The average intelligence scores in each bin of graph (Fig F). (DOCX)