Graph-to-signal transformation based classification of functional connectivity brain networks

Complex network theory has been successful at unveiling the topology of the brain and showing alterations to the network structure due to brain disease, cognitive function and behavior. Functional connectivity networks (FCNs) represent different brain regions as the nodes and the connectivity between them as the edges of a graph. Graph theoretic measures provide a way to extract features from these networks, enabling subsequent characterization and discrimination of networks across conditions. However, these measures are mostly constrained to binary networks and are highly dependent on network size. In this paper, we propose a novel graph-to-signal transform that overcomes these shortcomings to extract features from functional connectivity networks. The proposed transformation is based on classical multidimensional scaling (CMDS) theory and transforms a graph into signals such that the Euclidean distances between the nodes of the network are preserved. In particular, we propose to use the resistance distance matrix for transforming weighted functional connectivity networks into signals. Our results illustrate how well-known network structures transform into distinct signals under the proposed graph-to-signal transformation. We then compute well-known signal features on the extracted graph signals to discriminate between FCNs constructed across different experimental conditions. Based on our results, the signals obtained from the graph-to-signal transformation allow for the characterization of functional connectivity networks, and the corresponding features are more discriminative compared to graph theoretic measures.


Introduction
The human brain is a highly interconnected network. While early studies of neurophysiological and neuroimaging data focused on the analysis of isolated regions, i.e. univariate analysis, most recent work indicates that the network organization of the brain fundamentally shapes its function [1]. Complex network theory has contributed significantly to the characterization of the topology of FCNs, in particular in the assessment of functional integration and segregation [2,3]. Thus, generating comprehensive maps of brain connectivity, also known as the connectome, has become a central goal in neuroscience.

In this paper, we extend deterministic graph-to-signal transformations from binary to weighted networks using the resistance distance [31]. Advantages of the resistance distance include invertibility and accounting for the global structure of the graph, thus incorporating information about multiple paths. The resulting signals provide information about the topology of the network, which can be used to extract descriptive features from the network. In this paper, we propose to apply well-known signal features such as entropy and statistical moments to these graph signals. The extracted features are naturally low dimensional and unsupervised, and thus, unlike FCN-based features, do not depend on the quality and size of the training data. Finally, we apply this new transform and the accompanying features to FCNs constructed from an EEG speeded-reaction task experiment. The results obtained from this data set indicate that the proposed graph-to-signal transformation can identify the brain regions central to error-related negativity (ERN). Furthermore, the features extracted from these signals are more discriminative compared to conventional graph theoretic measures and FCN-based classification.

Phase synchrony
Weighted connectivity networks were constructed from EEG data using a measure of phase synchrony. Each electrode was considered a vertex of the graph, and the weights between vertices were obtained by computing the phase synchrony between the corresponding regions. In this paper, the pairwise phase synchrony was computed using a recently introduced time-frequency phase synchrony (TFPS) measure based on the reduced interference Rihaczek (RID-Rihaczek) time-frequency distribution [32]. For a signal $x_i(t)$, the RID-Rihaczek distribution is defined as [32]:

$$C_i(t,f) = \iint \exp\left(-\frac{(\theta\tau)^2}{\sigma}\right) \exp\left(j\frac{\theta\tau}{2}\right) A_i(\theta,\tau)\, e^{-j(\theta t + 2\pi f \tau)}\, d\tau\, d\theta,$$

where $\exp\left(-\frac{(\theta\tau)^2}{\sigma}\right)$ is the Choi-Williams kernel [33], $\exp\left(j\frac{\theta\tau}{2}\right)$ is the kernel function for the Rihaczek distribution [34], and $A_i(\theta,\tau)$ is the ambiguity function of the given signal $x_i$, defined as:

$$A_i(\theta,\tau) = \int x_i\left(u + \frac{\tau}{2}\right) x_i^*\left(u - \frac{\tau}{2}\right) e^{j\theta u}\, du.$$

The instantaneous phase of $x_i$ is computed from $C_i(t,f)$ as:

$$\Phi_i(t,f) = \arg\left[\frac{C_i(t,f)}{|C_i(t,f)|}\right].$$

The phase difference between two signals $x_i$ and $x_j$ can then be computed as:

$$\Phi_{i,j}(t,f) = \arg\left[\frac{C_i(t,f)\, C_j^*(t,f)}{|C_i(t,f)|\,|C_j(t,f)|}\right]. \qquad (4)$$

The Phase Locking Value (PLV), which quantifies the phase synchrony between two signals $x_i$ and $x_j$, is defined as the consistency of the phase differences $\Phi_{i,j}(t,f)$ across trials and can be computed as [35]:

$$PLV_{i,j}(t,f) = \frac{1}{K}\left|\sum_{k=1}^{K} \exp\left(j\,\Phi_{i,j}^k(t,f)\right)\right|,$$

where $K$ is the total number of trials, i.e. the number of times a given stimulus is repeated, and $\Phi_{i,j}^k(t,f)$ is the phase difference for the $k$th trial between $x_i^k$ and $x_j^k$ as defined by (4). Once the pairwise PLV values are computed between all pairs of electrodes, the weighted adjacency matrix corresponding to the FCN can be constructed as the average of $PLV_{i,j}(t,f)$ within the time interval and frequency band of interest. Thus, the connectivity matrix $W$ is constructed such that $W_{ij} = \frac{1}{|T||F|}\sum_{t \in T}\sum_{f \in F} PLV_{i,j}(t,f)$, i.e. the average connectivity within the 25-75 ms time window ($T$) and the theta ($\theta$: 4-8 Hz) frequency band ($F$).
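The trial-averaging step that turns phase differences into a PLV can be sketched in a few lines. The sketch below is a simplified stand-in, not the paper's method: it takes instantaneous phases from the analytic signal (Hilbert transform) rather than from the RID-Rihaczek distribution, so only the PLV computation itself is illustrated.

```python
import numpy as np
from scipy.signal import hilbert

def plv(trials_i, trials_j):
    """Phase locking value between two channels.

    trials_i, trials_j: arrays of shape (K, T) -- K trials, T samples.
    Phases are taken from the analytic signal (Hilbert transform), a
    common broadband simplification of the time-frequency phase used
    in the text. Returns a length-T array of PLV values in [0, 1].
    """
    phi_i = np.angle(hilbert(trials_i, axis=1))
    phi_j = np.angle(hilbert(trials_j, axis=1))
    # Consistency of the phase difference across the K trials
    return np.abs(np.mean(np.exp(1j * (phi_i - phi_j)), axis=0))
```

A pair of channels with a constant phase lag across trials yields PLV near 1, while independent phases drive it toward 0.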

Graph theory
An undirected graph G = (V, E) is defined by a set of N nodes, $v_i \in V$, and a set of edges, $e_{ij}$, $i, j \in \{1, \ldots, N\}$. The relationships between the nodes of the graph are represented by the adjacency matrix $A = [A_{ij}]$ for binary graphs, and $W = [W_{ij}]$ for weighted graphs. In binary graphs, $A_{ij} = 1$ when nodes i and j are connected and $A_{ij} = 0$ otherwise. For weighted graphs, $W_{ij}$ represents the weight of the edge between nodes i and j and equals zero when i = j. The degree matrix $\Delta$ is defined as the diagonal matrix with entries

$$\Delta_{ii} = \sum_{j=1}^{N} A_{ij}.$$

For binary graphs, the combinatorial Laplacian L is defined as $L = \Delta - A$. The elements of L are:

$$L_{ij} = \begin{cases} \Delta_{ii}, & i = j, \\ -1, & i \neq j \text{ and } A_{ij} = 1, \\ 0, & \text{otherwise}, \end{cases}$$

where $\Delta_{ii}$ is the degree of node $v_i$. Similarly, the Laplacian for weighted graphs is defined as

$$L = \Delta - W, \quad \text{with } \Delta_{ii} = \sum_{j=1}^{N} W_{ij}. \qquad (6)$$
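The weighted Laplacian above is straightforward to compute; a minimal numpy sketch:

```python
import numpy as np

def weighted_laplacian(W):
    """Combinatorial Laplacian L = Delta - W of a weighted graph.

    W: symmetric (N, N) weight matrix with zero diagonal.
    Delta is the diagonal degree matrix, Delta_ii = sum_j W_ij.
    """
    Delta = np.diag(W.sum(axis=1))
    return Delta - W
```

By construction L is symmetric, its rows sum to zero, and the constant vector lies in its null space, which is what makes the pseudo-inverse used later well defined on the centered subspace.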

Graph theoretic measures
Complex networks can be characterized using graph theoretic metrics such as the clustering coefficient, characteristic path length, global efficiency, small world parameter and small world propensity [36,37]. In this paper, we use graph theoretic measures defined for weighted networks as features for classification. Using graph theoretic measures defined for weighted networks circumvents the shortcomings associated with thresholding [2,19,20]. The features considered in this paper are as follows.
Clustering coefficient: The mean clustering coefficient is a measure of segregation and mainly reflects the fraction of clustered connectivity around individual nodes. The clustering coefficient for a weighted network is defined as [38]:

$$C^w = \frac{1}{N}\sum_{i \in V} \frac{2\, t_i^w}{k_i(k_i - 1)},$$

where $t_i^w$ is the weighted geometric mean of the triangles around node i, defined as

$$t_i^w = \frac{1}{2}\sum_{j,h \in V} \left(W_{ij} W_{ih} W_{jh}\right)^{1/3},$$

and $k_i$ is the degree of node i.

Characteristic Path Length: The characteristic path length of the network is the average shortest path length between all pairs of nodes in the network. Path length in the brain network represents the potential routes of information flow between two different brain regions and quantifies the potential for functional integration [2]. For a weighted network, the characteristic path length is calculated as [2]:

$$L^w = \frac{1}{N(N-1)}\sum_{i \neq j} d_{ij}^w,$$

where $d_{ij}^w$ is the shortest weighted path length between nodes i and j, defined as

$$d_{ij}^w = \sum_{W_{uv} \in g_{i \leftrightarrow j}^w} f(W_{uv}), \qquad (9)$$

where f is a map (e.g. an inverse function) from weight to length and $g_{i \leftrightarrow j}^w$ is the shortest weighted path between i and j.

Global Efficiency: The average inverse shortest path length is defined as the global efficiency of a network. It is a measure of functional integration similar to the characteristic path length, but it can also be computed meaningfully for disconnected networks, as an infinite path length results in zero efficiency [39]. The global efficiency for a weighted network is given by [39]:

$$E^w = \frac{1}{N(N-1)}\sum_{i \neq j} \frac{1}{d_{ij}^w},$$

where $d_{ij}^w$ is the shortest weighted path length between nodes i and j defined by Eq (9).

Small-World Parameter (SW): A network that has significantly more clusters than a random network but approximately the same characteristic path length as a random network is formally defined as a small-world network [40]. Small-world networks are simultaneously strongly clustered and integrated. This phenomenon of small-worldness is captured by the small-world parameter, the ratio of the normalized clustering coefficient to the normalized path length.
For a weighted network, the small-world parameter is given as [2,41]:

$$\sigma = \frac{C / C_{rand}}{L / L_{rand}},$$

where C and $C_{rand}$ are the clustering coefficients of the network and of a random network with the same degree distribution, respectively, and L and $L_{rand}$ are the corresponding characteristic path lengths. The random networks are generated using the Erdős-Rényi model with the same number of nodes and connection density.

Small-World Propensity (SWP): Small-world propensity quantifies the level of small-worldness displayed by a network while accounting for variation in network density [24]. SWP is computed from the deviation of the observed network's clustering coefficient and characteristic path length from those of random ($C_{rand}$, $L_{rand}$) and lattice ($C_{lat}$, $L_{lat}$) networks constructed with the same degree distribution and the same number of nodes:

$$\phi = 1 - \sqrt{\frac{\Delta_C^2 + \Delta_L^2}{2}},$$

where $\Delta_C = \frac{C_{lat} - C}{C_{lat} - C_{rand}}$ and $\Delta_L = \frac{L - L_{rand}}{L_{lat} - L_{rand}}$.
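Two of the measures above can be sketched as follows. This is a minimal illustration, not the exact implementation used in the paper: it assumes the Onnela geometric-mean triangle formula for the weighted clustering coefficient and the inverse mapping f(w) = 1/w from weight to length.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def weighted_clustering(W):
    """Mean weighted clustering coefficient (Onnela et al. formula)."""
    Wn = W / W.max()                        # scale weights to [0, 1]
    cube = np.linalg.matrix_power(np.cbrt(Wn), 3)
    t = np.diag(cube) / 2                   # weighted triangles around each node
    k = (W > 0).sum(axis=1)                 # node degrees
    with np.errstate(divide="ignore", invalid="ignore"):
        c = 2 * t / (k * (k - 1))
    return float(np.mean(np.where(k > 1, c, 0.0)))

def char_path_length(W):
    """Characteristic path length with length = 1/weight mapping f."""
    with np.errstate(divide="ignore"):
        length = np.where(W > 0, 1.0 / W, np.inf)
    np.fill_diagonal(length, 0.0)
    d = shortest_path(length, method="D")   # Dijkstra on the length graph
    off_diag = d[~np.eye(len(W), dtype=bool)]
    return float(off_diag.mean())
```

For a fully connected triangle with unit weights both measures equal 1, matching the binary case.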
In this work, we computed all of these well-known graph theoretic measures and compared them with the graph signal features.

Graph-to-signal transformation based on the resistance distance matrix
The goal of CMDS is to find a projection of the high-dimensional data into a lower dimensional space such that the Euclidean distances between points are preserved [42]. In particular, for our application of transforming graphs into signals, the goal is to obtain coordinate vectors that preserve the functional connectivity between the different brain regions [29].
In order to extract these coordinate vectors, the adjacency matrix A of a given network is first transformed into a squared distance matrix, $D^{(2)}$, which is subsequently double centered as

$$B = -\frac{1}{2} J_N D^{(2)} J_N, \qquad (13)$$

where $D^{(2)} = D \circ D$ is the entry-wise (Hadamard) squared Euclidean distance matrix, $J_N = I_N - \frac{1}{N}\mathbf{1}\mathbf{1}^T$ is a centering matrix with $\mathbf{1}$ the N-dimensional vector of ones, and T denotes the transpose. In order to preserve the positive semi-definiteness of B, the matrix D has to be a valid distance matrix and conditionally negative definite. CMDS has been used in the literature for the transformation of binary [28,43] and weighted networks [44]. For binary networks, the distance D is based on the binary adjacency matrix A.
In this paper, we propose a graph-to-signal transformation of weighted graphs using the resistance distance, R. The resistance distance was introduced by Klein and Randić as an alternative to the shortest path distance for applications in chemistry [45]. It is inspired by basic circuit theory, where each edge of the graph represents a resistor with value $\frac{1}{W_{ij}}$ [46]. The resistance distance between nodes i and j, $R_{ij}$, is computed for connected graphs through the Moore-Penrose pseudo-inverse $L^{\dagger}$ of the Laplacian L [28], as

$$R_{ij} = L_{ii}^{\dagger} + L_{jj}^{\dagger} - 2 L_{ij}^{\dagger}. \qquad (14)$$

Each entry $R_{ij}$ of R corresponds to the squared Euclidean distance between nodes i and j [47]. For a connected graph, $R_{ij} \leq d(i,j)$, where d(i, j) is the shortest path distance, and equality holds when there is only one path between i and j [48]. R is a valid squared Euclidean distance matrix as each entry $R_{ij}$ satisfies the following rules [49]:

$$R_{ij} \geq 0, \qquad R_{ii} = 0, \qquad R_{ij} = R_{ji}, \qquad \sqrt{R_{ij}} \leq \sqrt{R_{ik}} + \sqrt{R_{kj}}.$$

As a result, R can be directly substituted in (13) to obtain the corresponding Gram matrix B as

$$B = -\frac{1}{2} J_N R J_N. \qquad (16)$$

It can be shown that the resulting matrix B is positive semi-definite with rank(B) = C, $C \leq N$. Therefore, B has C nonzero eigenvalues and N − C eigenvalues equal to zero. The next step in the graph-to-signal transformation is to perform the spectral factorization of B,

$$B = U \Lambda U^T, \qquad X = U \Lambda^{1/2},$$

where $\Lambda$ is the diagonal matrix of eigenvalues and U is the matrix of corresponding eigenvectors. Based on X, a total of C signals of length N, corresponding to the columns of X, are obtained.
The ith signal $x_i \in \mathbb{R}^{N \times 1}$ is defined as the ith column of X, with i = 1, 2, . . ., C. In this paper, we refer to the $x_i$s as the signals representing the network. If the generated signals X are not distorted, it is possible to recover the original network from the signals. First, the resistance distance matrix R can be inferred from the graph signals $x_i$ by computing the squared Euclidean distances between the points as follows:

$$\hat{R}_{ij} = \sum_{c=1}^{C} \left( x_c(i) - x_c(j) \right)^2,$$

where $\hat{R}$ is the estimate of R, C corresponds to the total number of components, and $x_c(i)$ and $x_c(j)$ correspond to the ith and jth entries of the cth component. The original adjacency matrix can then be recovered from $\hat{R}$, for both weighted and binary graphs, following the procedure detailed in [50].
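The forward transform and the round trip back to R described above can be sketched as follows (a minimal numpy illustration, not the authors' code):

```python
import numpy as np

def graph_to_signals(W):
    """Transform a weighted graph into signals via the resistance distance.

    Steps follow the text: Laplacian -> pseudo-inverse -> resistance
    distance R -> double-centered Gram matrix B -> eigendecomposition.
    Returns (X, R): the columns of X are the graph signals, one per
    positive eigenvalue of B, ordered by decreasing eigenvalue.
    """
    N = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W
    Lp = np.linalg.pinv(L)
    d = np.diag(Lp)
    R = d[:, None] + d[None, :] - 2 * Lp       # R_ij = L+_ii + L+_jj - 2 L+_ij
    J = np.eye(N) - np.ones((N, N)) / N        # centering matrix
    B = -0.5 * J @ R @ J                       # Gram matrix
    lam, U = np.linalg.eigh(B)                 # ascending eigenvalues
    keep = lam > 1e-10                         # positive eigenvalues only
    X = U[:, keep] * np.sqrt(lam[keep])
    return X[:, ::-1], R                       # largest eigenvalue first

def reconstruct_R(X):
    """Recover the resistance distance as squared Euclidean distances."""
    sq = (X ** 2).sum(axis=1)
    return sq[:, None] + sq[None, :] - 2 * X @ X.T
```

Because R is a valid squared Euclidean distance matrix for a connected graph, the reconstruction is exact up to numerical precision, which is the losslessness property the text relies on.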

Graph signal features
In this section, we describe several well-known features adapted to graph signals. Along with common signal measures such as Shannon entropy (ShEn), skewness and kurtosis, we propose a new measure named graph spectral entropy (GSE) for quantifying the structural information of graphs based on the signals obtained from the networks. The extracted features are explained below.

Shannon Entropy (ShEn): Shannon entropy is a standard entropy measure widely used for signal analysis. It quantifies the order state of a signal through the probability density function of its distribution. The Shannon entropy of the ith graph signal is computed as [51]:

$$H_i = -\sum_{q} Q_i(q) \log Q_i(q),$$

where $Q_i$ is the probability density function of the ith graph signal, obtained through the histogram of the signal values. ShEn was computed for each of the graph signals, $x_i[n]$, i = 1, 2, . . ., C, generated through the graph-to-signal transformation. The average entropy over all signals, $H = \frac{1}{C}\sum_{i=1}^{C} H_i$, was extracted as a feature for subsequent analysis.

Skewness and Kurtosis: Skewness (S) [52] and kurtosis (Ku) [53] of a signal are measures of the third and fourth moments, respectively, and are defined as

$$S_i = \frac{\mu_{3i}}{\sigma_i^3}, \qquad Ku_i = \frac{\mu_{4i}}{\sigma_i^4},$$

where $\mu_{3i}$ is the third central moment, $\mu_{4i}$ is the fourth central moment and $\sigma_i$ is the standard deviation of the ith signal. The nth central moment can be computed as

$$\mu_{ni} = \sum_{x} (x - \mu_i)^n\, Q_i(x),$$

where $\mu_i$ is the mean of the ith signal and $Q_i$ is the probability density function of the ith graph signal. The average skewness $S = \frac{1}{C}\sum_{i=1}^{C} S_i$ and the average kurtosis $Ku = \frac{1}{C}\sum_{i=1}^{C} Ku_i$ measured over all the signals were considered as two features of the graph signals.

Graph Spectral Entropy: We propose a new graph entropy measure based on the spectra of the graph signals. In particular, we propose to compute the graph entropy based on the normalized power spectrum of $x_i[n]$, $i = 1, 2, \ldots, \tilde{C}$, where we consider the $\tilde{C} < C$ signals with highest energy. This parameter is selected empirically, similar to the selection of the total number of factors in principal component analysis (PCA).
The magnitude spectrum of the ith signal is defined as $M_i[k] = |\mathcal{F}\{x_i\}[k]|^2$, where $\mathcal{F}$ denotes the discrete Fourier transform. The normalized power spectrum of the ith signal for the positive frequencies is computed as

$$P_i[k] = \frac{M_i[k]}{\sum_{k} M_i[k]},$$

where k = 0, 1, . . ., ⌊(N − 1)/2⌋ corresponds to the discrete frequency bins [54]. The normalized graph entropy for the ith graph signal is defined as

$$H_i = -\sum_{k} P_i[k] \log P_i[k], \qquad (19)$$

where $i = 1, 2, \ldots, \tilde{C}$ [54]. Since (19) refers to the Shannon entropy, it is bounded as $0 \leq H_i \leq \log(N/2)$. We propose to use the normalized power spectrum rather than the original signals for entropy computation, since computing the Shannon entropy directly on the signals does not necessarily provide information about the network's structural content. For example, for a structured network such as the ring network, the corresponding signals are pure sinusoids [28,55], with almost uniform histograms resulting in high entropy. On the other hand, the power spectrum of a sine wave is well localized at a particular frequency, so its Shannon entropy is theoretically zero. This is consistent with the intuition that a ring network is deterministic and thus should exhibit low entropy. The lower bound of $H_i$ is achieved when the distribution is an impulse, and the upper bound occurs when the distribution is uniform. In terms of graph structures, the lower bound corresponds to the ring lattice and the upper bound corresponds to a random network. In order to account for the variation in network entropy as the probability of attachment varies, we propose to weigh the entropy of each graph signal by its energy using

$$w_i = \frac{E_i}{\sum_{c=1}^{\tilde{C}} E_c},$$

where $E_i$ is the energy of the ith signal and $w = (w_1, w_2, \ldots, w_{\tilde{C}})$. We define the weighted graph spectral entropy (GSE) as

$$GSE = \sum_{i=1}^{\tilde{C}} w_i H_i.$$

This definition of network entropy is independent of graph theoretic measures and the eigenspectrum of the adjacency matrix. The structural information of the network is thus obtained from signals that already contain the network's topological information.
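A minimal sketch of the GSE computation, under the assumptions that the highest-energy signals are selected by sorting energies and that natural logarithms are used:

```python
import numpy as np

def graph_spectral_entropy(X, n_keep=None):
    """Weighted graph spectral entropy of the graph signals.

    X: (N, C) matrix whose columns are graph signals. The n_keep
    highest-energy signals are kept (all by default). Each signal's
    Shannon entropy is computed on its normalized positive-frequency
    power spectrum, then averaged with energy weights.
    """
    N, C = X.shape
    energy = (X ** 2).sum(axis=0)
    order = np.argsort(energy)[::-1][: n_keep or C]
    w = energy[order] / energy[order].sum()       # energy weights
    H = np.empty(len(order))
    for out, c in enumerate(order):
        M = np.abs(np.fft.rfft(X[:, c])) ** 2     # positive-frequency power
        P = M / M.sum()                           # normalized spectrum
        P = P[P > 0]
        H[out] = -(P * np.log(P)).sum()           # spectral Shannon entropy
    return float((w * H).sum())
```

Consistent with the discussion above, sinusoidal signals (ring-like structure) give near-zero GSE while random signals give a markedly higher value.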

Illustration of the proposed measure
The proposed graph-to-signal transformation is illustrated step by step for a toy example in Fig 1. Here, a 5 × 5 weighted network was generated and transformed through the proposed graph-to-signal transformation for extracting the graph signal features. Fig 1(a) shows the plot of the generated toy graph. The step-wise signal generation procedure is illustrated in Fig 1(b). The adjacency matrix was generated in the first step, followed by the generation of the Laplacian matrix L (from (6)). The resistance distance matrix R was computed using (14). The Gram matrix B was computed from R following (16). In step 5, graph signals are obtained through the spectral factorization of B. As described above, four graph signal features, namely graph spectral entropy, Shannon entropy, skewness and kurtosis, were extracted from these signals in step 6.

Simulations
Graph-to-signal transformation for binary networks. We first compare the proposed distance measure, R, with respect to D for binary networks. For this purpose, we qualitatively compare the signals obtained from multiple binary networks. First, we simulate two k-regular graphs with N = 128 nodes and average degrees K = 2 and K = 10. Fig 2(a) and 2(b) show the graph signals with the highest eigenvalue obtained from R and the distance D, respectively. As expected, the signals based on the resistance distance matrix are sinusoidal signals (Fig 2(a)). From these figures, it is observed that the amplitude of signals obtained from R is inversely proportional to the average degree, K, yielding a higher amplitude when K = 2 and a smaller amplitude when K = 10. On the other hand, D cannot distinguish between k-regular graphs with varying average degrees.
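The sinusoidal signals reported for k-regular graphs are easy to reproduce. The sketch below (a minimal numpy illustration, not the authors' code) builds a ring graph with K = 2 and checks that the dominant graph signal concentrates at a single frequency:

```python
import numpy as np

def ring_signals(N):
    """Graph signals of an unweighted ring (2-regular) graph, obtained
    via the resistance distance as in the simulation above."""
    A = np.zeros((N, N))
    idx = np.arange(N)
    A[idx, (idx + 1) % N] = 1              # connect each node to its neighbor
    A[(idx + 1) % N, idx] = 1
    L = np.diag(A.sum(axis=1)) - A
    Lp = np.linalg.pinv(L)
    d = np.diag(Lp)
    R = d[:, None] + d[None, :] - 2 * Lp   # resistance distance matrix
    J = np.eye(N) - 1.0 / N                # centering matrix I - (1/N) 11^T
    B = -0.5 * J @ R @ J
    lam, U = np.linalg.eigh(B)
    return U[:, ::-1] * np.sqrt(np.clip(lam[::-1], 0, None))
```

Because B is circulant for a ring, its eigenvectors are Fourier modes, so each signal is a pure sinusoid and the top signal sits at the lowest nonzero frequency.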
We also compared both methods on an Erdős-Rényi binary graph with probability of attachment p equal to 0.5. For the original distance matrix D, the signals are random signals (Fig 2(d)), as previously shown in [56]. On the other hand, signals estimated from R still exhibit a random structure, with peaks that are inversely proportional to p (Fig 2(c)). The location of these peaks corresponds to the nodes with the smallest degree, i.e. the largest peak occurs in the first signal and corresponds to the node with the smallest degree. Under the resistance distance, a node with small degree has a high resistance distance to the remaining nodes in the network. Therefore, signals obtained from the transformation of binary networks through the resistance distance are more informative than those obtained from D.

Graph-to-signal transformation for weighted graphs. The proposed transformation was also assessed on weighted networks. Fig 3(a) shows the signals resulting from a small-world network with average degree K = 6 and N = 128 nodes. As seen in Fig 3(a), for a network with a low rewiring probability, p = 0.1, the resulting signals are sinusoidal signals with some noise. This is consistent with previous work on binary networks [28], where it has been shown that the small-world network is equivalent to a k-regular graph plus noise.
In addition to the small-world network, we investigated the graph-to-signal transformation of a weighted stochastic block network consisting of 200 nodes, with fixed probability of attachment p = 0.3 and 3 clusters (Fig 3(b)). The weighted stochastic block network generalizes the stochastic block model to networks with edge weights drawn from any exponential family distribution [57]. Under this model, each node i belongs to one of K blocks or communities, $z_i$, and each edge $A_{ij}$ exists with a probability that depends only on the group memberships of the connecting vertices. Nodes in the same block are stochastically equivalent, indicating their equivalent roles in generating the network's structure. The stochastic block model is fully specified by a vector z denoting the group membership of each vertex and a K × K matrix of cluster connection probabilities. The weighted stochastic block model extends this by allowing the edges to carry weights; here, the weights are assigned randomly from the uniform distribution on the interval [0, 1]. It can be observed from these figures that the first K − 1 signals reveal the cluster structure, and the Kth signal is an impulse. In addition, the size of each cluster can be inferred from the support of the constant regions in the first K − 1 signals. Thus, the proposed approach effectively transforms weighted networks into signals and reflects structural properties of the networks.

EEG data
In this paper, we analyze an EEG dataset from a previously published cognitive control-related error processing study [58]. The study was designed following the experimental protocol approved by the Institutional Review Board (IRB) of Michigan State University. The data collection was performed in accordance with the guidelines and regulations established by this protocol. Written informed consent was collected from each participant before data collection.
The experiment consisted of a speeded-reaction Flanker task [59] in which subjects identified the middle letter of a five-letter string, which was either congruent (e.g. MMMMM) or incongruent (e.g. MMNMM) with respect to the flanker letters. Flanker letters (e.g. MM MM) were shown during the first 35 ms of each trial, and during the following 100 ms the flanker and target letters were shown on the screen. This was followed by an inter-trial interval of variable duration ranging from 1200 ms to 1700 ms. The experiment comprised 6 blocks of 80 trials, and letters were changed between blocks. EEG responses were recorded with a 64-electrode ActiveTwo system (BioSemi, Amsterdam, The Netherlands) at a sampling frequency of 512 Hz. The EEG channel locations are given in Fig 4. Trials containing artifacts were rejected, and volume conduction was reduced through the Current Source Density (CSD) Toolbox [60]. A total of 18 subjects and 58 channels were considered for the analysis, for which the total number of error trials ranged from 20 to 61. The same number of correct responses was chosen randomly. Fig 5 shows the event-related potentials for error and correct responses, i.e. error-related negativity (ERN) and correct-related negativity (CRN), from electrode FCz averaged over trials and subjects. As can be seen from this figure, the ERN has a larger negative amplitude with the peak within 0-100 ms, where 0 refers to the response time.
In this paper, we are interested in studying the differences in the FCNs corresponding to the error-related negativity (ERN) and the correct-related negativity (CRN) through a classification task. Previous studies have shown that the ERN is associated with increased synchronization in the theta band (4-8 Hz) between electrodes in the central and lateral frontal regions [58,61,62]. For this reason, a FCN was constructed for each subject and response type by averaging the PLV over the 25-75 ms time window and the frequency bins corresponding to the theta band. This results in two FCNs of size 58 × 58 per subject, one corresponding to error responses and the other to correct responses. A total of 58 signals are extracted from the graph-to-signal transformation of each FCN. The mean ± standard deviation of the stress function for the two response types are $4.19 \times 10^{-19} \pm 1.12 \times 10^{-18}$ (CRN) and $3.70 \times 10^{-19} \pm 1.16 \times 10^{-18}$ (ERN).
Graph-to-signal transformation of FCNs. FCNs constructed for ERN and CRN responses were first averaged to obtain representative networks and then transformed into signals using (16). For illustration purposes, we show the first six graph signals corresponding to the correct and the error responses in Fig 6(a) and 6(b), respectively. We focus on the first six signals obtained from this transformation, as the eigenvalues of the matrix B in (16) drop off significantly after the sixth eigenvalue. As the graph signals are a function of the nodes, or different electrodes, the locations of the peaks of the graph signals signify the distribution of spatial activity. It can be observed from Fig 6 that while the energy of the graph signals from CRN is distributed uniformly across the 58 brain regions, the energies of the ERN graph signals are more concentrated within the first 20 electrodes, which correspond to the frontal and frontal-central regions. This implies that right after an error response most of the brain activity centralizes within the frontal regions. This is in line with prior work indicating the role of the prefrontal cortex during ERN [62,63]. Fig 7 shows the magnitude spectra of the signals corresponding to the average error and correct FCNs. For the average FCN constructed from error responses, the frequency content of the signals increases with the signal number, suggesting an organized structure such as k-regular graph networks. On the other hand, the spectra of graph signals corresponding to correct responses suggest a random network structure.
Feature extraction. For both ERN and CRN networks, graph theoretic and graph signal features were extracted for each network constructed for each subject and response type, i.e. a total of 36 networks. A total of 5 graph theoretic features (clustering coefficient, characteristic path length, global efficiency, small world parameter and small world propensity) were extracted for each network corresponding to each subject, resulting in a feature matrix of dimension 36 × 5. On the other hand, for graph signals, four features, namely graph spectral entropy, skewness, kurtosis and Shannon entropy, were extracted for each signal and then averaged across the graph signals corresponding to each network, resulting in a feature matrix of dimension 36 × 4.
Classification of FCNs. In this section, we evaluate the classification power of the features extracted from graph signals and compare these features with conventional graph theoretic measures as well as the full FCNs used as feature vectors. For a comprehensive comparison, we employed a set of classifiers including support vector machines (SVM), linear discriminant analysis (LDA), logistic regression and k-nearest neighbor (kNN) (with k = 20).
As we have a small dataset (n = 18), the accuracy of each classification method was determined based on its prediction accuracy under the leave-one-out technique. Leave-one-out validation is a particular case of cross-validation where every test subset comprises a single instance. As reported by Kotsiantis et al., this type of validation considers all the instances and is computationally more expensive, but it is beneficial when the most accurate approximation of a classifier's performance is required [64]. Despite its computational cost, we used this method to ensure the most accurate estimate of the classifier's error rate.
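The leave-one-out scheme can be sketched as follows; for illustration this uses a simple nearest-class-mean classifier on a synthetic stand-in feature matrix, not the SVM/LDA/logistic-regression/kNN classifiers reported in the text:

```python
import numpy as np

def loo_accuracy(features, labels):
    """Leave-one-out validation with a nearest-class-mean classifier
    (a simple stand-in for the classifiers used in the text).
    Each sample is held out once and predicted from the rest."""
    n = len(labels)
    correct = 0
    for i in range(n):
        mask = np.arange(n) != i
        Xtr, ytr = features[mask], labels[mask]
        # class means estimated without the held-out sample
        means = {c: Xtr[ytr == c].mean(axis=0) for c in np.unique(ytr)}
        pred = min(means, key=lambda c: np.linalg.norm(features[i] - means[c]))
        correct += int(pred == labels[i])
    return correct / n
```

Each of the 36 networks (18 subjects × 2 response types) would be held out exactly once, so the reported accuracy is an average over 36 single-instance test sets.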
Since the task involves binary classification, sensitivity and specificity, defined as follows, were used as performance measures in addition to accuracy:

$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP},$$

where TP, FN, TN and FP denote true positives, false negatives, true negatives and false positives, respectively. In order to determine which measure, as a continuous test statistic, best discriminates between error and correct networks, we also computed the receiver operating characteristic (ROC) curve for each measure, as shown in Fig 8. In the ROC curve, the sensitivity, or true positive rate, is plotted as a function of 1 − specificity, or the false positive rate, for different threshold values. As a result, each point of the ROC plot represents a sensitivity/specificity pair corresponding to a particular decision threshold. The overall accuracy of the test can be assessed from the proximity of its ROC curve to the upper left corner [57]. For our experiment, the threshold values were computed through the threshold averaging method [65]. For each ROC curve, the area under the curve (AUC) was also computed, as it serves as a quantitative measure of the discrimination power of the test statistic. Table 1 shows the classification performance for the full network as feature matrix, along with graph theoretic (clustering coefficient, path length, global efficiency, small world and small world propensity) and graph signal (proposed graph spectral entropy, Shannon entropy, skewness and kurtosis) features, in terms of accuracy, sensitivity, specificity and AUC for different classifiers. From these results, it can be seen that the classification accuracy is much lower when the FCNs are used directly as features. This is due to the fact that FCNs may be noisy, making it hard to discriminate between the two response types. Among the graph theoretic features, the small-world propensity is the most effective feature. An overall accuracy of 94.4% was obtained by linear SVM using all the graph theoretic features.
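The performance measures above are simple to compute from a confusion table, and the AUC can equivalently be obtained from the rank-sum (Mann-Whitney) identity rather than by integrating the ROC curve. A minimal sketch:

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP/(TP+FN), Specificity = TN/(TN+FP),
    for binary labels where 1 is the positive class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """AUC via the rank-sum identity: the probability that a random
    positive sample scores higher than a random negative sample."""
    y_true, scores = np.asarray(y_true), np.asarray(scores)
    pos, neg = scores[y_true == 1], scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties
```

The rank-sum formulation avoids choosing explicit thresholds, which is convenient when comparing continuous test statistics such as the graph features here.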
Moreover, the FCNs constructed from error responses exhibited significantly increased small-world (p = 0.00203, Wilcoxon rank-sum test with p < 0.01) and small-world propensity (p = 0.0008280, Wilcoxon rank-sum test with p < 0.01) measures compared to the FCNs from correct responses. This finding of decreased small-world characteristics in correct response networks is indicative of increased randomness and is in line with previous studies that reported increased small-worldness for ERN compared to CRN [23]. Among the graph signal features, the graph spectral entropy was the most effective. An overall accuracy of 97.2% was obtained by linear SVM using all features for discriminating between ERN and CRN connectivity networks. Along with the overall 3% improvement in accuracy, the AUC also increased from 0.95 to 0.99 compared to the graph theoretic measures. Moreover, FCNs from correct responses show higher entropy than FCNs from error responses, and this difference is significant (p = 0.0000554, Wilcoxon rank-sum test with p < 0.01). This is consistent with the fact that the error-related negativity is associated with increased synchronization, which results in less random networks and hence lower network entropy.
Comparing the graph theoretic and graph signal features in Table 1, we can see the graph spectral entropy has the highest AUC, indicating that among all features, graph spectral entropy is the most effective test statistic to discriminate between the two response types. Therefore, graph spectral entropy is more sensitive to the structural changes in the network compared to small-world and small-world propensity measures in this study.
In order to illustrate the differences between weighted and thresholded binary FCNs for classification, the proposed analysis was also performed on thresholded FCNs. We generated thresholded FCNs using the data-driven orthogonal minimal spanning trees (OMSTs) approach described in [66]. For the binary FCNs, we used the distance matrix D derived from the binary adjacency matrix, whereas we used the resistance matrix R for the weighted FCNs. The same features were extracted from the graph signals of both binary and weighted FCNs. The results are given in Table 2. As can be seen from Table 2, the graph signal features are more discriminative for weighted FCNs compared to the binary ones, with a difference in accuracy of around 6% and AUC of 0.91 vs. 0.88. This is due to the fact that some information is lost through the process of thresholding.
Although the proposed approach has several merits as illustrated in this paper, the analysis is limited to a single data set with a small sample size. As such, the methodology proposed here can be used to guide similar studies with larger sample sizes so that more rigorous quantitative and qualitative analysis can be performed. It is important to note that the major novelty of the current paper is the introduction of a new framework to analyze FCNs rather than the introduction of new feature extraction and classification methods. With graph theoretic metrics, the whole network is reduced to a single number, e.g. small-world parameter. Although this may be attractive for purposes of data reduction and summarizing network topology, this approach also results in some loss of information. Graph-to-signal transformation, on the other hand, results in as many vectors or signals as the number of nodes in the network. Moreover, it is possible to reconstruct the network from these signals unlike graph theoretic metrics. Therefore, graph theoretic metrics can be thought of as lossy compression applied on the network whereas graph-to-signal transformation is a lossless operation. Any well-known signal processing algorithm and feature extraction method can be easily applied to these graph signals. Consequently, future work could explore extracting different types of features like energy, bandwidth, spectral features and other features like Hurst exponent, Lyapunov exponent, Hjorth parameters, correlation coefficients etc. [67] from signals obtained by the proposed graph-to-signal transformation. Exploration of different features may lead to the interpretation of more subtle characteristics of the complex networks which is not possible using the conventional graph theoretic features.

Conclusion
In this paper, we introduced a new graph-to-signal transformation for weighted FCNs. The signals obtained from this transformation were used to characterize the networks and to extract discriminative features. Results acquired from this study indicate that the features extracted from graph signals are more discriminative compared to conventional graph theoretic measures and the original FCNs for classifying between error and correct responses. In particular, the graph spectral entropy decreases during the ERN interval, while the entropy increases after correct responses. This implies that ERN has a more modular structure implying increased segregation. This finding is in line with previous research showing more localized activity during ERN compared to CRN [68]. Therefore, accumulated evidence from this study suggests that the proposed graph-to-signal transformation based approach can be used to successfully characterize the dynamics of the functional connectivity networks.