Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Elastic-Net Copula Granger Causality for Inference of Biological Networks

Abstract

Aim

In bioinformatics, the inference of biological networks is one of the most active research areas. It involves decoding various complex biological networks that are responsible for performing diverse functions in human body. Among these networks analysis, most of the research focus is towards understanding effective brain connectivity and gene networks in order to cure and prevent related diseases like Alzheimer and cancer respectively. However, with recent advances in data procurement technology, such as DNA microarray analysis and fMRI that can simultaneously process a large amount of data, it yields high-dimensional data sets. These high dimensional dataset analyses possess challenges for the analyst.

Background

Traditional methods of Granger causality inference use ordinary least-squares methods for structure estimation, which confront dimensionality issues when applied to high-dimensional data. Apart from dimensionality issues, most existing methods were designed to capture only the linear inferences from time series data.

Method and Conclusion

In this paper, we address the issues involved in assessing Granger causality for both linear and nonlinear high-dimensional data by proposing an elegant form of the existing LASSO-based method that we call “Elastic-Net Copula Granger causality”. This method provides a more stable way to infer biological networks which has been verified using rigorous experimentation. We have compared the proposed method with the existing method and demonstrated that this new strategy outperforms the existing method on all measures: precision, false detection rate, recall, and F1 score. We have also applied both methods to real HeLa cell data and StarPlus fMRI datasets and presented a comparison of the effectiveness of both methods.

1 Introduction

In the modern age of bioinformatics, scientists are endeavoring to find ways to cure diseases at their source, making the recuperation process faster and more efficient. For this reason, researchers from diverse areas are striving to comprehend and replicate complex networks involved in the operation of various functions in human body. Among those networks, most of the research is focused on mapping of effective brain connectivity in the brain for specific task and gene networks in the translation of different biological reactions.

1.1 Brain connectivity

The brain connectivity analysis is crucial for exploring the network topology and understanding of the inter- and intra-communications involved during execution of any task as brain function does not involve isolated regions but rather requires a network of various regions to perform any task [1]. This motivated the researchers to develop the means to extract and replicate that network information.

The review by Firston [2] and others [3] divided the brain connectivity studies into three distinctive branches namely structural, functional, and effective connectivity. The Structural connectivity analysis involves the study of the anatomical links of fiber tracks that associate the neuron pools across different brain regions. Functional connectivity maps the region of the brains that are spatially distributed, but functionally connected. These functional maps are generated using statistical concepts that capture the deviation of statistical independence. However, the effective Connectivity represents an amalgamation of structural and functional connectivity showing the directional effects within a network pool.

1.2 Gene Networks

Gene is the basic physical and functional unit of heredity that communicates and interacts with each other to make proteins that help in performing various biological functions. Thus motivating researchers to obtain a better understanding of protein’s functional interactions which provide exceptionally valuable information for discovering susceptibilities of a disease to its treatment.

In recent times, a wide range of methods for network analysis have been developed to detect the brain connectivity and gene networks that use time series data extracted from fMRI and DNA microarray. These time series data can be analyzed by utilizing a number of techniques from various fields such as econometrics. Among several techniques, Granger causality is ubiquitously used in biological network analysis (gene network analysis [47] and mapping of effective brain connectivity [812]) because of its simplicity in terms of its implementation and interpretation [13, 14]. However, its use faces limitations when dealing with high dimensional data.

The standard implementation of Granger causality as proposed in [15] was originally developed with the aim of analyzing direct or linear causality by using ordinary least-squares (OLS) methods for causality estimation. However, the use of OLS implementation limits its use for the high-dimensional biological dataset. Therefore, in this paper, we are proposing a new method based on the elastic net and copula approaches for finding the Granger causality for high-dimensional data. The proposed method will not only be able to detect direct causality but will also analyze indirect causality as well.

This paper first reviews the concept of Granger causality, its existing implementations, and their limitations in Section 2. In Section 3, we present our proposed method for addressing these limitations. Section 4 covers the experimental details and results used to evaluate the performance of the proposed method. Finally, the discussion and conclusion are presented in Section 5.

2 Granger Causality

Granger causality analysis was first introduced in econometrics [15] for studying causal relationships using different financial time series data. The classical literature, such as New Introduction to Multiple Time Series Analysis by Lütkepohl [16], showed that Granger causality can be applied using many modeling techniques such as vector autoregressive (VAR) models, infinite order VAR, impulsive response functions and more. However, the literature in bioinformatics indicates that the VAR modeling technique is commonly used because of its simplicity and ease of interpretation [17].

To understand the concept of Granger causality, let us consider two-time series, X, and Y. If previous values of both X and Y can be used to predict the current value of X, i.e., X[n], then we can conclude that Y Granger-causes X. Thus, including both variables increases the prediction’s effectiveness over that of using past values of X alone [15, 16].

Granger causality can be used to explain direct or indirect influences. To understand the concept of direct (linear) influence and indirect (nonlinear) influence, consider a simple four-variable scenario as shown in Fig 1. Direct influence is represented by direct links between nodes, as exemplified by the edge between Node 2 and Node 1, whereas indirect influence can be traced using the edge that is mediated by one or more nodes. In Fig 1, an indirect influence can be observed by tracing the edge from Node 4 to Node 1, mediated by Node 3 and Node 2. If either Node 2 or Node 3 is blocked, then Node 4 will have no effect on Node 1.

Inference of biological networks using high-dimensional data extracted via various techniques confronts two critical challenges in finding the temporal causal relationships: 1) not all significant confounders in the data sets are known and 2) a large number of high-dimensional time series need to be analyzed.

The first challenge centers on the fact that most data sets do not measure all confounders, which makes surreptitious effects due to unobserved confounders unavoidable. In this circumstance, prior knowledge of unobserved confounders can be utilized to overshadow their effect.

The second challenge requires us to develop adaptive algorithms that can reveal temporal dependencies by utilizing a substantial number of time series that have few observation values.

In the past, several methods were used to handle such high-dimensional data, such as pairwise analysis [18], kernel-based algorithms [11, 19], and other regularization-based methods [20, 21]. Another viable alternative, proposed by Nelsen [22], uses the copula to discover dependencies between random variables. The use of probability theory in the form of copulas counters the spurious effect of confounders by utilizing marginal probabilities to incorporate prior information about them.

Recently, Liu, Lafferty, and Wasserman [23] introduced the Gaussian copula with nonparametric marginals, which can be used to estimate high-dimensional undirected graphs for inferring influences between high-dimensional time series. This idea was taken forward by Bahadori and Liu [24], who proposed a new method for finding Granger causality, called the Gaussian non-paranormal (GNPN) model.

They defined the GNPN model by considering a time series X = (X1, …, Xn) having a GNPN distribution of GNPN(X,B,F) as long as there are functions such that Fj(Xj) for j = 1, …, n are jointly Gaussian and can be factored according to the VAR model through coefficient B = {βi,j}. To be more precise, variables can be factored as shown below: where pN(z|μ,σ) is the Gaussian density function having variance σ2 and mean μ, L is the maximal time delay, zit,Lagged = [zi(tL), …, zi(t − 1)] is the history of zi till time t, and βi,j = [βi,j(1), …, βi,j(L)] is the vector of coefficients modeling the effect of time series zi on the target time series.

Based on this information, the time series zj Granger-causes zi if at least one value in the coefficient vector βj is statistically nonzero.

In order to show the convergence of Copula Granger is consistent like LASSO Granger, Bahadori and Liu [24] use the mathematical concept discussed in [23, 25] which uses the linear model: where x is a p x 1 is a random vector having zero mean and unit variance, β is the coefficient vector and ε is noise having zero mean and unit variance.

The real n observation samples f or i = 1,…,n, yields the covariance that follows the following rate as suggested by [23] Where and is our estimate of covariance using the actual and noisy samples xi and and assuming that the matrix is positive semi-definite. This assumption is bound by following equation which is modified version of equation (22) of [25]:

Provided is bounded for some constant K2 and deriving the lower bound in Eq 26 using the fact that φmin(Δ) ≥ 0 yields the following equation:

Since φmax(Δ) diminishes with respect to according to results from [23] and having the incoherent design assumption [25] for lower bound of , the proof is established following the steps in [25].

Further mathematical details and proof of the concepts are explicitly not discussed here and can be referred to in the original article [24]. However, the gist of their technique is to isolate the marginal properties of the data from their dependency structures. In order to implement this idea, they used ℓ1 (LASSO) regularization to estimate dependency structures for high-dimensional data. In this way, they used the advantage of LASSO scalability for higher dimensions and at the same time used copulas to handle the nonlinearity of the data.

2.1 Limitations of Existing Work

In the standard LASSO estimator [26], the ℓ1 penalty is used to obtain the sparse solution to the following optimization problem: where is the ℓ1-norm penalty on β and induces sparsity in the solution, and λ ≥ 0 is a tuning parameter.

The use of the ℓ1-norm penalty helps in the simultaneous operations of regularization and shrinkage and thus makes LASSO an appealing variable selection method. However, despite these advantages, LASSO faces some limitations, as discussed by Zou [27], that makes it unstable when used for high-dimensional data and limits the variable selection before saturation when the number of variables is greater than the number of observation points.

These LASSO problems mainly arise when dealing with either highly correlated predictors, which usually results in the random selection of predictors, or when all predictors are identical, as discussed in detail in Friedman, Hastie, and Tibshirani [28].

3 Elastic-Net Copula Granger Causality

The instabilities of LASSO can be circumvented using an extension of LASSO called elastic net. It is robust to high correlations among predictors [27] and can select more than p variables when n >> p (that is, the number of variables >> the number of observation points). It uses a mixture of the ℓ1 (LASSO) and ℓ2 (ridge regression) penalties and can be formulated as: where , , λ1 is the tuning parameter for ℓ1 (LASSO), and λ2 is the tuning parameter for ℓ2 (ridge regression).

As we know, ℓ2 (ridge regression) [29] works well with a large number of predictors that either have nonzero coefficients, or are drawn from a normal distribution and are highly correlated [28]. Therefore, its presence in elastic net helps improve variable selection, whereas the ℓ1 (LASSO) penalty induces the grouping effect and stabilizes the solution paths with respect to random sampling [30]. Thus, using the combination of both penalties should greatly improve the predictions.

Based on these findings, we are proposing a method called elastic-net copula Granger causality (ECGC) that we expect to be more precise than the existing GNPN method. In this new method, we employ elastic-net regularization instead of ℓ1 (LASSO) regularization to estimate dependency structures. The use of elastic net will overcome the shortcomings of the existing LASSO methods and will also exploit the advantages of copula to handle the nonlinearity of data.

Therefore, instead of solving the optimization problem proposed by Bahadori and Liu [24],i.e., where λ is the tuning parameter, we are proposing to use the following optimization problem: where λ1 and λ2 are tuning parameters for ℓ1 (LASSO) and ℓ2 (ridge regression) penalties, respectively and are calculated using as follow: λ1 = α and λ2 = (1-α)/2. The pseudocode for implementing elastic-net copula Granger causality is summarized below:

  1. 1. Find the marginal distribution for each time series:
  2. 2. Map the observations into copula space:

In practice, as proposed by Bahadori and Liu [17], to avoid large numbers Φ−1(0+) and Φ−1(1) we use the Winsorized estimator of the distribution function:

  1. 3. Find Granger causality using elastic net and copula for different values of tuning parameters among .
  2. 4. Select Granger causality based on the minimum Akaike information criteria [31].

To evaluate our proposed method, we performed extensive experiments, which are discussed in the next section.

4 Experimentation and Performance Evaluation

4.1 Experimentation

For the implementation of our proposed method, we used MATLAB together with the Sparse Learning with Efficient Projections (SLEP) toolbox [32]. SLEP is a well-known toolbox that has functions related to regularization and has been used in the past by many researchers [33, 34].

For the comparison exercise, we used the code provided in [35], which implements LASSO copula Granger causality, and the Glmnet toolbox [36] in MATLAB.

In order to remain unbiased in our comparison, instead of using self-created data sets, we used the same data that have already been used by other authors to test similar kinds of algorithms.

4.1.1 Simulated Data.

The first set of data was used in [37] and [38]; it simulates the scenario of three variables (genes) and uses the following set of mathematical equations:

The second set of simulated data was first used by Schelter et al. [39] and later in [18, 40, 41] and [42]. It simulates the scenario of five variables and uses the following set of equations:

We generated the desired quantities of data from these sets of equations, as is done by other authors [18, 37, 38, 40, 41] and [42], by first initializing the equations with white Gaussian noise having zero mean and unit variance; we then used the iterative process to generate the remainder of the data.

The direct and indirect influence structure of the simulated autoregressive models 1 and 2 are depicted by the directed graphs in Fig 2 and Fig 3, respectively. In both figures, a directed edge represents the Granger causality between two nodes or variables.

The third simulated dataset used in this paper is from the Dialogue for Reverse Engineering Assessments and Methods (DREAM). DREAM is one of the major platforms for genetic research, in which series of challenges are given to researchers. The aim of these challenges is to provide researchers, the means to find and implement new and existing theories in the areas of cellular network inference and quantitative model building in systems biology.

The “In Silico Network Challenge,” one of the DREAM4 challenges, has the objective of inferring gene regulation network structures via reverse engineering from the given in silico gene expression data sets using any possible means. For this paper, we used the dataset for the InSilico_Size10 sub-challenge, entitled “In Silico Network of Size 10” [4345].

4.1.2 Real Data.

HeLa Cancer Cell Dataset: The real data used in this paper is from the HeLa human cancer cell line that is collected by Whitfield et al. [46] by performing multiple experiments using DNA microarray analysis of the HeLa cell line. In this paper, we are using their experiment 3 data set, which has been used by other researchers as well [6, 47].

The experiment 3 data set identified more than 1100 genes that are periodically expressed during the cancer cell cycle. From these genes, we used 19 preselected genes that are regarded as highly influential and have been investigated by other researchers [6, 47, 48]. The 19 preselected genes that we considered are: “PCNA, NPAT, E2F1, CCNE1, CDC25A, CDKN1A, BRCA1, CCNF, CCNA2, CDC20, STK15, BUB1B, CKS2, CDC25C, PLK1, CCNB1, CDC25B, TYMS, and DHFR”.

As the data was not collected at homogeneous intervals, it was interpolated by cubic smoothing splines [49] before being used as advised in [6, 47].

fMRI StarPlus Dataset: The Second real dataset used in this study is called StarPlus dataset collected by [50] that can be freely accessed from [51]. This dataset had been used in past by several researchers [8, 48, 52, 53] and contained the data that was acquired to study the brain activity involved during comprehension of the relationship between sentence and pictures.

During the study, they performed series of experiments on 13 normal subjects and divided these experiments into 40 trials. In each trial, every subject had to relate a sentence with a picture and then decide the relation between sentence and picture. These 40 trials were further divided into two equal parts. In one part, they first introduced the sentence and asked to relate it to a picture whereas for the next part, they showed picture followed by a sentence.

In either setting, both stimuli were provided only for 4-second exposure and a 4-second blank screen in between. Then after the second stimulus, the subjects were asked to answer the question then rest for 15 seconds before the start of next trial. More details about experimental settings, sentences, and pictures are explicitly not discussed here and can be referred to [50].

The images that were acquired during the experiments were pre-processed using FIASCO program [54] to reduce some artifacts (signal drift and head motions) introduced during image acquisition process. Then pre-processed images were analyzed, and 25 distinct anatomical regions of interest (ROIs) were selected for further study. These regions includes: “left dorsolateral prefrontal cortex (LDLPFC) and right dorsolateral prefrontal cortex (RDLPFC), calcarine sulcus (CALC), left frontal eye fields (LFEF), right frontal eye fields (RFEF), left inferior parietal lobule (LIPL), right inferior parietal lobule (RIPL), left intraparietal sulcus (LIPS), right intraparietal sulcus (RIPS), left inferior frontal gyrus (LIFG), left opercularis (LOPER), right opercularis (ROPER), supplementary motor areas (SMA), left and right inferior temporal lobule (LIT, RIT), left and right posterior precentral sulcus (LPPREC, RPPREC), left and right supramarginal gyrus (LSGA, RSGA), left temporal lobe (LT), right temporal lobe (RT), left and right triangularis (LTRIA, RTRIA), left superior parietal lobule (LSPL) and right superior parietal lobule (RSPL)”.

4.2 Performance Evaluation

The performance evaluation of both methods was done by comparing our proposed method with the existing method using the following measures: precision, false detection rate (FDR), recall, and F1 score. These measures were calculated more than 5000 times for each scenario, and then their average values were used as results, to minimize impulsive errors.

Results for simulated data sets 1, and 2 are shown in Tables 1 and 2, respectively, whereas the results for the DREAM4 data are summarized in Fig 4. Moreover, the effective brain map involved in human deductive reasoning is shown in Fig 5.

thumbnail
Fig 5. Brain Connectivity map involves in deductive reasoning.

https://doi.org/10.1371/journal.pone.0165612.g005

For the real HeLa cell data, as there are no means to compare the effectiveness against other methods, we extracted the top 20 significant interactions. We looked up these 20 interactions in the BioGRID database [55] to see whether they had been reported earlier. Once all 20 interactions were analyzed, we compared the number of matches found by each method. This approach for evaluation has been used and suggested by other researchers [56]. The results of our proposed method and the LASSO copula method are summarized in Table 3.

thumbnail
Table 3. Top 20 Significant Gene Interactions using Elastic-Net Copula Granger Causality and LASSO Copula Granger Causality.

https://doi.org/10.1371/journal.pone.0165612.t003

5 Discussion and Conclusion

For the evaluation of performance, we divided the simulated data into two groups: high-dimensional data and low-dimensional data. High-dimensional cases were defined as those having less than 100-time points, whereas low-dimensional cases are those having more than 100-time points. These definitions were determined based on the fact that techniques similar to DNA microarray analysis or fMRI, which can analyze multiple genes simultaneously, usually generate 15–75-time points because of their data procurement procedures.

Based on this division, the results for simulated data sets 1 and 2 show that for high-dimensional data, on the average, elastic-net copula has 14.8% better precision, 15.92% lower FDR, 19.18% higher recall, and a 16.33% higher F1 score. For the low-dimensional cases, the elastic-net copula has 16.5% better precision, 16% lower FDR, 29.3% higher recall, and a 20.75% higher F1 score.

Based on these findings on simulated data, we observe that our method is consistent with both low- and high-dimensional data with respect to precision, FDR, and F1 score. For recall, we believe that the drastic increase of performance for low-dimensional cases is due to the greater availability of information as more time point values are considered.

Similar trends for precision, FDR, recall, and F1 score can be seen from the DREAM4 data (Fig 4), where we observe improved results with the use of elastic-net copula Granger causality.

For brain connectivity map, there is no standard way to verify the resultant network other than performing some clinical trials. However, clinical trials and their results are out of scope of this paper.

However, for the results on real HeLa data (among those top 20 significant interactions listed in Table 3), we were able to detect 7 reported interactions using the elastic-net method, whereas only 5 interactions were found using the LASSO-based method. These matched interactions are shown in bold in Table 3. Of those seven interactions found by the elastic-net method, we note two interesting interactions (highlighted in Table 3) that were not detected at all by the LASSO copula method. These interactions are related to different cancer cell cycles, as reported in [5759].

In this paper, we have proposed a new method called elastic-net copula Granger causality, which can use high-dimensional data for assessing both linear and nonlinear gene and brain networks. We have compared the performance of the new method with its predecessor, LASSO copula Granger causality. Based on the evidence from extensive experimentation, it is clear that elastic-net copula outperforms the existing LASSO copula. Moreover, when applied to real cancer cell data, it shows the capacity to detect some significant interactions that the other method is not able to detect, further reinforcing the effectiveness of our approach. Therefore, in our view, our proposed method provides a more stable regularization based technique to study gene and brain networks thus helping the researcher to manage and treat disease more meritoriously and proficiently.

Consent and Sources of Real Data

Although this study involves human participants, formal consents or ethical committee approval is not required as experimental data used in this research is not collected by current authors and is freely accessible. HeLa cell Genetic is acquired from published article of Michael et al. and can be accessed from http://genome-www.stanford.edu/Human-CellCycle/Hela/data.shtml. Similarly, StarPlus fMRI data is acquired from published work of Mitchell et al. and can be accessed freely from http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-81/www/.

Author Contributions

  1. Conceptualization: MSF.
  2. Formal analysis: MSF.
  3. Investigation: MSF MYS.
  4. Methodology: MSF MYS.
  5. Project administration: MYS.
  6. Resources: MSF.
  7. Software: MSF.
  8. Supervision: MYS.
  9. Validation: MYS.
  10. Visualization: MSF MYS.
  11. Writing – original draft: MSF.
  12. Writing – review & editing: MSF MYS.

References

  1. 1. Heuvel MPvd , Sporns O. Rich-club organization of the human connectome. The Journal of Neuroscience. 2011;31(44):15775–86. pmid:22049421.
  2. 2. Friston KJ. Functional and effective connectivity: a review. Brain Connectivity. 2011;1(1):13–36. pmid:22432952.
  3. 3. Horwitz B. The elusive concept of brain connectivity. NeuroImage. 2003;19(2):466–70. pmid:12814595
  4. 4. Tam GHF, Hung YS, Chang C, editors. Meta-Analysis on Gene Regulatory Networks Discovered by Pairwise Granger Causality. The 7th International Conference on Systems Biology (lSB) 2013 23–25 August; Huangshan, China: IEEE.
  5. 5. Michailidis G, d'Alche-Buc F. Autoregressive models for gene regulatory network inference: sparsity, stability and causality issues. Math Biosci. 2013;246(2):326–34. pmid:24176667.
  6. 6. Hlavácková-Schindler K, Bouzari H, editors. Granger Lasso Causal Models in Higher Dimensions-Application to Gene Expression Regulatory Networks. ECML/PKDD 2013 Workshop Scalable Decision Making: Uncertainty, Imperfection, Deliberation (SCALE); 2013; Prague, Czech Republic.
  7. 7. Zhang ZG, Hung YS, Chan SC, Xu WC, Hu Y, editors. Modelling and Identification of gene regulatory networks: A Granger casusality approach. Ninth International Conference on Machine Learning and Cybernetics; 2010 11–14 July Qingdao: IEEE.
  8. 8. Hutchinson RA, Niculescu RS, Keller TA, Rustandi I, Mitchell TM. Modeling fMRI data generated by overlapping cognitive processes with unknown onsets using Hidden Process Models. NeuroImage. 2009;46(1):87–104. pmid:19457397.
  9. 9. Zhenyu Z, Yun J, Tianyu T, Zuhong L, Yijun L, Yonghong C, et al., editors. Detecting Effective Connectivity in Human Brain using Granger Causality. BioMedical Engineering and Informatics, 2008 BMEI 2008 International Conference on; 2008 27–30 May 2008.
  10. 10. Liao W, Ding J, Marinazzo D, Xu Q, Wang Z, Yuan C, et al. Small-world directed networks in the human brain: multivariate Granger causality analysis of resting-state fMRI. NeuroImage. 2011;54(4):2683–94. pmid:21073960.
  11. 11. Amblard PO, Vincent R, Michel OJJ, Richard C, editors. Kernelizing Geweke's measures of granger causality. Machine Learning for Signal Processing (MLSP), 2012 IEEE International Workshop on; 2012 23–26 Sept. 2012.
  12. 12. Barnett L, Seth AK. The MVGC multivariate Granger causality toolbox: a new approach to Granger-causal inference. Journal of Neuroscience Methods. 2014;223:50–68. pmid:24200508.
  13. 13. Roy S, Das D, Choudhury D, Gohain G, Sharma R, Bhattacharyya D. Causality Inference Techniques for In-Silico Gene Regulatory Network. In: Prasath R, Kathirvalavakumar T, editors. Mining Intelligence and Knowledge Exploration. Lecture Notes in Computer Science. 8284: Springer International Publishing; 2013. p. 432–43.
  14. 14. Tam GHF, Chunqi C, Yeung Sam H, editors. Application of Granger causality to gene regulatory network discovery. Systems Biology (ISB), 2012 IEEE 6th International Conference on; 2012 18–20 Aug. 2012.
  15. 15. Granger CWJ. Investigating Causal Relations by Econometric Models and Cross-Spectral Methods. Econometrica. 1969;37(3):424–38.
  16. 16. Lütkepohl H. New Introduction to Multiple Time Series Analysis: Springer Berlin Heidelberg; 2005.
  17. 17. Fujita A, Sato JR, Kojima K, Gomes LR, Nagasaki M, Sogayar MC, et al. Identification of Granger Causality between Gene Sets. Journal of Bioinformatics and Computational Biology. 2010;8(4):679–701.
  18. 18. Tam GHF, Chunqi C, Hung YS. Gene regulatory network discovery using pairwise Granger causality. IET Systems Biology. 2013;7(5):195–204. pmid:24067420
  19. 19. Marinazzo D, Pellicoro M, Stramaglia S. Kernel Granger Causality and the Analysis of Dynamical Networks. Physical Review E. 2008;77(5).
  20. 20. Shimamura T, Imoto S, Yamaguchi R, Fujita A, Nagasaki M, Miyano S. Recursive regularization for inferring gene networks from time-course gene expression profiles. BMC Systems Biology. 2009;3(1):41.
  21. 21. Fujita A, Sato J, Garay-Malpartida H, Yamaguchi R, Miyano S, Sogayar M, et al. Modeling gene expression regulatory networks with the sparse vector autoregressive model. BMC Systems Biology. 2007;1:39. pmid:17761000
  22. 22. Nelsen RB. An Introduction to Copulas. 2nd ed. New York: Springer-Verlag; 2006.
  23. 23. Liu H, Lafferty J, Wasserman L. The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs. Journal of Machine Learning Research. 2009;10:2295–328.
  24. 24. Bahadori MT, Liu Y, editors. An Examination of Practical Granger Causality Inference. 2013 SIAM International Conference on Data Mining; 2013; Austin,Texas,USA.
  25. 25. Meinshausen N, Yu B. Lasso-type recovery of sparse representations for high-dimensional data. 2009:246–70.
  26. 26. Tibshirani R. Regression Shrinkage and Selection via the LASSO. Journal of the Royal Statistical Society, Series B (Methodological). 1996;58(1):267–88.
  27. 27. Zou H. The Adaptive Lasso and Its Oracle Properties. Journal of the American Statistical Association. 2006;101(476):1418–29.
  28. 28. Friedman JH, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software. 2010;33(1):1–22. pmid:20808728
  29. 29. Hoerl AE, Kennard RW. Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics. 1970;12(1):55–67.
  30. 30. Ogutu JO, Schulz-Streeck T, Piepho HP. Genomic selection using regularized linear regression models: ridge regression, lasso, elastic net and their extensions. BMC Proc. 2012;6 Suppl 2:S10. pmid:22640436; PubMed Central PMCID: PMC3363152.
  31. 31. Akaike H. A new look at the statistical model identification. Automatic Control, IEEE Transactions on. 1974;19(6):716–23.
  32. 32. Liu J, Ji S, Ye J. SLEP: Sparse Learning with Efficient Projections. 4.1 ed: Arizona State University; 2009.
  33. 33. Suk H-I, Wee C-Y, Shen D. Discriminative Group Sparse Representation for Mild Cognitive Impairment Classification. In: Wu G, Zhang D, Shen D, Yan P, Suzuki K, Wang F, editors. Machine Learning in Medical Imaging. Lecture Notes in Computer Science. 8184: Springer International Publishing; 2013. p. 131–8.
  34. 34. Liu J, Xu Y, Cheng J, Zhang Z, Wong D, Yin F, et al. Multiple Modality Fusion for Glaucoma Diagnosis. In: Zhang Y-T, editor. The International Conference on Health Informatics. IFMBE Proceedings. 42: Springer International Publishing; 2014. p. 5–8.
  35. 35. Liu Y, Bahadori MT, Li H. Copula-Granger 2012. Available: www-scf.usc.edu/~mohammab/codes/codes.html.
  36. 36. Qian J, Hastie T, Friedman J, Tibshirani R, Simon N. Glmnet for Matlab 2013. Available: http://www.stanford.edu/~hastie/glmnet_matlab/.
  37. 37. Ding M, Chen Y, Bressler SL. Granger Causality: Basic Theory and Application to Neuroscience. Handbook of Time Series Analysis: Wiley; 2006. p. 438–60.
  38. 38. Siyal MY, Furqan MS, Monir SMG, editors. Granger Causality: Comparative Analysis of Implementations for Gene Regulatory Networks 13th International Conference on Control, Automation, Robotics and Vision; 2014; Singapore: IEEE.
  39. 39. Schelter B, Winterhalder M, Eichler M, Peifer M, Hellwig B, Guschlbauer B, et al. Testing for directed influences among neural signals using partial directed coherence. Journal of Neuroscience Methods. 2006;152(1–2):210–9. pmid:16269188.
  40. 40. Furqan MS, Siyal MY. Random forest Granger causality for detection of effective brain connectivity using high-dimensional data. Journal of Integrative Neuroscience. 2015;15(01):55–66. pmid:26620192
  41. 41. Baccala LA, Sameshima K. Partial directed coherence: a new concept in neural structure determination. Biological Cybernetics. 2001;84:463–74. pmid:11417058
  42. 42. Seth AK. A MATLAB toolbox for Granger causal connectivity analysis. Journal of Neuroscience Methods. 2010;186(2):262–73. pmid:19961876.
  43. 43. Marbach D, Prill RJ, Schaffter T, Mattiussi C, Floreano D, Stolovitzky G. Revealing strengths and weaknesses of methods for gene network inference. Proceedings of the National Academy of Sciences. 2010;107(14):6286–91. pmid:20308593
  44. 44. Marbach D, Schaffter T, Mattiussi C, Floreano D. Generating Realistic In Silico Gene Networks for Performance Assessment of Reverse Engineering Methods. Journal of Computational Biology. 2009;16(2):229–39. pmid:19183003
  45. 45. Prill RJ, Marbach D, Saez-Rodriguez J, Sorger PK, Alexopoulos LG, Xue X, et al. Towards a Rigorous Assessment of Systems Biology Models: The DREAM3 Challenges. PLoS ONE. 2010;5(2):e9202. pmid:20186320
  46. 46. Whitfield ML, Sherlock G, Saldanha AJ, Murray JI, Ball CA, Alexander KE, et al. Identification of Genes Periodically Expressed in the Human Cell Cycle and Their Expression in Tumors. Molecular Biology of the Cell. 2002;13(6):1977–2000. PMC117619. pmid:12058064
  47. 47. Lozano AC, Abe N, Liu Y, Rosset S. Grouped graphical Granger modeling for gene expression regulatory networks discovery. Bioinformatics. 2009;25(12):i110–8. pmid:19477976; PubMed Central PMCID: PMC2687953.
  48. 48. Furqan MS, Siyal MY. Inference of biological networks using Bi-directional Random Forest Granger causality. SpringerPlus. 2016;5(1):1–10. pmid:27186478
  49. 49. Green PJ, Silverman BW. Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. New York: Chapman and Hall; 1994.
  50. 50. Keller TA, Just MA, Stenger VA. Reading span and the time-course of cortical activation in sentence-picture verification. Annual Convention of the Psychonomic Society; Orlando, FL2001.
  51. 51. Mitchell T, Wang W. StarPlus fMRI data Carnegie Mellon University's CCBI2001. Available: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-81/www/.
  52. 52. Mitchell T, Hutchinson R, Niculescu RS, Pereira F, Wang X. Learning to Decode Cognitive States from Brain Images. Machine Learning. 2004;57:145–75.
  53. 53. Wang X, Mitchell T. Detecting cognitive states using machine learning. Iterim working paper. 2002.
  54. 54. Eddy WF, Fitzgerald M, Genovese C, Lazar N, AudrisMockus , Welling J. The Challenge of Functional Magnetic Resonance Imaging. Journal of Computational and Graphical Statistics. 1999;8(3):545–58.
  55. 55. Chatr-aryamontri A, Breitkreutz B-J, Oughtred R, Boucher L, Heinicke S, Chen D, et al. The BioGRID interaction database: 2015 update. Nucleic Acids Research2014.
  56. 56. Cheng D, Bahadori MT, Liu Y. FBLG: a simple and effective approach for temporal dependence discovery from time series data. Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining; New York, New York, USA. 2623709: ACM; 2014. p. 382–91.
  57. 57. Wolthuis R, Clay-Farrace L, van Zon W, Yekezare M, Koop L, Ogink J, et al. Cdc20 and Cks Direct the Spindle Checkpoint-Independent Destruction of Cyclin A. Molecular Cell. 2008;30(3):290–302. pmid:18471975
  58. 58. Kong M, Barnes EA, Ollendorff V, Donoghue DJ. Cyclin F regulates the nuclear localization of cyclin B1 through a cyclin–cyclin interaction. The EMBO Journal. 2000;19(6):1378–88. pmid:PMC305678.
  59. 59. Radulovic M, Crane E, Crawford M, Godovac-Zimmermann J, Yu VPCC. CKS Proteins Protect Mitochondrial Genome Integrity by Interacting with Mitochondrial Single-stranded DNA-binding Protein. Molecular & Cellular Proteomics. 2010;9(1):145–52. pmid:19786724