Finding Associations among Histone Modifications Using Sparse Partial Correlation Networks

doi:10.1371/journal.pcbi.1003168

Figure 1.

Global view of the algorithm.

The data matrix is rank-transformed, and the covariance matrix is computed. The covariance matrix is then inverted, negated and normalized as described in Materials and Methods to obtain the partial correlation matrix. Cross-validation is performed to build a mask, which is applied to the partial correlation matrix to give a sparse partial correlation network.
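The steps in this caption can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, ties in the rank transform are broken by order of appearance, the normalization is the standard one (dividing by the square roots of the diagonal of the inverse covariance), and the cross-validation mask is taken as a precomputed boolean matrix because its construction is described only in Materials and Methods.

```python
import numpy as np


def rank_transform(X):
    """Replace each column of X by the ranks of its values (0 .. n-1).

    Assumption: ties are broken by order of appearance.
    """
    X = np.asarray(X, dtype=float)
    return np.argsort(np.argsort(X, axis=0, kind="stable"), axis=0).astype(float)


def partial_correlation(X):
    """Rank-transform, compute covariance, invert, negate and normalize."""
    ranks = rank_transform(X)
    cov = np.cov(ranks, rowvar=False)       # variables are columns
    precision = np.linalg.inv(cov)          # inverse covariance matrix
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)      # negate and normalize
    np.fill_diagonal(pcor, 1.0)             # convention: unit diagonal
    return pcor


def apply_mask(pcor, mask):
    """Zero out the entries not retained by a boolean mask.

    Assumption: `mask` is a hypothetical precomputed matrix standing in
    for the output of the cross-validation step.
    """
    return np.where(mask, pcor, 0.0)
```

In a chain where a first variable influences a third only through a second, the partial correlation between the first and third should be close to zero even though their ordinary correlation is substantial, which is the behavior the sparse network relies on.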

Figure 2.

a) Network in CD4+ cells.

Blue edges represent negative partial correlations, and red edges represent positive partial correlations. b) Overlap with the CD4+ network built on numerical data. Here, numerical data means that the counts are log-transformed instead of ranked, so quantitative information is preserved. There is very little difference between the two networks. c) Consensus network. Bright edges (blue and red) are common to all three networks; light edges (light blue and pink) are found in two networks out of three. Blue or light blue indicates a negative partial correlation, and red or pink indicates a positive partial correlation.

Figure 3.

Similarity between experiments and cell types.

All plots have the same construction. The x-axis shows the number of top pairs considered. The y-axis shows the proportion of these pairs found in both lists being compared (in all three lists for subplot d), as an estimate of the similarity between partial correlation matrices. a) Similarity within H1 cells between Roadmap and ENCODE data, i.e. between experiments, using only the variables available in all datasets. For the top 10 pairs, the overlap is 80%. b) Similarity between two cell types and between experiments, using only the variables available in all datasets. For the top 10 pairs, the overlap is 50% between CD4+ and IMR90 and between CD4+ and H1 Roadmap, 60% between CD4+ and H1 ENCODE and between IMR90 and H1 ENCODE, and 70% between IMR90 and H1 Roadmap. c) Similarity between two cell types for the 23 variables used throughout the study. For the top 30 pairs, the overlap is 47% between CD4+ and IMR90 and between CD4+ and H1, and 63% between IMR90 and H1. d) Similarity between all three cell types for the 23 variables used throughout the study. For the top 30 pairs, the overlap is 33%.
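The overlap statistic plotted on the y-axis can be illustrated with a short sketch. The caption does not say how pairs are ranked, so ordering them by the absolute value of the partial correlation is an assumption here, and the function names are hypothetical.

```python
import numpy as np


def top_pairs(matrix, k):
    """Return the k variable pairs with the largest absolute entries.

    Assumption: pairs are ranked by absolute partial correlation; only
    the upper triangle is used so each pair is counted once.
    """
    matrix = np.asarray(matrix, dtype=float)
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    order = np.argsort(-np.abs(matrix[rows, cols]), kind="stable")[:k]
    return {(int(rows[t]), int(cols[t])) for t in order}


def overlap_proportion(matrices, k):
    """Proportion of the top-k pairs shared by every matrix in the list."""
    pair_sets = [top_pairs(m, k) for m in matrices]
    common = set.intersection(*pair_sets)
    return len(common) / k
```

Passing two matrices gives the pairwise comparison of subplots a to c, and passing three gives the three-way comparison of subplot d. Evaluating `overlap_proportion` over a range of `k` traces out one curve of the plot.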

Figure 4.

Effect matrix in CD4+ cells.

The color code represents the difference between the partial correlation coefficient and the correlation coefficient. For each pair, the difference is given in the lower cell, and the variable that has the largest effect is written in the upper cell.
