Fig 1.

Schematic overview of the FCNsignal framework.

(A) FCNsignal consists mainly of an encoder architecture, a decoder architecture, and a skip architecture; it takes DNA sequences as input and predicts base-resolution signals. By using the maximum values of the predicted signals, FCNsignal can perform multiple tasks simultaneously with high accuracy. (B) FCNsignal accurately captures the signals of shifted binding regions located at 250 bp, 500 bp, and 750 bp, respectively. (C) FCNsignal accurately captures the signals of binding regions that were randomly inserted into negative sequences.
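
As a rough illustration of how a maximum over a predicted base-resolution signal can serve as a sequence-level score, consider the minimal sketch below. The function names, the toy signal values, and the threshold are hypothetical and are not taken from the FCNsignal code; the caption states only that the maximum values of the predicted signals are used.

```python
def sequence_score(signal):
    """Collapse a per-base predicted signal into one score: its maximum value."""
    if not signal:
        raise ValueError("signal must contain at least one position")
    return max(signal)


def is_bound(signal, threshold):
    """Call a sequence 'bound' if its maximum predicted signal reaches the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return sequence_score(signal) >= threshold


# Toy per-base signals (made-up numbers, for illustration only).
positive_like = [0.1, 0.4, 2.5, 0.8, 0.2]
negative_like = [0.1, 0.2, 0.1, 0.3, 0.2]

scores = [sequence_score(positive_like), sequence_score(negative_like)]
calls = [is_bound(positive_like, 1.0), is_bound(negative_like, 1.0)]
```

The same per-sequence scores could then be ranked to compute classification metrics such as AUC.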

Fig 2.

Performance comparison of FCNsignal and the competing methods on the 53 ChIP-seq datasets.

(A) The MSE and Pearsonr values for BPNet, FCNA, and FCNsignal, and the Wilson’s test p-values (paired) of the two metrics between FCNsignal and the other two methods. (B) The AUC and PRAUC values for BPNet, DanQ, DeepCNN, FCNA, FCNsignal, and LSGKM, and the Wilson’s test p-values (paired) of the two metrics between FCNsignal and the other five methods. The red triangles represent the average values.
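
For readers unfamiliar with the two signal-level metrics in panel (A), the following sketch shows how MSE and the Pearson correlation coefficient can be computed between a predicted and a true signal. It is a plain-Python illustration with hypothetical function names and toy values, not the evaluation code used to produce the figure.

```python
import math


def mse(pred, true):
    """Mean squared error between two equal-length signals."""
    if len(pred) != len(true) or not pred:
        raise ValueError("signals must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)


def pearson_r(pred, true):
    """Pearson correlation coefficient between two equal-length signals."""
    if len(pred) != len(true) or not pred:
        raise ValueError("signals must be non-empty and of equal length")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in true)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_p * var_t)


# Toy signals (made-up numbers, for illustration only).
predicted = [0.0, 1.0, 3.0, 1.0, 0.0]
observed = [0.0, 2.0, 4.0, 1.0, 0.0]
error = mse(predicted, observed)
correlation = pearson_r(predicted, observed)
```

Lower MSE and higher Pearson correlation both indicate closer agreement between predicted and true signals.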

Fig 3.

Motif prediction performance comparison of FCNsignal and the competing methods.

(A) The –log2(p-value) values for BPNet, DanQ, DeepCNN, FCNA, FCNA*, FCNsignal, MEME, and STREME, and the Wilson’s test p-values (paired) between FCNsignal and the other seven methods. The red triangles represent the average values, and a value of 0 means that the target motif was not found. (B) Detailed view of the identified TF motifs for K562. The TF motifs in the inner loop are the target motifs, and those in the outer loop are the motifs that were found. The size of each circle corresponds to the –log2(p-value) value, and different colors designate different TF classes. The TF motifs marked in pink belong to the same TF family and share the consensus binding sequence, while those marked in other colors are more likely to be indirect TF motifs.

Fig 4.

The performance of FCNsignal in locating TF-DNA binding regions.

(A) The results of FCNsignal in locating potential binding regions for CTCF and YY1 across the whole of chromosome 17, together with the distribution of H3K27ac signals in the supported and unsupported sequences, where ‘Supported’ means present in the real peaks and ‘Unsupported’ means the opposite. (B) The number of motif instances found by FIMO in the located regions and the negative regions, together with the distribution of the matched –log2(p-value) values. (C) The motif enrichment analysis on the located regions and the negative regions. The dashed lines represent the results on the negative regions. (D) The base-resolution signals predicted by FCNsignal for DNA sequences of arbitrary length, the true signals, and the H3K27ac signals. The top DNA sequence spans chr17: 8376183–8384180 (7997 bp), and the bottom DNA sequence spans chr17: 41548466–41580329 (31863 bp).

Fig 5.

Performance comparison of FCNsignal and the competing methods on the six ATAC-seq datasets.

(A) The MSE and Pearsonr values for BPNet and FCNsignal. (B) The AUC and PRAUC values for LSGKM, DeepEmbed, Deopen, BPNet, and FCNsignal. (C) The Pearsonr between predicted maximum signals and true maximum signals for MCF7 (Pearsonr: 0.534) and IMR90 (Pearsonr: 0.556). For MCF7, the Pearsonr values for DNA sequences with low and high openness are 0.075 and 0.634, respectively. For IMR90, the Pearsonr values for DNA sequences with low and high openness are -0.054 and 0.693, respectively.

Fig 6.

The enrichment of different TFBSs in the six cell lines.

(A) Heat maps of diverse motifs in the six cell lines. The CTCF binding motif is enriched in all six cell lines. The ETV6, NR2C2, TFAP2A, and TEAD1 binding motifs are enriched in GM12878, HepG2, MCF7, and IMR90, respectively. (B) The intersection ratio between the ChIP-seq peaks of the above five TFs and the ATAC-seq peaks of the six cell lines. The intersection ratios of CTCF with the six cell lines are all high (mean: 0.786), consistent with the observation that the CTCF binding motif is enriched in all six cell lines. The maximum intersection ratios of the other four TFs are 0.319, 0.458, 0.22, and 0.573, corresponding to the GM12878, HepG2, MCF7, and IMR90 cell lines, respectively.
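
The caption does not spell out how the intersection ratio in panel (B) is defined. One plausible reading, sketched below purely as an assumption, is the fraction of a TF's ChIP-seq peaks that overlap at least one ATAC-seq peak of a cell line. The function names, the half-open interval convention, and the toy coordinates are all hypothetical.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def intersection_ratio(chip_peaks, atac_peaks):
    """Fraction of ChIP-seq peaks overlapping at least one ATAC-seq peak.

    Assumed definition for illustration; the paper may normalize differently.
    """
    if not chip_peaks:
        raise ValueError("chip_peaks must not be empty")
    hits = sum(
        1 for chip in chip_peaks
        if any(overlaps(chip, atac) for atac in atac_peaks)
    )
    return hits / len(chip_peaks)


# Toy peaks (made-up coordinates, for illustration only).
chip = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
atac = [("chr1", 150, 300), ("chr2", 300, 400)]
ratio = intersection_ratio(chip, atac)  # only the first ChIP-seq peak overlaps
```

This quadratic scan is adequate for a small example; genome-scale peak sets would normally be sorted or indexed first.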

Fig 7.

FCNsignal successfully pinpoints causal disease-related SNPs from LD groups.

(A) The SNP scores of FCNsignal (top) and DeltaSVM (bottom) for myeloma, pan-autoimmune, and CLL. The risk variants for myeloma, pan-autoimmune, and CLL are rs4487645, rs6927172, and rs539846, respectively. (B) The distribution of the SNP scores for breast cancer predicted by FCNsignal (left) and DeltaSVM (right). The SNP scores of LD groups predicted by FCNsignal are more concentrated in the low-score region than those predicted by DeltaSVM.
