CoRE-ATAC: A deep learning model for the functional classification of regulatory elements from single cell and bulk ATAC-seq data
Fig 5
Predicting functionality of REs from clusters of PBMC snATAC-seq data.
(A) Single cell clusters annotated for 7 immune cell types. Two-pass clustering identified a total of 15 cell clusters, which we annotated using hierarchical clustering with sorted bulk ATAC-seq data (shown in (B)) to identify the 7 immune cell types corresponding to these clusters. (B) Hierarchical clustering of snATAC clusters with bulk ATAC-seq data. Numbers and highlighted regions within the heatmap correspond to cell clusters and annotations in (A). 7 immune cell types were observed with both snATAC and bulk ATAC-seq samples. (C) (Top) Average precision values for predicting cis-RE function in snATAC for 6 annotated clusters with available ChromHMM states. Model performances suggest that CoRE-ATAC is an effective tool for interrogating cis-RE activity from snATAC data. (Bottom) Mean average precision and average F1 score values for promoters, enhancers, insulators and other. (D) Percent of super enhancers detected among CoRE-ATAC enhancers, demonstrating CoRE-ATAC’s ability to identify cell-type-specific enhancers that are most relevant to disease. (E) GREGOR SNP enrichment analysis highlighting selected diseases whose SNPs were significantly enriched within the enhancer elements predicted by CoRE-ATAC. Enhancers from PBMC snATAC-seq were significantly enriched for SNPs associated with immune diseases. (F) Genome browser view of IL7R for bulk ATAC and snATAC samples for CD4+ T cells. ATAC-seq read profiles and CoRE-ATAC predictions were similar between snATAC and bulk ATAC, demonstrating that CoRE-ATAC is a robust method for cis-RE predictions. Red represents promoter predictions, yellow represents enhancer predictions, and gray represents “other” predictions from CoRE-ATAC.
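The metrics reported in panel (C) can be illustrated with a minimal sketch. It assumes a one-vs-rest evaluation per class and takes the argmax class as the hard prediction for F1; neither choice is confirmed by the caption. The function names and the toy inputs are hypothetical and are not part of CoRE-ATAC.

```python
# Illustrative sketch only: per-class average precision and F1, one-vs-rest.
# Class names follow the caption; all data and helper names are hypothetical.
CLASSES = ["promoter", "enhancer", "insulator", "other"]


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each true positive.

    Ties in score are broken by input order (a simplification).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        return 0.0
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_pos


def f1_score(labels, predicted):
    """Harmonic mean of precision and recall for binary vectors."""
    tp = sum(1 for y, p in zip(labels, predicted) if y and p)
    fp = sum(1 for y, p in zip(labels, predicted) if not y and p)
    fn = sum(1 for y, p in zip(labels, predicted) if y and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def per_class_metrics(true_classes, class_scores):
    """Return {class: (average precision, F1)}.

    true_classes: reference label per region (e.g. from ChromHMM states).
    class_scores: one dict per region mapping class name to predicted score.
    Assumption: the hard prediction is the highest-scoring class.
    """
    results = {}
    for cls in CLASSES:
        labels = [1 if t == cls else 0 for t in true_classes]
        scores = [row[cls] for row in class_scores]
        predicted = [1 if max(row, key=row.get) == cls else 0
                     for row in class_scores]
        results[cls] = (average_precision(labels, scores),
                        f1_score(labels, predicted))
    return results
```

Averaging the first element of each tuple over the classes gives a mean average precision, and averaging the second gives an average F1 score, in the sense used in the bottom part of panel (C).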