Fig 1.

Overview of the DeConveil framework.

(A) Relationship between gene expression and DNA CN. Boxplots show the distribution of mRNA Z-scores across five CN groups in LUAD tumor samples; the dashed blue line represents a locally weighted scatterplot smoothing (LOESS) fit. (B) Input data and modeling design. Matched RNA-seq read counts and absolute gene CN values are provided as input matrices; a design matrix encodes sample conditions (e.g., tumor = 1, normal = 0). (C) Differential expression testing. The volcano plot illustrates the selection of DEGs based on |log₂FC| > 1 and p-value < 0.05. (D) Gene classification framework. Comparing the CN-naive (PyDESeq2) and CN-aware (DeConveil) models assigns each gene to one of four categories: dosage-sensitive (DSGs), dosage-insensitive (DIGs), dosage-compensated (DCGs), or non-DEGs. (E) Conceptual summary of gene-dosage classes. DSGs show CN-dependent expression, DIGs show CN-independent regulation, DCGs exhibit buffered responses to CN alterations, and non-DEGs show stable expression. Fig 1E was created in BioRender. Davydzenka, K. (2026) https://BioRender.com/9l2b19o.
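
The DEG selection rule in panel C can be sketched as a simple filter. This is a minimal illustration, not the DeConveil or PyDESeq2 API: the `GeneResult` record, the `select_degs` helper, and the gene names and values are all hypothetical, and only the two thresholds come from the caption.

```python
from typing import List, NamedTuple


class GeneResult(NamedTuple):
    """Hypothetical per-gene test result (not a DeConveil/PyDESeq2 type)."""
    gene: str
    log2fc: float
    pvalue: float


def select_degs(results: List[GeneResult],
                lfc_cutoff: float = 1.0,
                p_cutoff: float = 0.05) -> List[str]:
    """Keep genes with |log2FC| > lfc_cutoff and p-value < p_cutoff."""
    return [r.gene for r in results
            if abs(r.log2fc) > lfc_cutoff and r.pvalue < p_cutoff]


# Made-up values, for illustration only.
results = [
    GeneResult("GENE_A", 2.3, 0.001),    # large effect, significant
    GeneResult("GENE_B", -1.8, 0.02),    # large negative effect, significant
    GeneResult("GENE_C", 0.4, 0.0005),   # significant but small effect
    GeneResult("GENE_D", 1.5, 0.2),      # large effect but not significant
]
degs = select_degs(results)
print(degs)
```

Both conditions must hold, so GENE_C and GENE_D are excluded even though each passes one of the two thresholds.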

Fig 2.

DeConveil benchmarking on simulated gene expression data.

(A) Schematic overview of the simulation framework. Gene expression counts are generated for two biological conditions (e.g., healthy vs tumor) with CN alterations present in one condition. Expression differences in each gene-dosage class (DSGs, DCGs, DIGs, non-DEGs) reflect changes in the expected mean expression µ_g, as defined by the generative model (see S1 Text), which jointly depends on biological condition and CN. Ground-truth DEGs are defined by the simulation and compared against detected DEGs. (B) Evaluation of DE detection performance under CN confounding. Precision, recall, F1-score, and Matthews correlation coefficient (MCC) are shown as a function of sample size per condition (10, 20, 40, and 60), comparing DeConveil (CN-aware) and PyDESeq2 (CN-naive). (C) Assessment of DeConveil’s accuracy in effect size estimation. Top: mean squared error (MSE) between estimated and true log₂FC. Bottom: Pearson correlation between estimated and true log₂FC. Results compare DeConveil and PyDESeq2. (D) Gene dosage classification performance of DeConveil. Precision, recall, F1-score, and MCC are shown for distinguishing DCGs from DSGs and DIGs, under weak and strong CN signal conditions, as a function of sample size.
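
For readers unfamiliar with the metrics in panels B and C, the sketch below computes them from their standard definitions. It is not the authors' evaluation code: the function names and the toy gene sets are assumptions made for illustration, and returning 0.0 when a denominator is zero is one common convention among several.

```python
import math
from typing import Dict, Sequence, Set


def detection_metrics(truth: Set[str], detected: Set[str],
                      all_genes: Set[str]) -> Dict[str, float]:
    """Precision, recall, F1 and MCC for detected vs ground-truth DEGs."""
    tp = len(truth & detected)
    fp = len(detected - truth)
    fn = len(truth - detected)
    tn = len(all_genes) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


def mse(estimated: Sequence[float], true: Sequence[float]) -> float:
    """Mean squared error between estimated and true log2FC values."""
    return sum((e - t) ** 2 for e, t in zip(estimated, true)) / len(true)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy example: 10 genes, 4 true DEGs, 4 detected (3 of them correct).
all_genes = {f"g{i}" for i in range(1, 11)}
truth = {"g1", "g2", "g3", "g4"}
detected = {"g1", "g2", "g3", "g5"}
metrics = detection_metrics(truth, detected, all_genes)

# Toy log2FC vectors: one estimate is off by 1.0.
toy_mse = mse([1.0, 2.0, -1.0], [1.0, 1.0, -1.0])
toy_r = pearson([0.5, 1.0, 2.0], [1.0, 2.0, 4.0])
```

In the toy example the confusion counts are TP = 3, FP = 1, FN = 1, TN = 5, which gives precision, recall, and F1 of 0.75 and an MCC of 14/24 ≈ 0.58. Unlike F1, MCC also uses the true negatives, so it tends to be more informative when non-DEGs far outnumber DEGs.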

Fig 3.

Impact of CN corrections on DGE analysis in lung adenocarcinoma (LUAD).

(A) Distribution of CN states across LUAD tumor samples. (B) Gene categorization by CN status and DeConveil class (DSGs, DIGs, DCGs). Genes with CN loss (CN = 0 or 1 in ≥25% of samples) are explicitly shown here despite having near-diploid mean CN values. Stacked bars indicate proportions of CN states (loss, neutral, gain, amplification), with percentages denoting fractions of the total gene set. (C) Volcano plots comparing PyDESeq2 (CN-naive) and DeConveil (CN-aware) DE analyses. Genes are plotted by log₂FC and FDR; significance thresholds are |log₂FC| > 1 and FDR < 0.05. (D) Comparison of effect size (log₂FC, top row) and FDR (bottom row) estimates between PyDESeq2 and DeConveil across CN states (loss, neutral, gain, and amplification). The diagonal reference line represents a one-to-one correspondence between the two methods. (E) Distribution of effect size differences (log₂FC) between methods across CN states. (F) Sankey diagram showing reassignment of genes between expression categories (upregulated, downregulated, non-significant) when CN correction is applied.
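
The CN-loss criterion in panel B (CN = 0 or 1 in ≥25% of samples) can be illustrated with a short sketch. The `has_cn_loss` helper and the example CN vectors are hypothetical and are not taken from the DeConveil code; only the threshold and the CN values counted as loss come from the caption.

```python
from typing import Sequence


def has_cn_loss(cn_values: Sequence[int], min_fraction: float = 0.25) -> bool:
    """Flag a gene as CN loss if CN is 0 or 1 in at least min_fraction of samples."""
    if not cn_values:
        raise ValueError("cn_values must contain at least one sample")
    n_loss = sum(1 for cn in cn_values if cn <= 1)
    return n_loss / len(cn_values) >= min_fraction


# Hypothetical per-sample absolute CN values for two genes.
gene_x = [1, 2, 2, 3]      # loss in 1 of 4 samples (25%); mean CN is 2.0
gene_y = [2, 2, 2, 2, 1]   # loss in 1 of 5 samples (20%)

flag_x = has_cn_loss(gene_x)
flag_y = has_cn_loss(gene_y)
mean_x = sum(gene_x) / len(gene_x)
```

Here `gene_x` is flagged as CN loss although its mean CN is exactly diploid, which is the situation the caption describes: a per-sample frequency rule can identify loss that a mean-based summary would hide.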

Fig 4.

Cross-cancer comparison of gene-dosage classes and their functional associations.

(A) Venn diagrams illustrate the overlap of DeConveil-defined gene categories (DSGs, DIGs, and DCGs) across LUAD, LUSC, and BRCA. Selected oncogenes (ONC) and tumor suppressor genes (TSGs) found in each category are highlighted. Additionally, lncRNAs were identified within the DCG category. Genes classified differently across cancer types are assigned to all relevant categories. (B) Gene Ontology (GO) over-representation analysis for biological processes associated with DSGs, DIGs, and DCGs across the three cancer types. Dot size indicates the number of genes per term, and color denotes enrichment significance (−log₁₀ adjusted p-value). (C) Distribution of ONC and TSGs within each gene category across private DEGs of the three cancer types.

Fig 5.

Prognostic relevance of DeConveil gene-dosage classes.

(A) Cox proportional hazards analysis identified prognostic genes within DeConveil-defined DSGs, DIGs, and DCGs, compared with PyDESeq2 DEGs. Forest plots show hazard ratios (HR) for genes selected by LASSO regression (p < 0.05). (B) Kaplan-Meier survival curves comparing high-risk and low-risk patient groups based on prognostic gene signatures derived from DSGs, DIGs, and DCGs under DeConveil and PyDESeq2 models (p < 0.05). Concordance index (C-index) values indicate the predictive accuracy of the survival model. (C) GO enrichment analysis of biological pathways associated with DeConveil-derived prognostic genes in each gene category.
