Multiple instance fine-mapping: Predicting causal regulatory variants with a deep sequence model

Alexander Rakowski; Christoph Lippert

doi:10.1371/journal.pgen.1012208

Abstract

Identifying causal genetic variants in a computational manner remains an open problem. Training end-to-end prediction models is not possible without large ground-truth datasets, while results of genome-wide association studies (GWAS) are entangled by linkage disequilibrium (LD), and gene expression datasets do not contain genetic variation at the individual level. Here, we propose Multiple Instance Fine-mapping (MIFM), a multiple instance learning (MIL) objective that overcomes the lack of strong labels by grouping putatively causal variants together based on their LD scores. Using MIFM, we trained a deep classifier on a dataset aggregating over 13,000 GWAS to predict causal variants based on their underlying DNA sequences. We validated variants prioritized by MIFM by constructing polygenic risk scores, which transferred better to different target ancestries. Furthermore, we demonstrated how MIFM can be used to disentangle effect sizes of highly correlated variants to better fine-map GWAS results.

Author summary

Genome-wide association studies have identified tens of thousands of genetic variants associated with traits or diseases. However, the majority of identified variants are only spuriously correlated with the phenotype of interest, having no causal effect on it. Instead, these variants are often inherited together with nearby biologically causal variants, thus creating the spurious associations. Fine-mapping, i.e., predicting which variants are causal, is crucial for downstream tasks, such as uncovering the biological mechanisms affecting the phenotype or robustly identifying individuals with high genetic risk of a disease. While most fine-mapping methods are based on the available association statistics or functional annotations of genetic regions, it should be possible to identify causal variants based on their neighboring DNA sequences. However, training a standard machine learning classifier for that task is obstructed by the scarcity of strong, ground-truth labels. Here, we propose a method to train sequence models predicting variant causality using weakly-labeled data. We trained a model on a large set of associated variants, and demonstrated its utility by improving cross-ancestry predictions of genetic risk and by disentangling the effect sizes of highly correlated variants.

Citation: Rakowski A, Lippert C (2026) Multiple instance fine-mapping: Predicting causal regulatory variants with a deep sequence model. PLoS Genet 22(6): e1012208. https://doi.org/10.1371/journal.pgen.1012208

Editor: Heather J. Cordell, Newcastle University, UNITED KINGDOM OF GREAT BRITAIN AND NORTHERN IRELAND

Received: July 11, 2025; Accepted: June 5, 2026; Published: June 29, 2026

Copyright: © 2026 Rakowski, Lippert. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: https://github.com/HealthML/multiple-instance-fine-mapping.

Funding: CL has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101016775. https://www.interveneproject.eu/ The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. AR has received a salary from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101016775.

Competing interests: The authors have declared that no competing interests exist.

1 Introduction

Genome-wide association studies (GWAS) remain a powerful tool for identifying genetic variants associated with phenotypes or diseases, with recent studies detecting up to thousands of associations per trait. However, while one usually assumes a clear genotype → phenotype causal direction, only a small fraction of variants significant in a GWAS are expected to be truly causal [1]. Due to linkage disequilibrium (LD), single nucleotide polymorphisms (SNPs) in proximity to causal variants become associated with the trait and can even have lower p-values than the causal SNPs. Identifying the true causal variants is important for understanding the underlying mechanisms such as transcription factor (TF) binding and for making robust predictions in populations with different LD structures than the one where the GWAS was performed.

Without in-vivo experimental validation, one can employ computational fine-mapping methods, which aim to narrow down the set of putative causal variants from GWAS summary statistics. The simplest approach is to test the significance of all candidate SNPs in a joint model. While yielding unbiased estimates of the true effect sizes, it is not feasible in scenarios with large numbers of variants or strong LD. More advanced methods utilize the Bayesian framework to estimate credible sets of variants given prior knowledge on the distribution of effect sizes [2–4], optionally incorporating functional annotations as additional priors [5–7]. While powerful, the above methods require selecting hyperparameters, such as the assumed number of causal variants, rely on availability of functional annotations for the regions of interest, and are sensitive to LD patterns, yielding large credible sets for strongly correlated variants.

Another approach is to use machine learning (ML) prediction models to assign a score to each variant as a proxy of its likelihood of being causal. A common choice are deep neural networks (DNNs) trained to predict functional genomic annotations or gene expression values from DNA, with architectures typically based on a convolutional neural network (CNN) [8–10] or a transformer backbone [11]. The difference in predictions for a sequence with the reference allele and a sequence with the alternative allele is then taken as a measure of the potential causality of a variant. As opposed to the statistical methods, the ML-based approaches are independent of GWAS results or the LD structure. However, while they take DNA data at base pair level resolution as inputs, they are typically trained on reference genome data, which limits the SNP-level variability of DNA motifs to ones present at population-level. Furthermore, the performance of such models can be hindered by coarseness and noise of the labels [12,13].

Here, we introduce Multiple Instance Fine-mapping (MIFM), a framework for training deep learning models to predict the causality of non-coding variants directly from the underlying DNA sequence, without the need for summary statistics or functional annotations at test time. We circumvent the lack of ground-truth labels at SNP resolution by formulating the training objective as a multiple instance learning (MIL) problem, where putatively causal variants in LD with each other are grouped together to form a single, weakly-labeled positive example, and fit the model on a dataset of more than 2 million associated variants from over 13,000 studies. We demonstrate the robustness of MIFM-prioritized variants by creating polygenic scores (PGS) of 20 traits, which transferred better from European to non-European ancestries, compared to variants selected using existing fine-mapping methods. Furthermore, we show how MIFM can be used to detect additional signals in analyses of GWAS results by prioritizing variants for joint tests, even in strongly-correlated cases. Finally, we report the results of model analysis which revealed enrichment of regulatory elements, existing and putatively novel TF motifs, as well as context-dependent mechanisms. The corresponding code as well as the trained model used in our experiments are available at github.com/HealthML/multiple-instance-fine-mapping

2 Description of the method

2.1 Overview of the method

We proposed Multiple Instance Fine-mapping (MIFM), a framework for training models prioritizing causal genetic variants with the multiple instance learning (MIL) paradigm, using GWAS associations as training data (Fig 1). Our goal was to obtain a classifier predicting the probability of a variant being causal, given the DNA sequence around it. To train such a model in the standard supervised manner one would need ground-truth labels for each individual instance (variant). However, variants discovered with GWAS typically contain a large number of false positives due to LD between SNPs. To circumvent the lack of ground-truth labels, we trained a model to classify bags of instances (LD blocks of variants), instead of individual SNPs. We constructed the training dataset by selecting significant variants from a large set of GWAS results and grouping SNPs in LD with each other into positive bags. Conversely, we created negative examples by selecting common variants from the human reference genome which were not significantly associated in any of the GWAS. During training, the model makes instance-level predictions for each SNP within an LD block, which are then pooled by selecting the highest score as the bag-level prediction. Once trained, the pooling operation is discarded, and the instance-level model can be used to make predictions for individual variants. See Sect 2.2 for a detailed description of the method, and Sect 2.3 regarding dataset construction.

Fig 1. Overview of Multiple Instance Fine-mapping (MIFM) – a framework for training variant prioritization models.

We frame the task of identifying causal variants as a multiple instance learning (MIL) problem, where loci of GWAS-associated SNPs are grouped to form positive “bags” for the MIL algorithm, to overcome the lack of instance-level (per-variant) labels. We assume that each positive bag contains at least one causal variant, while negative bags contain none. Dataset creation: We construct the training dataset using a large set of GWAS results, by selecting SNPs significantly associated in any study (marked in red). Since we do not have variant-level labels, we treat whole LD-blocks of associated SNPs as positive bags. Conversely, we construct the set of negative examples (marked in blue) by selecting the remaining, non-significant variants, to form single-element negative bags. Model training: We train a prediction model, e.g., a deep neural network, to classify the MIL bags, based on the underlying DNA sequences of the variants. The model first makes separate predictions for each element in the bag, which are then aggregated using the max operator to yield a single bag-level prediction. After the model is trained, we can discard the max operation and predict the causality of single variants.

https://doi.org/10.1371/journal.pgen.1012208.g001

2.2 Fine-mapping as a multiple instance learning problem

We begin by introducing the MIL paradigm for a binary classification task. We assume that the data consists of pairs $(x, y)$, where $x \in \mathcal{X}$ are input features and $y \in \{0, 1\}$ are binary labels, and the input instances are grouped into bags of examples $X_i = \{x_{i1}, \dots, x_{im}\}$, where $m$ can differ across $i$. We use $i$ to index along the bag-level (e.g., $X_i$) and $j$ to index individual instances within a bag (e.g., $x_{ij}$). At training time, we only have access to bag-level labels, which indicate whether at least one instance in a bag is positive, and are defined as:

$$Y_i = \max_{j} y_{ij} \tag{1}$$

This can be interpreted as a form of weak labeling, since we do not know which particular instance(s) in a positive bag are positive, as opposed to strong, instance-level labels. Our goal is then to learn an instance-level classifier $f: \mathcal{X} \to [0, 1]$, given the bag-level data $\{(X_i, Y_i)\}_i$. A common approach to train $f$ is to define a bag-level classifier $F$:

$$F(X_i) = \max_{j} f(x_{ij}) \tag{2}$$

and train it to minimize the cross-entropy loss $\mathcal{L}$:

$$\mathcal{L}(X_i, Y_i) = -Y_i \log F(X_i) - (1 - Y_i) \log\left(1 - F(X_i)\right) \tag{3}$$

with respect to the bag-level examples $(X_i, Y_i)$.
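The bag-level objective described above (max pooling of instance scores followed by a cross-entropy loss) can be illustrated with a minimal sketch in plain Python. The function names and the example scores are hypothetical and only serve to show the arithmetic; the instance scores stand in for the outputs of the instance-level classifier f.

```python
import math


def bag_prediction(instance_scores):
    """Bag-level score: the maximum instance-level probability in the bag."""
    return max(instance_scores)


def bag_cross_entropy(instance_scores, bag_label, eps=1e-12):
    """Binary cross-entropy between the pooled bag score and the bag label."""
    # Clip the pooled score away from 0 and 1 to keep the logarithms finite.
    p = min(max(bag_prediction(instance_scores), eps), 1.0 - eps)
    return -(bag_label * math.log(p) + (1 - bag_label) * math.log(1.0 - p))


# Hypothetical instance scores for one positive and one negative bag.
positive_bag = [0.10, 0.80, 0.30]  # e.g., an LD block with one high-scoring SNP
negative_bag = [0.20]              # a single-element negative bag

loss_pos = bag_cross_entropy(positive_bag, 1)  # driven only by the 0.80 instance
loss_neg = bag_cross_entropy(negative_bag, 0)
```

Because only the highest-scoring instance enters the loss of a bag, the remaining variants of a positive LD block are not pushed towards a positive prediction, which is what allows the weak, block-level label to be used.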

As GWAS estimate the marginal effect size of each SNP, the significantly associated variants typically comprise groups of SNPs in LD with each other, out of which only a small fraction is truly causal, and most SNPs are spuriously associated with the trait of interest through their correlations with causal variants. On the other hand, we assume that the causal signals are driven by DNA patterns around the variants, e.g., by TF binding motifs, and could be identified given enough data. Databases such as GWAS Catalog [14] or CAUSALdb [15] aggregate the results of thousands of GWAS, providing a large set of putatively causal variants of the human genome across a range of traits and populations. In order to use these data to infer causal variants, we propose the following MIL scenario: let each $x_{ij}$ represent the DNA sequence centered at a SNP and $y_{ij}$ be unobserved ground-truth labels indicating causal variants. We construct positive bags as independent LD-blocks of putatively causal variants, e.g., surpassing a given significance threshold in any study and grouped using a clumping procedure, while the negative ones are single-element bags of variants without any significant associations. If at least one causal variant is present in each positive bag, this is a valid MIL objective. Assuming that the causal variants share DNA patterns with each other, we can successfully train a sequence model $f$, e.g., a neural network, to predict causal variants.

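Formally, given $s$ GWAS over $v$ SNPs, let $S_i$ be the summary statistics of the $i$-th GWAS and $p_{ij}$ be the p-value for the $j$-th variant in that study. Furthermore, let $b$ be the total number of independent LD-blocks, and $B_1, \dots, B_b \subseteq \{1, \dots, v\}$ be pair-wise disjoint sets of integers indicating which variants belong to which LD block. Writing $x_j$ for the DNA sequence centered at the $j$-th variant and given a significance threshold $T$, the positive bags are then defined as:

$$X_k^{+} = \{x_j : j \in B_k,\ \min_{1 \le i \le s} p_{ij} < T\}, \quad k = 1, \dots, b \tag{4}$$

while the negative bags are defined as:

$$X_j^{-} = \{x_j\}, \quad j \in \{1, \dots, v\} \ \text{such that} \ \min_{1 \le i \le s} p_{ij} \ge T \tag{5}$$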
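The bag construction described above can be sketched as follows, under the assumption that the p-values are available as one list per GWAS and the LD blocks as lists of variant indices. The function name, the toy p-values, and the threshold of 5e-8 are illustrative choices, not values taken from the training setup.

```python
def build_bags(p_values, ld_blocks, threshold):
    """
    p_values:  list of s lists (one per GWAS), each holding v p-values.
    ld_blocks: list of disjoint lists of variant indices (one list per LD block).
    Returns (positive_bags, negative_bags) as lists of variant-index lists.
    """
    n_variants = len(p_values[0])
    # A variant counts as associated if it passes the threshold in any study.
    min_p = [min(study[j] for study in p_values) for j in range(n_variants)]
    significant = [p < threshold for p in min_p]

    positive_bags = []
    for block in ld_blocks:
        bag = [j for j in block if significant[j]]
        if bag:  # blocks without any associated variant yield no positive bag
            positive_bags.append(bag)

    # Every non-associated variant forms its own single-element negative bag.
    negative_bags = [[j] for j in range(n_variants) if not significant[j]]
    return positive_bags, negative_bags


# Toy example: 2 studies, 6 variants, 3 LD blocks.
p_values = [
    [1e-9, 0.5, 1e-10, 0.3, 0.9, 0.2],
    [0.4, 1e-8, 0.7, 0.6, 0.8, 1e-12],
]
ld_blocks = [[0, 1, 2], [3, 4], [5]]
positive_bags, negative_bags = build_bags(p_values, ld_blocks, threshold=5e-8)
```

In this toy example the first and third blocks become positive bags, while variants 3 and 4 become two separate negative bags.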

2.3 Dataset and model training

We constructed the training dataset using data from the CAUSALdb2 database [16], which aggregates 13,709 GWAS summary statistics, resulting in over 2,618,834 putative causal variants from the GRCh37 human genome, which are grouped into 2,772 independent LD-blocks. We further divided the blocks into primary and secondary signals, using labels provided by CAUSALdb2, yielding 4,790 smaller blocks, and excluded blocks with fewer than 10 variants, to reduce the chance of a small number of false positive SNPs driving the training signal. We treated the resulting blocks as separate bag-level positive examples . Since CAUSALdb2 only contains associated variants, we created the negative bags from common () GRCh37 variants which were not present in CAUSALdb2, excluding SNPs further than 128 base pairs away from any CAUSALdb2 variant, yielding a total of 1,045,506 training examples, each forming a separate single-element negative bag.

For model training, we employed a modified, smaller version of the Basenji2 CNN architecture [9], with 4 blocks in the first model stage, 2 blocks in the second stage, a final number of 64 filters, and a single output. We one-hot encoded the 512 base pair DNA sequences around each variant to serve as the inputs $x_{ij}$. As per Eq 2, for each block $X_i$ the CNN makes a separate prediction for each individual input $x_{ij}$, and the maximum value across $j$ is taken as the bag-level prediction. We trained the model for 100 epochs using the Adam optimizer [17] with a learning rate of and exponential decay of per epoch, with a batch size of 32 bags. For regularization, we used dropout on the residual layer connections of the model with a rate of 0.5, and randomly shifted the input sequences by up to 8 base pairs. We used a modified DNA one-hot encoding by introducing a sixth special token “V”, with which we replaced the nucleotide value in the middle of the sequence, i.e., at the SNP position (A = [1,0,0,0,0], C = [0,1,0,0,0], G = [0,0,1,0,0], T = [0,0,0,1,0], N = [0,0,0,0,0], V = [0,0,0,0,1]). The use of this additional token allowed us to employ random shift augmentations by signaling the position of the variant of interest to the network, thus potentially increasing its precision to the base pair level. Using the standard 5-token encoding (e.g., as in [11]) with random shifts would instead force the model to make the same predictions for any neighboring SNPs within the shift range of the variant of interest. To further increase robustness, we employed model ensembling [18] by repeating the training with 5 different random seeds, and training a student model [19] to predict the averaged output of the 5 models. We used the same network architecture for the student model, with the exception of using the standard 5-token encoding and not using data augmentations.
We implemented the models using the PyTorch [20] and PyTorch Lightning [21] software libraries, and trained them using a single NVIDIA A40 48GB GPU and 8 CPU cores per-model, with an average training time of 34 hours. We did not observe a benefit of using larger versions of the model, other network architectures (Enformer [11], BPNet [10], a pretrained MFD [22]), or differentiable alternatives to the max operator in the formulation of F in Eq 2 (log-sum-exponential [23], generalized mean [24], attention-based [25]). Finally, we note that since only “weak” labels are used for model fitting, the network can be further retrained, or fine-tuned, “at test time”, i.e., whenever one obtains results from a new GWAS, by updating the positive bags with the set of new putative variants.
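To make the modified encoding and the shift augmentation concrete, the following is a minimal sketch in plain Python. The function names are hypothetical, and the short window and shift used in the example are only for readability (the setup described above uses 512 base pair inputs and shifts of up to 8 base pairs). The sketch is not the released implementation.

```python
import random

BASES = "ACGT"  # channels 0-3; N stays all-zero; channel 4 marks the variant ("V")


def encode_with_variant_token(sequence, snp_index):
    """One-hot encode a DNA string, replacing the base at snp_index with "V"."""
    encoded = []
    for pos, base in enumerate(sequence.upper()):
        vec = [0, 0, 0, 0, 0]
        if pos == snp_index:
            vec[4] = 1  # the variant position carries no nucleotide identity
        elif base in BASES:
            vec[BASES.index(base)] = 1
        encoded.append(vec)
    return encoded


def random_shift(sequence, snp_index, window, max_shift, rng):
    """Crop `window` bases around the SNP, offset by a random shift.

    Returns the cropped sequence and the SNP position inside the crop.
    """
    shift = rng.randint(-max_shift, max_shift)
    start = snp_index - window // 2 + shift
    if start < 0 or start + window > len(sequence):
        raise ValueError("not enough flanking sequence for this shift")
    return sequence[start:start + window], snp_index - start


# Example with a toy 40 bp sequence, a 16 bp window and shifts of up to 4 bp.
rng = random.Random(0)
flank = "ACGT" * 10
crop, snp_in_crop = random_shift(flank, snp_index=20, window=16, max_shift=4, rng=rng)
encoded = encode_with_variant_token(crop, snp_in_crop)
```

After a shift the variant is no longer at the center of the window, so the "V" channel is the only signal telling the network which position is being scored.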

2.4 Functional annotation of the training data

We utilized GenoSTAN [26] and silencerDB [27] data to annotate CAUSALdb2 variants in terms of their regulatory functions. For GenoSTAN data, we mapped all the variants from CAUSALdb2 into hg38 coordinates using the UCSC genome browser LiftOver tool [28] and annotated them with chromatin state annotations of 127 cell lines for the hg38 genome, which we downloaded from https://www.cmm.in.tum.de/public/paper/GenoSTAN/. We assigned to each variant the chromatin states that recurred in at least 5 different cell lines. A single variant could thus have multiple assigned states due to heterogeneity of the cell line experiments. For example, it could be marked as an enhancer-like element in one cell type and as a repressed region in another. For silencerDB, we downloaded annotations for the GRCh37 genome from http://health.tsinghua.edu.cn/SilencerDB/download/Species/Homo_sapiens.bed, and marked all CAUSALdb2 variants within each silencerDB region as silencers. We performed enrichment analyses for each annotation class by computing the odds ratios of SNPs with the given label compared to a subset of variants passing a given MIFM score threshold, and computed the p-values for enrichment using the Fisher exact test [29].
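The enrichment computation can be illustrated with a small, self-contained sketch. It assumes a 2×2 table that contrasts variants above the score threshold with the remaining variants, and a one-sided test for enrichment. Both the table layout and the sidedness are assumptions made for illustration, and the function names are hypothetical. For realistic variant counts a library routine such as `scipy.stats.fisher_exact` would be the practical choice.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the table [[a, b], [c, d]].

    a: prioritized and annotated      b: prioritized, not annotated
    c: not prioritized, annotated     d: not prioritized, not annotated
    """
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value: P(X >= a) with all margins fixed."""
    row1, col1, total = a + b, a + c, a + b + c + d
    denominator = comb(total, row1)
    # Hypergeometric tail over all tables at least as extreme as the observed one.
    tail = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / denominator


# Toy counts: 3 of 4 prioritized variants are annotated, 1 of 4 remaining ones is.
or_value = odds_ratio(3, 1, 1, 3)
p_value = fisher_exact_greater(3, 1, 1, 3)
```

Here the odds ratio is 9 and the one-sided p-value is 17/70, which shows that a large odds ratio on a handful of variants is not significant on its own.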

2.5 Motif discovery

We used Transcription-Factor Motif Discovery from Importance Scores (TF-MoDISco) [10,30] to identify motifs contributing to MIFM predictions. We computed the attribution scores for all training sequences using DeepLIFT [31] and ran TF-MoDISco with 200,000 positive and 200,000 negative seqlets (motif occurrences). Finally, we matched the resulting patterns to known human TF binding models from HOCOMOCO 11 [32] using TOMTOM [33]. For the in silico mutagenesis analysis of the contributions of positive and negative patterns, we selected sequences containing the 3 top positive and 3 top negative TF-MoDISco patterns matching the ARID3A binding motif. For each positive-negative pattern pair, we did the following:

1. We computed the offset with respect to the ARID3A motif for the positive and negative pattern. As we had access to the starting positions of the TF-MoDISco patterns for each example, we also computed the start position of the ARID3A motif in them.
2. We obtained MIFM predictions for all positive and negative examples.
3. We selected the positions in the negative pattern where the probability of the top nucleotide exceeded 0.4.
4. For each positive example, we replaced the nucleotides at the positions matching the nucleotides selected from the negative pattern.
5. We computed MIFM predictions for the modified positive examples.
6. We repeated steps 3–5 by modifying the negative examples with the positive pattern.
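The selection and replacement steps of the procedure above can be sketched as follows. The pattern is represented here as a list of per-position nucleotide probabilities, and `pattern_start` is assumed to be the already aligned start position obtained from the motif offsets. The function names, the toy pattern, and the toy sequence are hypothetical.

```python
def confident_positions(pattern, min_prob=0.4):
    """Map pattern positions to their top nucleotide where its probability exceeds min_prob."""
    selected = {}
    for pos, probs in enumerate(pattern):
        base, prob = max(probs.items(), key=lambda item: item[1])
        if prob > min_prob:
            selected[pos] = base
    return selected


def transplant_pattern(sequence, pattern_start, pattern, min_prob=0.4):
    """Overwrite the sequence with the confident nucleotides of the pattern."""
    bases = list(sequence)
    for pos, base in confident_positions(pattern, min_prob).items():
        target = pattern_start + pos
        if 0 <= target < len(bases):  # ignore positions falling outside the sequence
            bases[target] = base
    return "".join(bases)


# Toy pattern with two confident positions and one uninformative position.
negative_pattern = [
    {"A": 0.70, "C": 0.10, "G": 0.10, "T": 0.10},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"A": 0.10, "C": 0.10, "G": 0.10, "T": 0.70},
]
modified = transplant_pattern("GGGGGGGG", pattern_start=2, pattern=negative_pattern)
```

Only the confident positions are overwritten, so the uninformative middle position of the toy pattern leaves the original nucleotide in place. The modified sequences would then be scored again with the trained model and compared with the original predictions.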

2.6 Construction and evaluation of polygenic risk scores

We selected GWAS of 10 continuous and 10 binary traits from the CAUSALdb2 database [16] which were performed in populations of European ancestry and had matching traits in UK Biobank (UKB). For each study, we divided the corresponding variants into LD-blocks according to the CAUSALdb2 labels, treating the primary and secondary signals within a single block as separate blocks. We then constructed PGS by selecting the variant(s) with the highest fine-mapping annotation score from each block and using the effect sizes estimated in the corresponding GWAS. This resulted in 13 PGS per GWAS, using annotation scores from: MIFM, “raw” p-values, CADD v1.7 scores [34], pretrained DNN models (Basenji2 [35], DeepSEA-SEI [36], and Enformer [11]), and each of the 7 fine-mapping tools included in the CAUSALdb2 annotations: ABF [37], CAVIARBF [6], FINEMAP [3], PAINTOR [5], SuSiE [4], PolyFun FINEMAP [3,7], and PolyFun SuSiE [4,7]. For the raw p-values, we calculated the annotation scores as 1 minus the p-value of a variant. For Basenji2 and Enformer, we calculated the annotation scores as the maximum differences in predictions for the alternative versus reference allele over all model outputs. To account for differences in ranges for different outputs, we first obtained predictions for common variants in the 1,000 Genomes dataset [38], and used these to normalize the outputs of each prediction track. We evaluated the scores on the African (AFR), Admixed American (AMR), Central/South Asian (CSA), East Asian (EAS) and Middle Eastern (MID) ancestry subsets of UKB, which we defined using the ancestry analysis functionality of pgs-calc [39], with a 1,000 Genomes LD reference panel [38]. This resulted in a total of 100 scenarios (20 traits × 5 ancestries). Within each scenario, we divided the samples into 5 folds and fitted 5 linear models of the PGS and covariates (age, sex, UKB assessment center, genotyping batch, and the first 10 genetic principal components).
Each time we selected a different set of 4 folds for model training and the remaining fold for evaluation, and averaged the final outcome. For each of the 100 scenarios, we assessed the significance of the difference between the performances in terms of the R² score (we used the McFadden pseudo-R² [40] for binary traits) of MIFM and each baseline using a permutation test with 10⁸ permutations. We performed this evaluation in 3 settings, each time selecting the top 1, 5 or 10 variants with the highest annotation score per-block, dividing the effect sizes by the number of variants per-block.
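The variant selection behind each PGS can be sketched in a few lines. The sketch assumes that "dividing the effect sizes by the number of variants per block" refers to the number of variants selected from that block, and it uses made-up variant identifiers, scores, effect sizes, and genotypes. The function names are hypothetical.

```python
def select_top_variants(blocks, scores, top_k):
    """Pick up to top_k variants with the highest annotation score from each block.

    Returns a dict mapping each selected variant to its weight, which is one
    over the number of variants selected from the same block.
    """
    weights = {}
    for block in blocks:
        chosen = sorted(block, key=lambda variant: scores[variant], reverse=True)[:top_k]
        for variant in chosen:
            weights[variant] = 1.0 / len(chosen)
    return weights


def polygenic_score(genotype, effect_sizes, weights):
    """Weighted sum of allele dosages times GWAS effect sizes for one individual."""
    return sum(genotype[v] * effect_sizes[v] * w for v, w in weights.items())


# Made-up example with two blocks.
blocks = [["v1", "v2", "v3"], ["v4", "v5"]]
scores = {"v1": 0.2, "v2": 0.9, "v3": 0.5, "v4": 0.7, "v5": 0.1}
effect_sizes = {"v1": 0.1, "v2": 0.2, "v3": -0.1, "v4": 0.4, "v5": 0.3}
genotype = {"v1": 2, "v2": 1, "v3": 0, "v4": 2, "v5": 1}

pgs_top1 = polygenic_score(genotype, effect_sizes, select_top_variants(blocks, scores, 1))
pgs_top2 = polygenic_score(genotype, effect_sizes, select_top_variants(blocks, scores, 2))
```

The resulting score would then enter the linear model together with the covariates. Only the annotation scores differ between the compared methods, while the effect sizes always come from the same GWAS.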

2.7 GWAS and conditional analyses

We performed GWAS of 4 traits (height, red blood cell count, systolic blood pressure, and heel bone mineral density) on a sample of N = 40,000 unrelated individuals from UKB using the standard linear regression functionality of the BOLT-LMM software [41]. We filtered the SNPs with the following criteria: minor allele frequency (MAF), Hardy-Weinberg Equilibrium with a significance level of , and included imputed variants with an INFO score , which resulted in 9,637,426 SNPs in total. We transformed the phenotypes using the rank-based inverse normal transformation [42] and adjusted them for confounders using age, sex, the identifiers of the genotyping array and UKB assessment center, and the first 10 genetic principal components. For each trait, we constructed a set of independent loci using the clumping functionality of the PLINK software [43] with a significance threshold of for the lead SNPs (variants with the lowest p-value per locus) and a threshold of for secondary variants associated with the lead SNPs. Since we focus our analysis on disentangling the effect sizes of highly correlated variants, we used an R² threshold of 0.9 and a physical distance threshold of 1,000 kb. For each lead SNP and its secondary variants, we fitted three joint models of SNPs and confounders: a baseline model with all variants in the clump, and two models that only included the lead SNP and variants filtered based on either their p-values or MIFM scores. For the filtered models, we retained secondary variants with p-values below the 30th percentile of p-values of all GWAS-associated SNPs, or above the 70th percentile of scores for the MIFM model. For each locus, we additionally fitted a full joint model of all variants on a second, independent sample, and tested for differences in effect size estimates of all significant variants from the previous step between the filtered and full models.
This was to exclude cases where a non-causal variant becomes significant as a “substitute”, due to the true causal variant being removed.
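The percentile-based filtering of secondary variants can be sketched as below. The percentile definition (linear interpolation between order statistics) and the use of all GWAS-associated SNPs as the reference set for both p-values and MIFM scores are assumptions. The function names and all numbers are illustrative.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty list is undefined")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def filter_secondary(secondary, values, reference_values, q, keep_above):
    """Keep secondary variants whose value lies above (or below) the q-th percentile."""
    cutoff = percentile(reference_values, q)
    if keep_above:
        return [v for v in secondary if values[v] > cutoff]
    return [v for v in secondary if values[v] < cutoff]


secondary = ["a", "b", "c"]

# Filter on MIFM scores: keep variants above the 70th percentile of reference scores.
reference_scores = [0.0, 0.25, 0.5, 0.75, 1.0]
mifm_scores = {"a": 0.9, "b": 0.6, "c": 0.75}
kept_by_score = filter_secondary(secondary, mifm_scores, reference_scores, 70, keep_above=True)

# Filter on p-values: keep variants below the 30th percentile of reference p-values.
reference_p = [1e-12, 1e-10, 1e-9, 1e-8, 5e-8]
p_vals = {"a": 1e-11, "b": 1e-9, "c": 2e-10}
kept_by_p = filter_secondary(secondary, p_vals, reference_p, 30, keep_above=False)
```

The retained variants, together with the lead SNP, would then enter the joint regression model of the locus.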

3 Verification and comparison

3.1 MIFM variants are enriched for enhancer, repressed, and silencer chromatin signatures

To characterize variants prioritized by MIFM, we analyzed whether they are enriched for regulatory elements compared to all putatively causal variants from CAUSALdb2. We computed the enrichment for 20 GenoSTAN-defined states [26], which we divided into 6 groups: enhancers, promoters, repressed regions, transcriptional elongations, repressed-enhancer regions, and low-signal regions (Fig 2). MIFM-prioritized variants are significantly enriched for repressed regions (odds ratio (OR) from 1.02 for the lowest quantile to 1.13 for the highest quantile), repressed-enhancer regions (OR from 1.01 to 1.13), and enhancer elements (OR from 1.01 to 1.10), with the highest enrichment for the “strongly” defined enhancer subgroup (“Enh.6”, up to an OR of 1.16). Conversely, we observed a significant depletion of low-signal regions at higher model score quantiles (OR from 0.98 at the 0.7 quantile to 0.97 at the 0.9 quantile), and of transcriptional elongations (OR from 0.99 at the 0.3 quantile to 0.98 at the 0.9 quantile), with the exception of “Gen5’.13”, which was enriched with an OR of up to 1.12. We did not observe significant deviations from 1 in the odds ratios for promoter regions. To understand why MIFM enriches for repressed regions, we further analyzed the repressed and repressed-enhancer variants prioritized by MIFM, and observed a significant enrichment for enhancer regions compared to all repressed and repressed-enhancer variants of CAUSALdb2 (OR up to 1.08 for repressed regions and up to 1.10 for repressed-enhancer regions) (Tables A–B in S1 Appendix), indicating that repressed regions prioritized by MIFM are often active enhancers in other cell lines. Additionally, we analyzed repressed, repressed-enhancer, and enhancer regions in terms of silencers, and observed a significant enrichment in MIFM-prioritized repressed regions and in enhancer regions (OR up to 1.07 in both subsets), and no significant change in OR for repressed-enhancer regions (Tables C–F in S1 Appendix).
The presence of silencers is associated with the H3K27me3 histone modification [44], which also characterizes the repressed GenoSTAN states, further suggesting that MIFM enriches for repressed regions with functional elements. Silencers can act as enhancers depending on the cellular context [45–47], and their enrichment in the enhancer subset can indicate a preference of the model for such “dual” elements.

Fig 2. Enrichment analysis of regulatory elements in variants prioritized by MIFM.

We plot the odds ratios (y-axis) of variants with MIFM scores above each quantile (x-axis) versus all variants in the CAUSALdb2 data. Each plot represents a different regulatory group based on GenoSTAN [26] chromatin state annotations (plots a to f) and silencerDB [27] validated and predicted silencer elements (plot g).

https://doi.org/10.1371/journal.pgen.1012208.g002

3.2 Syntax analysis of a MIFM trained model

In order to analyze the syntax and patterns relevant for a trained MIFM model, we employed TF-MoDISco and identified 161 unique DNA patterns in total, consisting of 67 patterns with positive attribution scores for model predictions and 94 patterns with negative attributions. The negative patterns had a smaller support in terms of corresponding seqlets (instances of similar patterns), with a median of 798 seqlets compared to a median of 1,257 seqlets for a positive pattern. In total, 60% of the positive patterns and 40% of the negative ones had a support of at least 1,000 corresponding seqlet instances in the data. Of these patterns, 20 positive and 49 negative ones were significantly matched to at least one known human TF binding motif using TOMTOM [33]. TF motifs with the highest number of matched patterns are shown in Table F in S1 Appendix. Overall, we observed several motifs that were matched to both positive and negative patterns at the same time.

We further analyzed how predictions of MIFM are influenced by the positive and the negative patterns by performing in silico mutagenesis of sequences with patterns matching the binding motif of the ARID3A protein, a TF with reported interactions with other regulatory elements [48–50]. We modified sequences containing positive patterns upstream of the SNP position by replacing the positive pattern with the negative one, and vice versa, and compared MIFM predictions for the original and modified sequences (Fig 3). Adding negative patterns to sequences with a positive pattern shifted the distributions of scores towards lower values (from an average mean score of 0.54 to 0.48) and increased their spread (average standard deviation from 0.03 to 0.08). Adding positive patterns to “negative” sequences slightly shifted their scores towards higher values (average mean from 0.17 to 0.20, average standard deviation from 0.08 to 0.09). This suggests that the context of a variant is necessary, but not sufficient, for it to be predicted as causal by MIFM: one can “disable” the function of a variant by modifying its context, but modifying the context alone is not enough to make a variant be predicted as causal.

Fig 3. In silico mutagenesis of ARID3A-matching patterns and the corresponding MIFM predictions.

We selected 3 positive patterns (rows) and 3 negative patterns (columns) matching the ARID3A motif which were identified by TF-MoDISco as influencing MIFM predictions. For each positive-negative pattern pair, we computed MIFM scores for sequences containing the positive pattern (odd subcolumns) and sequences with the negative pattern (even subcolumns) and plot the density functions of the scores in blue. We modified the sequences by adding the negative pattern to the positive sequences (odd subcolumns) and vice versa (even subcolumns), scored the modified sequences with MIFM, and plot the density functions of the modified scores in red. The vertical lines denote the means of MIFM scores of the original (blue) and modified sequences (red).

https://doi.org/10.1371/journal.pgen.1012208.g003

4 Applications

4.1 Polygenic risk scores created with MIFM transfer better to non-European ancestries

The predictive performance of PGS can decrease when applied to populations different from the one in which the GWAS summary statistics were obtained [51–53]. Due to varying LD patterns across ancestries, SNPs associated with the phenotype in the GWAS population might not be tagging the causal variants in the target population. As most studies are biased towards European populations [54–56], this can increase health disparities, e.g., by failing to identify individuals at risk in minority ancestries [54]. On the other hand, there is evidence for causal variants and their effect sizes being consistent across ancestries [57–59]. Thus, identifying causal variants should improve the cross-ancestry transferability of PGS.

We created PGS by prioritizing variants using MIFM and 12 baseline methods, and evaluated their performance on non-European ancestries (Sect 2.6). Each method was evaluated for 20 traits and 5 ancestries, yielding a total of 100 scenarios per model. Across all scenarios, the MIFM PGS explained the most phenotypic variance, with an average R² = 0.042, followed by R² = 0.04 for DeepSEA-SEI and CADD (Fig 4). Within each scenario, we compared the performance of PGS created with MIFM and each baseline, and counted the total number of scenarios where MIFM performed significantly better or worse than a baseline (Fig 5). Overall, MIFM performed better in 15% and worse in 3% of all scenarios. The net number of scenarios where MIFM performed better was positive regardless of the baseline, ranging from a net difference of 2% for Enformer to 15% for PAINTOR. The smallest improvements were obtained for the AMR ancestry (9% of scenarios better, 5% worse), while the largest improvements were for the AFR ancestry (19% better, 1% worse) (Fig A in S1 Appendix). With respect to individual traits, MIFM performed the worst for inflammatory bowel disease (5% better, 25% worse) and glucose levels (2% better, 12% worse), while the most consistent improvements were for serum urate levels (72% better) and HDL cholesterol levels (42% better) (Fig B in S1 Appendix). All individual R² scores for each model-ancestry-trait combination are included in S1 Table. As certain baseline methods, e.g., SUSIE, are designed to output a set of multiple putatively causal variants instead of a single most likely one, we repeated the above evaluation in two additional settings, selecting the top 5 and top 10 prioritized variants per block (Figs C–F in S1 Appendix, S2–S3 Tables). This improved the R² scores of the 7 fine-mapping tools, and in the top 5 setting, PAINTOR and PolyFun FINEMAP achieved a higher score than MIFM (R² = 0.43 vs. R² = 0.42 for MIFM, Fig E in S1 Appendix). In the top 10 setting, Enformer was significantly better in a larger number of scenarios (6 better, 4 worse, Fig D in S1 Appendix).
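As an illustration of the quantity being compared, the following minimal Python sketch builds a PGS as a weighted sum of allele dosages and scores it with R². It assumes a quantitative trait and takes R² to be the squared Pearson correlation; the evaluation actually used is described in Sect 2.6 and may differ (e.g., for binary traits). The function names, toy genotypes, weights and phenotypes are all hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2) for one individual."""
    return sum(d * w for d, w in zip(dosages, weights))


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Toy target-ancestry sample: 5 individuals, 3 prioritized variants
# (one per LD block). All values are made up for illustration.
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2], [1, 0, 0]]
weights = [0.5, -0.2, 0.1]            # hypothetical GWAS effect sizes
phenotype = [0.1, 0.4, 1.2, -0.3, 0.6]

scores = [polygenic_score(g, weights) for g in genotypes]
r2 = r_squared(scores, phenotype)
print(f"PGS R^2 on the toy sample: {r2:.3f}")
```

In this reading, each of the 100 scenarios corresponds to one such R² per method, computed for one trait in one target ancestry.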

Fig 4. Mean performance measured by R² of PGS created with MIFM and 12 baseline methods on 5 non-European ancestries and 20 traits.

We created PGS using results from 20 GWAS performed on European samples and evaluated them on 5 non-European samples, yielding 100 test scenarios per model.

https://doi.org/10.1371/journal.pgen.1012208.g004

Fig 5. Performance comparison of PGS created with MIFM and 12 baseline methods on 5 non-European ancestries and 20 traits.

We created PGS using results from 20 GWAS performed on European samples and evaluated them on 5 non-European samples, yielding 100 test scenarios per model. For each baseline, we counted the number of scenarios where MIFM would perform better than the baseline (in green), worse (in red), or not significantly different (in gray).

https://doi.org/10.1371/journal.pgen.1012208.g005

4.2 MIFM enables discovery of additional GWAS signals

Joint regression models of multiple variants can estimate the causal effect sizes instead of the marginal ones [60]. However, in the presence of a large number of highly correlated SNPs, large sample sizes are needed to disentangle the signals. Thus, it is often infeasible to test all the putative variants jointly, especially for studies of modest size. We used MIFM to prioritize variants for conditional testing of highly-correlated () SNPs and compared the results with a naive approach of selecting all variants in high LD in 4 moderately sized GWAS (Table 1). In each GWAS, the joint regression of MIFM-prioritized SNPs yielded a larger number of significant variants, 47 in total, compared to 32 variants from the baseline models. Two of the 47 variants had significantly different effect size estimates in the larger model and might be false positives. One of the variants identified by the MIFM joint model had a marginal p-value above the significance threshold in the GWAS and would otherwise have gone undetected. We also counted secondary signals, i.e., cases where more than 1 SNP was significant in a joint model; here, MIFM identified two more cases than the baseline models (Table 2).
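The difference between marginal and joint effect size estimates can be made concrete with a toy example. The sketch below is not the analysis pipeline used here; it is a noise-free illustration with two hypothetical SNPs in high LD, of which only the first affects the phenotype. The marginal estimate of the non-causal SNP is inflated by its correlation with the causal one, while the joint ordinary least squares fit recovers an effect of zero. All names and data are made up.

```python
def centered(v):
    """Subtract the mean so that no intercept term is needed."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def marginal_beta(x, y):
    """Single-SNP regression slope, as in a standard GWAS."""
    x, y = centered(x), centered(y)
    return dot(x, y) / dot(x, x)


def joint_betas(x1, x2, y):
    """Two-SNP ordinary least squares via the 2x2 normal equations."""
    x1, x2, y = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 ** 2  # approaches 0 as the SNPs become collinear
    return ((s22 * s1y - s12 * s2y) / det,
            (s11 * s2y - s12 * s1y) / det)


# Toy dosages for 8 individuals; snp2 differs from snp1 in one individual.
snp1 = [0, 1, 2, 0, 1, 2, 1, 0]      # causal, true effect 1.0
snp2 = [0, 1, 2, 0, 1, 1, 1, 0]      # non-causal, in high LD with snp1
pheno = [1.0 * g for g in snp1]      # noise-free for clarity

m1, m2 = marginal_beta(snp1, pheno), marginal_beta(snp2, pheno)
j1, j2 = joint_betas(snp1, snp2, pheno)
print(f"marginal: snp1={m1:.3f}, snp2={m2:.3f}")
print(f"joint:    snp1={j1:.3f}, snp2={j2:.3f}")
```

The determinant `det` shrinks as the two SNPs become more correlated, which is why, with real and noisy data, joint estimates for highly correlated variants become unstable unless the sample is large or the candidate set is reduced beforehand.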

Table 1. Number of variants identified by conditional analyses of GWAS results for selected phenotypes. We tested SNPs highly correlated with each lead SNP using a joint model of all variants in LD (2nd column) and of variants prioritized by p-values (3rd column) and by MIFM (4th column). In the 5th column we report the number of variants identified with MIFM whose effect size estimates matched the estimates from the full model.

https://doi.org/10.1371/journal.pgen.1012208.t001

Table 2. Number of secondary signals identified by conditional analyses of GWAS results for selected phenotypes. We tested SNPs highly correlated with each lead SNP using a joint model of all variants in LD (2nd column) and of variants prioritized by p-values (3rd column) and by MIFM (4th column), and report the number of secondary signals, i.e., cases where more than one variant was significant in the joint model.

https://doi.org/10.1371/journal.pgen.1012208.t002

5 Discussion

Identifying causal non-coding variants is typically done with fine-mapping methods which rely on summary statistics and population LD structure, without directly using the underlying DNA sequences. Alternatively, one can employ sequence models of gene expression, which, however, are trained on reference genome data and do not observe individual-level DNA variation. To this end, we proposed a problem formulation of fine-mapping using the MIL objective, where we predict the presence of GWAS associations within LD-blocks containing multiple variants. By using the underlying DNA sequences as input features we can exploit similarities in DNA patterns between causal variants, while by constructing the labels using GWAS summary statistics we indirectly incorporate the individual-level genetic variation which drives SNP associations.
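The bag-level formulation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the training code: each LD block is treated as a bag of per-variant causal probabilities, the bag probability is obtained by max-pooling (one common MIL aggregator, assumed here for simplicity), and the loss is the binary cross-entropy against a block-level label indicating whether the block contains a GWAS association. The function names and probabilities are hypothetical.

```python
from math import log


def bag_probability(instance_probs):
    """Max-pooling: the block is as likely positive as its top variant."""
    return max(instance_probs)


def bag_bce(instance_probs, bag_label, eps=1e-12):
    """Binary cross-entropy between the pooled probability and the label."""
    p = min(max(bag_probability(instance_probs), eps), 1.0 - eps)
    return -(bag_label * log(p) + (1 - bag_label) * log(1.0 - p))


# Hypothetical per-variant outputs of a sequence model for two LD blocks.
block_with_association = [0.05, 0.90, 0.10]
block_without_association = [0.05, 0.10, 0.02]

loss_pos = bag_bce(block_with_association, 1)
loss_neg = bag_bce(block_without_association, 0)
print(f"positive block loss: {loss_pos:.4f}")
print(f"negative block loss: {loss_neg:.4f}")
```

Only the block-level label enters the loss, so no variant needs a ground-truth causal label; the per-variant probabilities are learned indirectly and can later be used to rank variants within a block.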

Using this approach, we trained a DNN model which predicts the probability of a SNP being causal given its neighboring DNA sequence, allowing us to prioritize variants of interest. One of the motivations for identifying causal variants is to robustly predict genetic liability of a phenotype across different populations, especially those which are under-represented in GWAS. By evaluating MIFM-prioritized variants across a range of traits and ancestries, we were able to increase the robustness of polygenic score predictions compared to a wide range of baselines. Furthermore, we showed that utilizing sequence information can be useful for disentangling highly-correlated GWAS variants, a task otherwise statistically infeasible with typical sample sizes.

We note that our goal was to propose a framework for training variant-prioritization models, rather than to develop a new DNN architecture. We employed a relatively lightweight model for our experiments (fewer than 25,000 parameters), and while we did not observe improvements with more complex architectures, we did not conduct an exhaustive comparison of possible DNN models. Besides improving the predictive performance, a valuable extension would be to increase the interpretability of the model with interpretable-by-design architectures [61,62]. We further note that as MIFM utilizes a database of GWAS results, it can be continuously fine-tuned whenever new summary statistics are available, each time further narrowing down the MIL objective.

We showed how one can utilize the vast amount of available GWAS results to train machine learning models for variant prioritization, overcoming the problem of inaccurate labels due to confounding from LD. Such models can complement traditional fine-mapping methods by reducing the number of putative variants to be analyzed, even without access to the corresponding test statistics. Finally, by introducing base-pair level variations in the training data, this paradigm can be used to increase the robustness of existing DNA sequence models.

Supporting information

S1 Appendix. Document containing supplementary Figs A-F and supplementary Tables A-F.

Fig A in S1 Appendix. Ancestry-stratified performance comparison of polygenic scores (PGS) created with MIFM and baseline methods on 5 non-European ancestries and 20 phenotypes. We counted the number of scenarios where MIFM would perform better than a baseline (in green), worse (in red), or not significantly different (in gray).

Fig B in S1 Appendix. Per-trait performance comparison of PGS created with MIFM and baseline methods on 5 non-European ancestries and 20 phenotypes. We counted the number of scenarios where MIFM would perform better than a baseline (in green), worse (in red), or not significantly different (in gray). Traits are sorted by the net difference in scenarios where MIFM was better, i.e., #Better - #Worse.

Fig C in S1 Appendix. Performance comparison of top-5 variants-per-block PGS created with MIFM and 12 baseline methods on 5 non-European ancestries and 20 traits. We created PGS using results from 20 genome-wide association studies (GWASs) performed on European samples and evaluated them on 5 non-European samples, yielding 100 test scenarios per model. For each baseline, we counted the number of scenarios where MIFM would perform better than the baseline (in green), worse (in red), or not significantly different (in gray).

Fig D in S1 Appendix. Performance comparison of top-10 variants-per-block PGS created with MIFM and 12 baseline methods on 5 non-European ancestries and 20 traits. We created PGS using results from 20 GWASs performed on European samples and evaluated them on 5 non-European samples, yielding 100 test scenarios per model. For each baseline, we counted the number of scenarios where MIFM would perform better than the baseline (in green), worse (in red), or not significantly different (in gray).

Fig E in S1 Appendix. Mean performance measured by R² of top-5 variants-per-block PGS created with MIFM and 12 baseline methods on 5 non-European ancestries and 20 traits. We created PGS using results from 20 GWASs performed on European samples and evaluated them on 5 non-European samples, yielding 100 test scenarios per model.

Fig F in S1 Appendix. Mean performance measured by R² of top-10 variants-per-block PGS created with MIFM and 12 baseline methods on 5 non-European ancestries and 20 traits. We created PGS using results from 20 GWASs performed on European samples and evaluated them on 5 non-European samples, yielding 100 test scenarios per model.

Table A in S1 Appendix. Enrichment of enhancer regions in repressed-enhancer regions prioritized by MIFM.

Table B in S1 Appendix. Enrichment of enhancer regions in repressed regions prioritized by MIFM.

Table C in S1 Appendix. Enrichment of silencer elements in repressed-enhancer regions prioritized by MIFM.

Table D in S1 Appendix. Enrichment of silencers in repressed regions prioritized by MIFM.

Table E in S1 Appendix. Enrichment of silencers in enhancer regions prioritized by MIFM.

Table F in S1 Appendix. Transcription factor motifs matched to patterns identified in MIFM using Transcription-Factor Motif Discovery from Importance Scores (TF-MoDISco). Pattern type denotes whether a TF-MoDISco pattern contributes positively or negatively to MIFM predictions. TF motif denotes the name of the transcription factor. No. seqlets denotes the total number of TF-MoDISco seqlets matching the given transcription factor (TF) motif. No. patterns denotes the total number of different TF-MoDISco patterns matching the given TF motif.

https://doi.org/10.1371/journal.pgen.1012208.s001

(PDF)

S1 Table. R² scores for each model-ancestry-trait combination of the cross-ancestry PGS evaluation for the top-1 variant-per-block setting.

https://doi.org/10.1371/journal.pgen.1012208.s002

(TSV)

S2 Table. R² scores for each model-ancestry-trait combination of the cross-ancestry PGS evaluation for the top-5 variants-per-block setting.

https://doi.org/10.1371/journal.pgen.1012208.s003

(TSV)

S3 Table. R² scores for each model-ancestry-trait combination of the cross-ancestry PGS evaluation for the top-10 variants-per-block setting.

https://doi.org/10.1371/journal.pgen.1012208.s004

(TSV)

Acknowledgments

This research has been conducted using the UK Biobank Resource under Application Number 77717.

References

1. Sinnott-Armstrong N, Naqvi S, Rivas M, Pritchard JK. GWAS of three molecular traits highlights core genes and pathways alongside a highly polygenic background. Elife. 2021;10:e58615.
2. Hormozdiari F, Kostem E, Kang EY, Pasaniuc B, Eskin E. Identifying causal variants at loci with multiple signals of association. In: Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics. 2014. p. 610–1. https://doi.org/10.1145/2649387.2660800
3. Benner C, Spencer CCA, Havulinna AS, Salomaa V, Ripatti S, Pirinen M. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics. 2016;32(10):1493–501. pmid:26773131
4. Wang G, Sarkar A, Carbonetto P, Stephens M. A simple new approach to variable selection in regression, with application to genetic fine mapping. J R Stat Soc Series B Stat Methodol. 2020;82(5):1273–300. pmid:37220626
5. Kichaev G, Yang W-Y, Lindstrom S, Hormozdiari F, Eskin E, Price AL, et al. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet. 2014;10(10):e1004722. pmid:25357204
6. Chen W, Larrabee BR, Ovsyannikova IG, Kennedy RB, Haralambieva IH, Poland GA, et al. Fine mapping causal variants with an approximate Bayesian method using marginal test statistics. Genetics. 2015;200(3):719–36.
7. Weissbrod O, Hormozdiari F, Benner C, Cui R, Ulirsch J, Gazal S, et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat Genet. 2020;52(12):1355–63. pmid:33199916
8. Zhou J, Troyanskaya OG. Predicting effects of noncoding variants with deep learning-based sequence model. Nat Methods. 2015;12(10):931–4. pmid:26301843
9. Kelley DR. Cross-species regulatory sequence activity prediction. PLoS Comput Biol. 2020;16(7):e1008050. pmid:32687525
10. Avsec Ž, Weilert M, Shrikumar A, Krueger S, Alexandari A, Dalal K, et al. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat Genet. 2021;53(3):354–66. pmid:33603233
11. Avsec Ž, Agarwal V, Visentin D, Ledsam JR, Grabska-Barwinska A, Taylor KR, et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat Methods. 2021;18(10):1196–203. pmid:34608324
12. Kumasaka N, Knights AJ, Gaffney DJ. High-resolution genetic mapping of putative causal interactions between regions of open chromatin. Nat Genet. 2019;51(1):128–37. pmid:30478436
13. Broekema RV, Bakker OB, Jonkers IH. A practical view of fine-mapping and gene prioritization in the post-genome-wide association era. Open Biol. 2020;10(1):190221. pmid:31937202
14. Cerezo M, Sollis E, Ji Y, Lewis E, Abid A, Bircan KO, et al. The NHGRI-EBI GWAS Catalog: standards for reusability, sustainability and diversity. Nucleic Acids Res. 2025;53(D1):D998–1005. pmid:39530240
15. Wang J, Huang D, Zhou Y, Yao H, Liu H, Zhai S, et al. CAUSALdb: a database for disease/trait causal variants identified using summary statistics of genome-wide association studies. Nucleic Acids Res. 2020;48(D1):D807–16. pmid:31691819
16. Wang J, Ouyang L, You T, Yang N, Xu X, Zhang W, et al. CAUSALdb2: an updated database for causal variants of complex traits. Nucleic Acids Res. 2025;53(D1):D1295–301. pmid:39558176
17. Kingma DP. Adam: a method for stochastic optimization. 2014. https://arxiv.org/abs/1412.6980
18. Ganaie MA, Hu M, Malik AK, Tanveer M, Suganthan PN. Ensemble deep learning: a review. Engineering Applications of Artificial Intelligence. 2022;115:105151.
19. Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network. arXiv preprint. 2015. https://arxiv.org/abs/1503.02531
20. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, et al. Pytorch: an imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems. 2019;32.
21. William Falcon and The PyTorch Lightning Team. PyTorch Lightning. 2019.
22. Rakowski A, Monti R, Huryn V, Lemanczyk M, Ohler U, Lippert C. Metadata-guided feature disentanglement for functional genomics. Bioinformatics. 2024;40(Suppl 2):ii4–10. pmid:39230700
23. Boyd SP, Vandenberghe L. Convex optimization. Cambridge University Press; 2004.
24. Babenko B. Multiple instance learning: algorithms and applications. PubMed. 2008;19.
25. Ilse M, Tomczak J, Welling M. Attention-based deep multiple instance learning. In: International Conference on Machine Learning. 2018. p. 2127–36.
26. Zacher B, Michel M, Schwalb B, Cramer P, Tresch A, Gagneur J. Accurate Promoter and Enhancer Identification in 127 ENCODE and Roadmap Epigenomics Cell Types and Tissues by GenoSTAN. PLoS One. 2017;12(1):e0169249. pmid:28056037
27. Zeng W, Chen S, Cui X, Chen X, Gao Z, Jiang R. SilencerDB: a comprehensive database of silencers. Nucleic Acids Res. 2021;49(D1):D221–8. pmid:33045745
28. Hinrichs AS, Karolchik D, Baertsch R, Barber GP, Bejerano G, Clawson H, et al. The UCSC genome browser database: update 2006. Nucleic Acids Research. 2006;34(suppl_1):D590–8.
29. Fisher RA. On the interpretation of χ² from contingency tables, and the calculation of P. Journal of the Royal Statistical Society. 1922;85(1):87.
30. Shrikumar A, Tian K, Avsec Ž, Shcherbina A, Banerjee A, Sharmin M, et al. Technical note on transcription factor motif discovery from importance scores (TF-MoDISco) version 0.5. arXiv preprint. 2018. https://arxiv.org/abs/1811.00416
31. Shrikumar A, Greenside P, Kundaje A. Learning important features through propagating activation differences. In: International conference on machine learning. 2017. p. 3145–53.
32. Kulakovskiy IV, Medvedeva YA, Schaefer U, Kasianov AS, Vorontsov IE, Bajic VB, et al. HOCOMOCO: a comprehensive collection of human transcription factor binding sites models. Nucleic Acids Research. 2013;41(D1):D195–202.
33. Tanaka E, Bailey T, Grant CE, Noble WS, Keich U. Improved similarity scores for comparing motifs. Bioinformatics. 2011;27(12):1603–9. pmid:21543443
34. Schubach M, Maass T, Nazaretyan L, Röner S, Kircher M. CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic Acids Res. 2024;52(D1):D1143–54. pmid:38183205
35. Kelley DR, Reshef YA, Bileschi M, Belanger D, McLean CY, Snoek J. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 2018;28(5):739–50. pmid:29588361
36. Chen KM, Wong AK, Troyanskaya OG, Zhou J. A sequence-based global map of regulatory activity for deciphering human genetics. Nature Genetics. 2022;54(7):940–9.
37. Wakefield J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am J Hum Genet. 2007;81(2):208–27. pmid:17668372
38. 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A global reference for human genetic variation. Nature. 2015;526(7571):68–74. pmid:26432245
39. Lambert SA, Wingfield B, Gibson JT, Gil L, Ramachandran S, Yvon F, et al. Enhancing the polygenic score catalog with tools for score calculation and ancestry normalization. Nat Genet. 2024;56(10):1989–94. pmid:39327485
40. McFadden D. Regression-based specification tests for the multinomial logit model. Journal of Econometrics. 1987;34(1–2):63–82.
41. Loh P-R, Tucker G, Bulik-Sullivan BK, Vilhjálmsson BJ, Finucane HK, Salem RM, et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat Genet. 2015;47(3):284–90. pmid:25642633
42. Beasley TM, Erickson S, Allison DB. Rank-based inverse normal transformations are increasingly used, but are they merited?. Behav Genet. 2009;39(5):580–95. pmid:19526352
43. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75. pmid:17701901
44. Cai Y, Zhang Y, Loh YP, Tng JQ, Lim MC, Cao Z, et al. H3K27me3-rich genomic regions can function as silencers to repress gene expression via chromatin interactions. Nat Commun. 2021;12(1):719. pmid:33514712
45. Gisselbrecht SS, Palagi A, Kurland JV, Rogers JM, Ozadam H, Zhan Y, et al. Transcriptional silencers in drosophila serve a dual role as transcriptional enhancers in alternate cellular contexts. Mol Cell. 2020;77(2):324-337.e8. pmid:31704182
46. Ngan CY, Wong CH, Tjong H, Wang W, Goldfeder RL, Choi C, et al. Chromatin interaction analyses elucidate the roles of PRC2-bound silencers in mouse development. Nat Genet. 2020;52(3):264–72. pmid:32094912
47. Della Rosa M, Spivakov M. Silencers in the spotlight. Nat Genet. 2020;52(3):244–5. pmid:32094910
48. Garton J, Shankar M, Chapman B, Rose K, Gaffney PM, Webb CF. Deficiencies in the DNA Binding Protein ARID3a Alter Chromatin Structures Important for Early Human Erythropoiesis. Immunohorizons. 2021;5(10):802–17. pmid:34663594
49. Saadat KASM, Lestari W, Pratama E, Ma T, Iseki S, Tatsumi M, et al. Distinct and overlapping roles of ARID3A and ARID3B in regulating E2F-dependent transcription via direct binding to E2F target genes. International Journal of Oncology. 2021;58(4):1–12.
50. Shen M, Li S, Zhao Y, Liu Y, Liu Z, Huan L, et al. Hepatic ARID3A facilitates liver cancer malignancy by cooperating with CEP131 to regulate an embryonic stem cell-like gene signature. Cell Death Dis. 2022;13(8):732. pmid:36008383
51. Martin AR, Gignoux CR, Walters RK, Wojcik GL, Neale BM, Gravel S, et al. Human demographic history impacts genetic risk prediction across diverse populations. Am J Hum Genet. 2017;100(4):635–49. pmid:28366442
52. Duncan L, Shen H, Gelaye B, Meijsen J, Ressler K, Feldman M, et al. Analysis of polygenic risk score usage and performance in diverse human populations. Nature Communications. 2019;10(1):3328.
53. Mars N, Kerminen S, Feng Y-CA, Kanai M, Läll K, Thomas LF, et al. Genome-wide risk prediction of common diseases across ancestries in one million people. Cell Genom. 2022;2(4):None. pmid:35591975
54. Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet. 2019;51(4):584–91. pmid:30926966
55. Sirugo G, Williams SM, Tishkoff SA. The missing diversity in human genetic studies. Cell. 2019;177(1):26–31.
56. Fitipaldi H, Franks PW. Ethnic, gender and other sociodemographic biases in genome-wide association studies for the most burdensome non-communicable diseases: 2005-2022. Hum Mol Genet. 2023;32(3):520–32. pmid:36190496
57. Wang Y, Guo J, Ni G, Yang J, Visscher PM, Yengo L. Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations. Nat Commun. 2020;11(1):3865. pmid:32737319
58. Saitou M, Dahl A, Wang Q, Liu X. Allele frequency differences of causal variants have a major impact on low cross-ancestry portability of PRS. medRxiv. 2022:2022–10.
59. Hou K, Ding Y, Xu Z, Wu Y, Bhattacharya A, Mester R, et al. Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. Nat Genet. 2023;55(4):549–58. pmid:36941441
60. Yang J, Ferreira T, Morris AP, Medland SE, Madden PAF, Heath AC, et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet. 2012;44(4):369–75.
61. Novakovsky G, Fornes O, Saraswat M, Mostafavi S, Wasserman WW. ExplaiNN: interpretable and transparent neural networks for genomics. Genome Biol. 2023;24(1):154. pmid:37370113
62. Tseng AM, Eraslan G, Biancalani T, Scalia G. A mechanistically interpretable neural network for regulatory genomics. arXiv preprint. 2024. https://arxiv.org/abs/2410.06211

[ref1] 1. Sinnott-Armstrong N, Naqvi S, Rivas M, Pritchard JK. GWAS of three molecular traits highlights core genes and pathways alongside a highly polygenic background. Elife. 2021;10:e58615.
View Article
Google Scholar

[2] View Article

[3] Google Scholar

[ref2] 2. "Hormozdiari F, Kostem E, kang EY, Pasaniuc B, Eskin E. Identifying causal variants at loci with multiple signals of association. In: Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics. 2014. p. 610–1. https://doi.org/10.1145/2649387.2660800

[ref3] 3. Benner C, Spencer CCA, Havulinna AS, Salomaa V, Ripatti S, Pirinen M. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics. 2016;32(10):1493–501. pmid:26773131
View Article
PubMed/NCBI
Google Scholar

[6] View Article

[7] PubMed/NCBI

[8] Google Scholar

[ref4] 4. Wang G, Sarkar A, Carbonetto P, Stephens M. A simple new approach to variable selection in regression, with application to genetic fine mapping. J R Stat Soc Series B Stat Methodol. 2020;82(5):1273–300. pmid:37220626
View Article
PubMed/NCBI
Google Scholar

[10] View Article

[11] PubMed/NCBI

[12] Google Scholar

[ref5] 5. Kichaev G, Yang W-Y, Lindstrom S, Hormozdiari F, Eskin E, Price AL, et al. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet. 2014;10(10):e1004722. pmid:25357204
View Article
PubMed/NCBI
Google Scholar

[14] View Article

[15] PubMed/NCBI

[16] Google Scholar

[ref6] 6. Chen W, Larrabee BR, Ovsyannikova IG, Kennedy RB, Haralambieva IH, Poland GA, et al. Fine mapping causal variants with an approximate Bayesian method using marginal test statistics. Genetics. 2015;200(3):719–36.
View Article
Google Scholar

[18] View Article

[19] Google Scholar

[ref7] 7. Weissbrod O, Hormozdiari F, Benner C, Cui R, Ulirsch J, Gazal S, et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat Genet. 2020;52(12):1355–63. pmid:33199916
View Article
PubMed/NCBI
Google Scholar

[21] View Article

[22] PubMed/NCBI

[23] Google Scholar

[ref8] 8. Zhou J, Troyanskaya OG. Predicting effects of noncoding variants with deep learning-based sequence model. Nat Methods. 2015;12(10):931–4. pmid:26301843
View Article
PubMed/NCBI
Google Scholar

[25] View Article

[26] PubMed/NCBI

[27] Google Scholar

[ref9] 9. Kelley DR. Cross-species regulatory sequence activity prediction. PLoS Comput Biol. 2020;16(7):e1008050. pmid:32687525
View Article
PubMed/NCBI
Google Scholar

[29] View Article

[30] PubMed/NCBI

[31] Google Scholar

[ref10] 10. Avsec Ž, Weilert M, Shrikumar A, Krueger S, Alexandari A, Dalal K, et al. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat Genet. 2021;53(3):354–66. pmid:33603233
View Article
PubMed/NCBI
Google Scholar

[33] View Article

[34] PubMed/NCBI

[35] Google Scholar

[ref11] 11. Avsec Ž, Agarwal V, Visentin D, Ledsam JR, Grabska-Barwinska A, Taylor KR, et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat Methods. 2021;18(10):1196–203. pmid:34608324
View Article
PubMed/NCBI
Google Scholar

[37] View Article

[38] PubMed/NCBI

[39] Google Scholar

[ref12] 12. Kumasaka N, Knights AJ, Gaffney DJ. High-resolution genetic mapping of putative causal interactions between regions of open chromatin. Nat Genet. 2019;51(1):128–37. pmid:30478436
View Article
PubMed/NCBI
Google Scholar

[41] View Article

[42] PubMed/NCBI

[43] Google Scholar

[ref13] 13. Broekema RV, Bakker OB, Jonkers IH. A practical view of fine-mapping and gene prioritization in the post-genome-wide association era. Open Biol. 2020;10(1):190221. pmid:31937202
View Article
PubMed/NCBI
Google Scholar

[45] View Article

[46] PubMed/NCBI

[47] Google Scholar

[ref14] 14. Cerezo M, Sollis E, Ji Y, Lewis E, Abid A, Bircan KO, et al. The NHGRI-EBI GWAS Catalog: standards for reusability, sustainability and diversity. Nucleic Acids Res. 2025;53(D1):D998–1005. pmid:39530240
View Article
PubMed/NCBI
Google Scholar

[49] View Article

[50] PubMed/NCBI

[51] Google Scholar

[ref15] 15. Wang J, Huang D, Zhou Y, Yao H, Liu H, Zhai S, et al. CAUSALdb: a database for disease/trait causal variants identified using summary statistics of genome-wide association studies. Nucleic Acids Res. 2020;48(D1):D807–16. pmid:31691819
View Article
PubMed/NCBI
Google Scholar

[53] View Article

[54] PubMed/NCBI

[55] Google Scholar

[ref16] 16. Wang J, Ouyang L, You T, Yang N, Xu X, Zhang W, et al. CAUSALdb2: an updated database for causal variants of complex traits. Nucleic Acids Res. 2025;53(D1):D1295–301. pmid:39558176
View Article
PubMed/NCBI
Google Scholar

[57] View Article

[58] PubMed/NCBI

[59] Google Scholar

[ref17] 17. "Kingma DP. Adam: a method for stochastic optimization. 2014. https://arxiv.org/abs/1412.6980

[ref18] 18. Ganaie MA, Hu M, Malik AK, Tanveer M, Suganthan PN. Ensemble deep learning: a review. Engineering Applications of Artificial Intelligence. 2022;115:105151.
View Article
Google Scholar

[62] View Article

[63] Google Scholar

[ref19] 19. Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network. arXiv preprint. 2015. https://arxiv.org/abs/1503.02531

[ref20] 20. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, et al. Pytorch: an imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems. 2019;32.
View Article
Google Scholar

[66] View Article

[67] Google Scholar

[ref21] 21. William Falcon and The PyTorch Lightning Team. PyTorch Lightning. 2019.

[ref22] 22. Rakowski A, Monti R, Huryn V, Lemanczyk M, Ohler U, Lippert C. Metadata-guided feature disentanglement for functional genomics. Bioinformatics. 2024;40(Suppl 2):ii4–10. pmid:39230700
View Article
PubMed/NCBI
Google Scholar

[70] View Article

[71] PubMed/NCBI

[72] Google Scholar

[ref23] 23. Boyd SP, Vandenberghe L. Convex optimization. Cambridge University Press; 2004.

[ref24] 24. Babenko B. Multiple instance learning: algorithms and applications. PubMed. 2008;19.
View Article
Google Scholar

[75] View Article

[76] Google Scholar

[ref25] 25. Ilse M, Tomczak J, Welling M. Attention-based deep multiple instance learning. In: International Conference on Machine Learning. 2018. p. 2127–36.

[ref26] 26. Zacher B, Michel M, Schwalb B, Cramer P, Tresch A, Gagneur J. Accurate Promoter and Enhancer Identification in 127 ENCODE and Roadmap Epigenomics Cell Types and Tissues by GenoSTAN. PLoS One. 2017;12(1):e0169249. pmid:28056037
View Article
PubMed/NCBI
Google Scholar

[79] View Article

[80] PubMed/NCBI

[81] Google Scholar

[ref27] 27. Zeng W, Chen S, Cui X, Chen X, Gao Z, Jiang R. SilencerDB: a comprehensive database of silencers. Nucleic Acids Res. 2021;49(D1):D221–8. pmid:33045745
View Article
PubMed/NCBI
Google Scholar

[83] View Article

[84] PubMed/NCBI

[85] Google Scholar

[ref28] 28. Hinrichs AS, Karolchik D, Baertsch R, Barber GP, Bejerano G, Clawson H, et al. The UCSC genome browser database: update 2006. Nucleic Acids Research. 2006;34(suppl_1):D590–8.
View Article
Google Scholar

[87] View Article

[88] Google Scholar

[ref29] 29. Fisher RA. On the interpretation of χ 2 from contingency tables, and the calculation of P. Journal of the Royal Statistical Society. 1922;85(1):87.
View Article
Google Scholar

[90] View Article

[91] Google Scholar

[ref30] 30. Shrikumar A, Tian K, Avsec Ž, Shcherbina A, Banerjee A, Sharmin M, et al. Technical note on transcription factor motif discovery from importance scores (TF-MoDISco) version 0.5. arXiv preprint. 2018. https://arxiv.org/abs/1811.00416

[ref31] 31. Shrikumar A, Greenside P, Kundaje A. Learning important features through propagating activation differences. In: International conference on machine learning. 2017. p. 3145–53.

[ref32] 32. Kulakovskiy IV, Medvedeva YA, Schaefer U, Kasianov AS, Vorontsov IE, Bajic VB, et al. HOCOMOCO: a comprehensive collection of human transcription factor binding sites models. Nucleic Acids Research. 2013;41(D1):D195–202.

[ref33] 33. Tanaka E, Bailey T, Grant CE, Noble WS, Keich U. Improved similarity scores for comparing motifs. Bioinformatics. 2011;27(12):1603–9. pmid:21543443

[ref34] 34. Schubach M, Maass T, Nazaretyan L, Röner S, Kircher M. CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic Acids Res. 2024;52(D1):D1143–54. pmid:38183205

[ref35] 35. Kelley DR, Reshef YA, Bileschi M, Belanger D, McLean CY, Snoek J. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 2018;28(5):739–50. pmid:29588361

[ref36] 36. Chen KM, Wong AK, Troyanskaya OG, Zhou J. A sequence-based global map of regulatory activity for deciphering human genetics. Nature Genetics. 2022;54(7):940–9.

[ref37] 37. Wakefield J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am J Hum Genet. 2007;81(2):208–27. pmid:17668372

[ref38] 38. 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A global reference for human genetic variation. Nature. 2015;526(7571):68–74. pmid:26432245

[ref39] 39. Lambert SA, Wingfield B, Gibson JT, Gil L, Ramachandran S, Yvon F, et al. Enhancing the polygenic score catalog with tools for score calculation and ancestry normalization. Nat Genet. 2024;56(10):1989–94. pmid:39327485

[ref40] 40. McFadden D. Regression-based specification tests for the multinomial logit model. Journal of Econometrics. 1987;34(1–2):63–82.

[ref41] 41. Loh P-R, Tucker G, Bulik-Sullivan BK, Vilhjálmsson BJ, Finucane HK, Salem RM, et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat Genet. 2015;47(3):284–90. pmid:25642633

[ref42] 42. Beasley TM, Erickson S, Allison DB. Rank-based inverse normal transformations are increasingly used, but are they merited? Behav Genet. 2009;39(5):580–95. pmid:19526352

[ref43] 43. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75. pmid:17701901

[ref44] 44. Cai Y, Zhang Y, Loh YP, Tng JQ, Lim MC, Cao Z, et al. H3K27me3-rich genomic regions can function as silencers to repress gene expression via chromatin interactions. Nat Commun. 2021;12(1):719. pmid:33514712

[ref45] 45. Gisselbrecht SS, Palagi A, Kurland JV, Rogers JM, Ozadam H, Zhan Y, et al. Transcriptional silencers in Drosophila serve a dual role as transcriptional enhancers in alternate cellular contexts. Mol Cell. 2020;77(2):324-337.e8. pmid:31704182

[ref46] 46. Ngan CY, Wong CH, Tjong H, Wang W, Goldfeder RL, Choi C, et al. Chromatin interaction analyses elucidate the roles of PRC2-bound silencers in mouse development. Nat Genet. 2020;52(3):264–72. pmid:32094912

[ref47] 47. Della Rosa M, Spivakov M. Silencers in the spotlight. Nat Genet. 2020;52(3):244–5. pmid:32094910

[ref48] 48. Garton J, Shankar M, Chapman B, Rose K, Gaffney PM, Webb CF. Deficiencies in the DNA Binding Protein ARID3a Alter Chromatin Structures Important for Early Human Erythropoiesis. Immunohorizons. 2021;5(10):802–17. pmid:34663594

[ref49] 49. Saadat KASM, Lestari W, Pratama E, Ma T, Iseki S, Tatsumi M, et al. Distinct and overlapping roles of ARID3A and ARID3B in regulating E2F-dependent transcription via direct binding to E2F target genes. International Journal of Oncology. 2021;58(4):1–12.

[ref50] 50. Shen M, Li S, Zhao Y, Liu Y, Liu Z, Huan L, et al. Hepatic ARID3A facilitates liver cancer malignancy by cooperating with CEP131 to regulate an embryonic stem cell-like gene signature. Cell Death Dis. 2022;13(8):732. pmid:36008383

[ref51] 51. Martin AR, Gignoux CR, Walters RK, Wojcik GL, Neale BM, Gravel S, et al. Human demographic history impacts genetic risk prediction across diverse populations. Am J Hum Genet. 2017;100(4):635–49. pmid:28366442

[ref52] 52. Duncan L, Shen H, Gelaye B, Meijsen J, Ressler K, Feldman M, et al. Analysis of polygenic risk score usage and performance in diverse human populations. Nature Communications. 2019;10(1):3328.

[ref53] 53. Mars N, Kerminen S, Feng Y-CA, Kanai M, Läll K, Thomas LF, et al. Genome-wide risk prediction of common diseases across ancestries in one million people. Cell Genom. 2022;2(4). pmid:35591975

[ref54] 54. Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet. 2019;51(4):584–91. pmid:30926966

[ref55] 55. Sirugo G, Williams SM, Tishkoff SA. The missing diversity in human genetic studies. Cell. 2019;177(1):26–31.

[ref56] 56. Fitipaldi H, Franks PW. Ethnic, gender and other sociodemographic biases in genome-wide association studies for the most burdensome non-communicable diseases: 2005-2022. Hum Mol Genet. 2023;32(3):520–32. pmid:36190496

[ref57] 57. Wang Y, Guo J, Ni G, Yang J, Visscher PM, Yengo L. Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations. Nat Commun. 2020;11(1):3865. pmid:32737319

[ref58] 58. Saitou M, Dahl A, Wang Q, Liu X. Allele frequency differences of causal variants have a major impact on low cross-ancestry portability of PRS. medRxiv. 2022:2022–10.

[ref59] 59. Hou K, Ding Y, Xu Z, Wu Y, Bhattacharya A, Mester R, et al. Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. Nat Genet. 2023;55(4):549–58. pmid:36941441

[ref60] 60. Yang J, Ferreira T, Morris AP, Medland SE, Madden PAF, Heath AC, et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet. 2012;44(4):369–75.

[ref61] 61. Novakovsky G, Fornes O, Saraswat M, Mostafavi S, Wasserman WW. ExplaiNN: interpretable and transparent neural networks for genomics. Genome Biol. 2023;24(1):154. pmid:37370113

[ref62] 62. Tseng AM, Eraslan G, Biancalani T, Scalia G. A mechanistically interpretable neural network for regulatory genomics. arXiv preprint. 2024. https://arxiv.org/abs/2410.06211

This is an uncorrected proof.

Figures

Abstract

Author summary

1 Introduction

2 Description of the method

2.1 Overview of the method

2.2 Fine-mapping as a multiple instance learning problem

2.3 Dataset and model training

2.4 Functional annotation of the training data

2.5 Motif discovery

2.6 Construction and evaluation of polygenic risk scores

2.7 GWAS and conditional analyses

3 Verification and comparison

3.1 MIFM variants are enriched for enhancer, repressed, and silencer chromatin signatures

3.2 Syntax analysis of a MIFM trained model

4 Applications

4.1 Polygenic risk scores created with MIFM transfer better to non-European ancestries

4.2 MIFM enables discovery of additional GWAS signals

5 Discussion

Supporting information

S1 Appendix. Document containing supplementary Figs A-F and supplementary Tables A-F.

S1 Table. R² scores for each model-ancestry-trait combination of the cross-ancestry PGS evaluation for the top-1 variant-per-block setting.

S2 Table. R² scores for each model-ancestry-trait combination of the cross-ancestry PGS evaluation for the top-5 variants-per-block setting.

S3 Table. R² scores for each model-ancestry-trait combination of the cross-ancestry PGS evaluation for the top-10 variants-per-block setting.

Acknowledgments

References