On the cross-population generalizability of gene expression prediction models

doi:10.1371/journal.pgen.1008927

Fig 1.

A comparison of R² between prediction and measurement in SAGE, with PredictDB test metrics as benchmarks, for 11,545 genes total.

The prediction weights used here are, from left to right: GTEx v6p, GTEx v7, DGN, MESA African Americans, MESA African Americans and Hispanics, MESA Caucasians, and all MESA subjects. Test R² values from model training in GTEx v7 and MESA (“test_R2_avg” in PredictDB) appear on the right and provide a performance baseline. The number of genes per weight set varies; see S1 Table.

Fig 2.

Spearman correlations of measured gene expression versus predicted expression from PrediXcan.

The order of the weight sets matches Fig 1. Test correlations for GTEx v7 and MESA correspond to “rho_avg” from PredictDB.
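
As a minimal sketch of the per-gene statistic plotted here, the Spearman correlation can be computed as the Pearson correlation of the ranks of measured and predicted expression. The function names below are hypothetical and not taken from PrediXcan or PredictDB; ties are assumed to receive the average of their ranks.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(measured, predicted):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(measured), average_ranks(predicted))
```

Because only ranks enter the calculation, a monotone but nonlinear relationship between prediction and measurement still yields a correlation of 1.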

Fig 3.

A comparison of R² from SAGE and GTEx v7 training diagnostics.

The SAGE R² values are computed by regressing PrediXcan predictions onto gene expression measurements. The GTEx v7 R² values are taken from PredictDB (“test_R2_avg”). The red dotted line marks where the R² values of the two groups are equal, while the blue line denotes the best linear fit.
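
The regression-based R² described in this caption can be illustrated with a short sketch. It assumes a simple linear regression with an intercept, in which case R² equals the squared Pearson correlation; the function name is hypothetical and the code is not the authors' pipeline.

```python
def regression_r2(measured, predicted):
    """R² from an ordinary least squares fit of predicted ~ measured."""
    n = len(measured)
    mx = sum(measured) / n
    my = sum(predicted) / n
    sxx = sum((x - mx) ** 2 for x in measured)
    sxy = sum((x - mx) * (y - my) for x, y in zip(measured, predicted))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual and total sums of squares for the response (the predictions).
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(measured, predicted))
    ss_tot = sum((y - my) ** 2 for y in predicted)
    return 1.0 - ss_res / ss_tot
```

With a single regressor the result does not depend on which variable is treated as the response, so regressing measurements onto predictions gives the same R².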

Table 1.

Prediction R² between populations in GEUVADIS for genes with positive correlation between predictions and measurements.

The number of genes analyzed varies by scenario; see S5 Table. Scenarios where the training sample is contained in the testing sample cannot be accurately tested and are marked with “n/a”. EUR373 includes all 373 Europeans, EUR278 includes only the 278 non-Finnish Europeans, FIN includes only the 95 Finnish individuals, and AFR includes only the 89 Yoruba.

Table 2.

Prediction R² between populations in GEUVADIS for 564 gene models that show positive correlation between prediction and measurement in all 9 train-test scenarios that were analyzed.

Scenarios that were not tested are marked with “n/a”. As before, EUR373 includes all 373 Europeans, EUR278 includes only the 278 non-Finnish Europeans, FIN includes only the 95 Finnish individuals, and AFR includes only the 89 Yoruba.

Table 3.

Cross-population prediction performance across all five constituent GEUVADIS populations over genes with positive correlation between predictions and measurements.

All populations were subsampled to N = 89 individuals. The number of genes represented varies by training sample (CEU: 1029, FIN: 1320, GBR: 1436, TSI: 1250, YRI: 914).

Table 4.

Cross-population prediction performance across all five subsampled GEUVADIS populations over the 142 genes with positive correlation between prediction and measurement in all 25 train-test scenarios.

As in Table 3, all populations were subsampled to N = 89 individuals.

Fig 4.

Correlations between predictions and simulated gene expression measurements from simulated populations across various proportions of shared eQTL architecture with 10 causal cis-eQTLs.

Here YRI is simulated from the 1000 Genomes Yoruba, CEU is simulated from the Utahns, and AA is constructed from YRI and CEU. The black line represents the upper bound on correlation, 0.387, dictated by our choice of h² = 0.15 for the genetic heritability of expression. Each trend line represents an interpolation of correlation versus shared eQTL proportion. Gray areas denote 95% confidence regions of LOESS-smoothed mean correlations conditional on the proportion of shared eQTLs.
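
The 0.387 bound follows from the heritability. Assuming expression decomposes as y = g + e with independent noise e and Var(g)/Var(y) = h², a predictor that recovers g exactly has correlation √h² with y, and no genotype-only predictor can do better. The sketch below checks the arithmetic; the function name is hypothetical.

```python
import math

def max_prediction_correlation(h2):
    """Upper bound on cor(prediction, expression) for heritability h2.

    Assumes an additive model y = g + e with e independent of g, so the
    bound is sqrt(Var(g) / Var(y)) = sqrt(h2).
    """
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("heritability must lie in [0, 1]")
    return math.sqrt(h2)

print(round(max_prediction_correlation(0.15), 3))  # 0.387
```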

Fig 5.

Curves depicting power to detect association under various TWAS scenarios.

The x-axis represents the proportion of phenotypic variance explained by gene expression. As in Fig 4, AA reflects simulated African-Americans constructed from YRI and CEU. The curves represent logistic interpolations of whether or not the causal gene was declared significant in an association test of a phenotype from the testing population with gene expression predicted from a training population into the testing population. Gray areas denote 95% confidence regions of mean power conditional on the effect size.
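
For intuition about the quantity being smoothed, power at a single effect size can be estimated as the fraction of simulation replicates in which the causal gene was declared significant. The sketch below uses a plain proportion with a Wald (normal-approximation) 95% interval; this is a simpler stand-in, not the logistic interpolation used for the curves, and the function name is hypothetical.

```python
import math

def empirical_power(significant_flags, z=1.96):
    """Return (power, lower, upper) from per-replicate significance calls.

    significant_flags: iterable of booleans, one per simulated replicate.
    The interval is a Wald interval clipped to [0, 1].
    """
    flags = list(significant_flags)
    n = len(flags)
    if n == 0:
        raise ValueError("need at least one replicate")
    p = sum(1 for f in flags if f) / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)
```

Repeating this at each effect size gives a pointwise power curve; a logistic fit across effect sizes, as in the figure, additionally borrows strength between neighboring points.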

Fig 6.

Power for phenotype-expression association tests with cross-population imputed gene expression for heritability h² = 0.205.

The cross-population scenarios are ordered left to right from least admixture (CEU to YRI, 0% admixture proportion in our simulation) to most admixture (YRI to AA, 80% admixture proportion). Power increases on two axes: (1) as the proportion of shared eQTL architecture increases, and, to a lesser extent, (2) as genetic distance decreases between reference and target populations. Power is consistently high when training and testing populations match.

Fig 7.

Power for various cross-population train-test scenarios with varying YRI admixture for three phenotypic heritability levels h² = 0.06, 0.20, and 0.58, corresponding to effect sizes 0.005, 0.01, and 0.025, respectively.

Power increases with heritability and also as populations become more genetically similar. Raw power estimates and 95% confidence intervals are listed in S10–S12 Tables.
