Fig 1.
Characterization of overlapping EBML/eSTRs.
a, We consider the set intersection of microsatellites from studies of ethnically biased microsatellites (EBML) and expression STRs (eSTRs): 24,304 microsatellites total. b, Fourfold plot reveals statistically significant overlap between EBML and eSTRs. The area of each quarter circle is proportional to the cell frequency (following marginal standardization). The color of each quarter circle corresponds to its Pearson residual: red indicates that the cell entry exceeds the expected value; blue indicates that it is less than the expected value. Confidence rings for the odds ratio allow a visual test of the null hypothesis (no association); here, the 99% confidence rings do not overlap, indicating that the null is rejected. c, Overlapping EBML/eSTRs tend to have larger effect sizes (effect on gene expression) than all other eSTRs. d, Characterization of the 313 overlapping microsatellites by ethnicity and tissue type. Note that each microsatellite is biased in one or more ethnicities, and each affects gene expression in one or more tissue types.
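The visual test in panel b is roughly equivalent to asking whether a confidence interval for the odds ratio of the 2x2 table excludes 1. The following is a minimal sketch of that check using a Wald interval on the log odds ratio. The function name `odds_ratio_ci` and the cell counts are hypothetical placeholders, not the counts behind the figure, and the interval used by the plotting software may be constructed differently.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.99):
    """Odds ratio of the 2x2 table [[a, b], [c, d]] with a Wald confidence interval.

    The interval is computed on the log scale, where the standard error of
    log(OR) is sqrt(1/a + 1/b + 1/c + 1/d). All cells must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 2.576 for a 99% interval
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (NOT taken from the study).
estimate, lower, upper = odds_ratio_ci(30, 70, 10, 90)

# The null (no association, OR = 1) is rejected when the interval excludes 1.
rejects_null = lower > 1.0 or upper < 1.0
print(f"OR = {estimate:.2f}, 99% CI = ({lower:.2f}, {upper:.2f}), reject null: {rejects_null}")
```

An interval that excludes 1 plays the same role as non-overlapping confidence rings in the fourfold plot.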
Fig 2.
Gene expression inferred from whole blood RNA-seq in cohorts of 89 Africans and 373 Europeans.
a, Volcano plot reveals 14,124 (out of 60,667) differentially expressed genes. Differential expression is validated for 32 (out of 54) genes associated with an EBML/eSTR (p = 1.08e-9; χ2 test). Genes upregulated by EBML/eSTR microsatellites include MAEL and CYB5R2; genes downregulated by EBML/eSTR microsatellites include GPX7 and GSTM3. b, MA plot of gene expression fold change in 89 Africans and 373 Europeans. The 8,640 upregulated genes include 20 associated with an overlapping EBML/eSTR; the 5,484 downregulated genes include 12 associated with an overlapping EBML/eSTR. The null was not rejected (p = 1), indicating that EBML/eSTRs are not disproportionately associated with upregulation or downregulation.
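The two tests in this caption can be sketched from the counts it reports. The caption does not state how the contingency tables were built or whether a continuity correction was applied, so the 2x2 layouts below and the use of Yates' correction are assumptions, and `yates_chi2` is an illustrative helper rather than the authors' code. Under these assumptions the first test yields a p-value on the order of 1e-9 and the second yields p = 1, which appears consistent with the reported values.

```python
import math


def yates_chi2(a, b, c, d):
    """Chi-square test (1 d.f.) on the 2x2 table [[a, b], [c, d]] with Yates' correction."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed[i][j] - expected)
            diff -= min(0.5, diff)  # continuity correction, never pushed past zero
            stat += diff * diff / expected
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Panel a (assumed layout): rows = EBML/eSTR-associated genes vs all other genes,
# columns = differentially expressed vs not differentially expressed.
genes_total, de_total = 60_667, 14_124
assoc_total, assoc_de = 54, 32
stat_a, p_a = yates_chi2(
    assoc_de,
    assoc_total - assoc_de,
    de_total - assoc_de,
    genes_total - de_total - (assoc_total - assoc_de),
)

# Panel b (assumed layout): rows = EBML/eSTR-associated vs other differentially
# expressed genes, columns = upregulated vs downregulated.
stat_b, p_b = yates_chi2(20, 12, 8_640 - 20, 5_484 - 12)

print(f"panel a: chi2 = {stat_a:.2f}, p = {p_a:.3g}")
print(f"panel b: chi2 = {stat_b:.2f}, p = {p_b:.3g}")
```

In the second table the observed and expected counts differ by less than 0.5, so the corrected statistic is zero and the p-value is exactly 1.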
Fig 3.
The contribution of overlapping EBML/eSTRs to differential gene expression.
a, Fold change for differentially expressed genes associated with EBML/eSTRs (blue) compared to all other genes (red). The difference is statistically significant (p = .0024; two-sided Kolmogorov-Smirnov test). b, Fold change for differentially expressed genes associated with eSTRs (blue) compared to all other genes (red). The difference is statistically significant (p = 0; two-sided Kolmogorov-Smirnov test). c, PCA plot of differentially expressed genes reveals two population clusters: African samples (red) and European samples (blue). d, PCA loadings for differentially expressed genes, including those associated with EBML/eSTRs (blue) and all other genes (red). e, Inset of PCA loadings for genes associated with EBML/eSTRs. f, A comparison of PCA loadings finds no significant difference between genes associated with EBML/eSTRs and all other genes (p = .465; two-sided Kolmogorov-Smirnov test).
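The two-sided, two-sample Kolmogorov-Smirnov comparisons above can be sketched as follows. This is a minimal illustration using the asymptotic Kolmogorov distribution, not necessarily the implementation used for the figure; `ks_two_sample` and the synthetic inputs are hypothetical. It also shows one plausible reading of a reported "p = 0" (an assumption on our part): with large samples and a large distance between the empirical distributions, the p-value can fall below floating-point precision and print as zero.

```python
import math
from bisect import bisect_right


def ks_two_sample(x, y):
    """Two-sided two-sample KS test: returns (D, asymptotic p-value)."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    # D is the largest gap between the two empirical CDFs; because both are
    # right-continuous step functions, it suffices to check the data points.
    d_stat = 0.0
    for v in xs + ys:
        cdf_x = bisect_right(xs, v) / n
        cdf_y = bisect_right(ys, v) / m
        d_stat = max(d_stat, abs(cdf_x - cdf_y))
    lam = math.sqrt(n * m / (n + m)) * d_stat
    if lam < 0.2:
        # The series below converges slowly here; the true value is ~1.
        return d_stat, 1.0
    # Asymptotic Kolmogorov survival function: 2 * sum (-1)^(k-1) exp(-2 k^2 lam^2).
    p_value = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101)
    )
    return d_stat, min(max(p_value, 0.0), 1.0)


# Synthetic, fully separated samples (illustration only, not study data).
d_far, p_far = ks_two_sample(range(2000), range(5000, 7000))
print(d_far, p_far)  # the p-value underflows to 0.0 in double precision

# Identical samples: no distance between the distributions.
d_same, p_same = ks_two_sample([1, 2, 3, 4], [1, 2, 3, 4])
print(d_same, p_same)
```

When a p-value underflows in this way, reporting it as an upper bound (for example, below machine precision) is a common alternative to printing 0.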
Fig 4.
Array length polymorphisms for 15 microsatellites correlate with gene expression.
Each gene is associated with an overlapping EBML/eSTR. In each case the null hypothesis (no association between repeat length and gene expression) is rejected. African samples (red) and European samples (blue) visually reiterate that array length polymorphisms are ethnically biased. Enrichment analysis reveals one significant KEGG pathway: glutathione metabolism. Genes involved in glutathione metabolism include GSTM3 and GPX7.
Fig 5.
A mathematical model of glutathione metabolism predicts differential response to oxidative stress in Africans and Europeans.
a, We modify the Reed model of glutathione metabolism to include the effects of an eSTR (chr1:52525366) associated with GPX7 expression. Subsequent in silico experiments track the ratio [GSH]/[GSSG], which is indicative of the redox status of the cell. b, Predicted steady state concentrations of key metabolites are affected by eSTR array length. Array length deletions (shorter than 11 bp) mimic the effects of oxidative stress. c, The model with the European eSTR allele, run to steady state under various concentrations of H2O2, faithfully reproduces the original Reed model. d, Steady state concentrations predicted for the African model (10 bp allele) under various concentrations of H2O2 reveal differences in the oxidative stress response.
Fig 6.
Summary of our overall approach.
We characterize 24,304 overlapping regions drawn from two previous studies of short tandem repeats. We validate 32 (out of 54) genes associated with whole blood EBML/eSTRs using RNA-seq from 89 Africans and 373 Europeans. We subsequently quantify their effects on gene expression and perform KEGG pathway enrichment analysis. Two of the validated EBML/eSTRs affect glutathione metabolism. We modify the Reed model to predict how an eSTR associated with GPX7 expression affects the response to oxidative stress in Africans and Europeans, respectively.
Fig 7.
The GPX7 gene encodes one member of a family of eight isoenzymes (GPX 1–8).
We estimate the contribution of GPX7 to the entire GPX enzyme pool using RNA-seq. GPX7 expression accounts for approximately 2% (λ = .02) of the entire GPX enzyme pool.
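One simple way to read λ is as the share of GPX7 expression in the summed expression of all eight GPX genes. The sketch below assumes that definition; the caption does not spell out the exact normalization, so this is an assumption. The helper `isoenzyme_fraction` is hypothetical, and the expression values are made-up placeholders in arbitrary units, not the RNA-seq estimates behind the reported 2%.

```python
def isoenzyme_fraction(expression, gene):
    """Fraction of the pooled expression contributed by one gene."""
    total = sum(expression.values())
    if total <= 0:
        raise ValueError("total expression must be positive")
    return expression[gene] / total


# Placeholder expression values (arbitrary units, NOT study data).
gpx_expression = {
    "GPX1": 40.0,
    "GPX2": 5.0,
    "GPX3": 10.0,
    "GPX4": 30.0,
    "GPX5": 2.0,
    "GPX6": 3.0,
    "GPX7": 5.0,
    "GPX8": 5.0,
}

lam = isoenzyme_fraction(gpx_expression, "GPX7")
print(f"lambda = {lam:.2f}")  # share of the GPX pool attributed to GPX7
```

With real data, the same ratio would be computed from normalized expression estimates for the eight GPX genes.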