Using population-specific add-on polymorphisms to improve genotype imputation in underrepresented populations
Table 1
Imputation performance of publicly available reference panels when applied to the TB-DAR data based on the H3Africa array content.
Minor allele frequency (MAF) is the frequency observed in the TB-DAR cohort. Imputation quality (Subcolumn 1) is measured by either the INFO score (AFGR and HRC; Sanger Imputation Server) or r² (CAAPA; Michigan Imputation Server). Correlation with ground truth (Subcolumn 2) is the squared Pearson correlation coefficient (r²) between the imputed dosage and the ground-truth WGS dosage. Percent of variants imputed (Subcolumn 3) is the percentage of variants observed in the TB-DAR WGS data that were successfully imputed (imputation quality > 0.8).
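The two derived metrics in the legend (Subcolumns 2 and 3) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `dosage_r2` and `percent_imputed`, the dictionary-based input, and the toy values in the usage note are hypothetical, and variants absent from the imputation output are assumed to count as not imputed.

```python
import math

import numpy as np


def dosage_r2(imputed_dosage, wgs_dosage):
    """Squared Pearson correlation between imputed and WGS dosages for one variant."""
    imputed = np.asarray(imputed_dosage, dtype=float)
    truth = np.asarray(wgs_dosage, dtype=float)
    # Correlation is undefined when either vector has no variance (monomorphic site).
    if imputed.std() == 0 or truth.std() == 0:
        return math.nan
    r = np.corrcoef(imputed, truth)[0, 1]
    return float(r ** 2)


def percent_imputed(quality_by_variant, wgs_variants, threshold=0.8):
    """Percentage of WGS variants whose imputation quality exceeds the threshold.

    quality_by_variant maps a variant ID to its INFO score or r2 (hypothetical layout).
    Assumption: variants missing from the mapping count as not imputed.
    """
    wgs_variants = list(wgs_variants)
    if not wgs_variants:
        return math.nan
    passed = sum(
        1 for v in wgs_variants if quality_by_variant.get(v, 0.0) > threshold
    )
    return 100.0 * passed / len(wgs_variants)
```

For example, with toy inputs, `percent_imputed({"v1": 0.9, "v2": 0.5, "v3": 0.95}, ["v1", "v2", "v3", "v4"])` counts `v1` and `v3` as imputed out of four WGS variants. In practice, both metrics would typically be summarized within MAF bins, as in the table.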