Benign ethnic neutropenia (BEN) is a hematologic condition associated with people of African ancestry and specific Middle Eastern ethnic groups. Prior genetic association studies in large population showed that rs2814778 in Duffy Antigen Receptor for Chemokines (DARC) gene, specifically DARC null red cell phenotype, was associated with BEN. However, the mechanism of this red cell phenotype leading to low white cell count remained elusive.
We conducted an extreme phenotype design genome-wide association study (GWAS), analyzed ~16 million single nucleotide polymorphisms (SNP) in 1,178 African-Americans individuals from the Reasons for Geographic and Racial Differences in Stroke (REGARDS) study and replicated from 819 African-American participants in the Atherosclerosis Risk in Communities (ARIC) study. Conditional analyses on rs2814778 were performed to identify additional association signals on chromosome 1q22. In a separate cohort of healthy individuals with and without BEN, whole genome gene expression from peripheral blood neutrophils were analyzed for DARC.
We confirmed that rs2814778 in DARC was associated with BEN (p = 4.09×10−53). Conditioning on rs2814778 abolished other significant chromosome 1 associations. Inflammatory cytokines (IL-2, 6, and 10) in participants in the Howard University Family Study (HUFS) and Multi-Ethnic Study in Atherosclerosis (MESA) showed similar levels in individuals homozygous for the rs2814778 allele compared to others, indicating cytokine sink hypothesis played a minor role in leukocyte homeostasis. Gene expression in neutrophils of individuals with and without BEN was also similar except for low DARC expression in BEN, suggesting normal function. BEN neutrophils had slightly activated profiles in leukocyte migration and hematopoietic stem cell mobilization pathways (expression fold change <2).
These results in humans support the notion of DARC null erythroid progenitors preferentially differentiating to myeloid cells, leading to activated DARC null neutrophils egressing from circulation to the spleen, and causing relative neutropenia. Collectively, these human data sufficiently explained the mechanism DARC null red cell phenotype causing BEN and further provided a biologic basis that BEN is clinically benign.
Citation: Charles BA, Hsieh MM, Adeyemo AA, Shriner D, Ramos E, Chin K, et al. (2018) Analyses of genome wide association data, cytokines, and gene expression in African-Americans with benign ethnic neutropenia. PLoS ONE 13(3): e0194400. https://doi.org/10.1371/journal.pone.0194400
Editor: Farook Thameem, University of Texas Health Science Center at San Antonio, UNITED STATES
Received: October 25, 2017; Accepted: March 4, 2018; Published: March 29, 2018
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Data Availability: Gene expression data can be accessed from the NCBI GEO database at: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE108894. All other relevant data are within the paper and its Supporting Information files.
Funding: REGARDS research project is supported by a cooperative agreement U01 NS041588 from the National Institute of Neurological Disorders and Stroke, National Institutes of Health, Department of Health and Human Service. Grants and fellowships supporting the writing of the paper: CIDR contract # HHSN268201100011I, Center for Inherited Disease Research (CIDR) High Throughput Genotyping Resource Access, National Institute for Diabetes and Digestive and Kidney Diseases (NIDDK); Intramural Research Program of the Center for Research on Genomics and Global Health supported by the National Human Genome Research Institute (NHGRI), the National Institute of Diabetes and Digestive and Kidney Diseases, the Center for Information Technology, the Office of the Director at the National Institutes of Health (Z01HG200362); and the NHGRI Health Disparity Postdoctoral Fellowship. The Atherosclerosis Risk in Communities Study is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute contracts (HHSN268201100005C, HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, and HHSN268201100012C). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Total and differential white blood cell (WBC) counts are often used as measures of health, immunocompetence, and tolerance to chemotherapy with neutrophils usually accounting for 40% to 80% of total WBC. Asymptomatic or benign reductions in neutrophils (absolute neutrophil count, or ANC, <1.5×109 cells/L) are primarily associated with non-white ethnicity, hence the term benign ethnic neutropenia (BEN). BEN has been documented in individuals with ancestry from Yemen, the Middle East (Bedouin Arabs), Africa (including admixed African populations in the Americas) and Europe [1–5]. The largest population-based study of neutropenia in the United States (US) showed ethnic differences in BEN prevalence, with estimates of 4.5%, 0.79% and 0.38% among African-Americans, European-Americans, and Mexican-Americans, respectively .
There is substantial evidence for the genetic control of hematologic traits in general and BEN in particular. First, hematologic traits (including WBC and neutrophil counts) showed high heritability, with estimates of 61% to 96% in twin studies [6, 7] and 42% to 62% in other kinds of studies . Second, admixture mapping studies identified a chromosome 1q22 polymorphism within the DARC gene that strongly influenced WBC counts in persons of African ancestry. Indeed, DARC (also known as ACKR1, atypical chemokine receptor 1—Duffy blood group) explained up to 20% of the variance in WBC and neutrophil counts [8–10]. Third, GWAS have also identified genetic loci associated with hematologic traits, including WBC and neutrophil count. However, no GWAS for BEN has been performed to date.
In the present study, we conducted a GWAS of WBC in African-Americans using an extreme phenotype design. We reasoned that a study in African-Americans may provide fresh insight into the genetics of BEN, especially when genome-wide dense SNP array analysis is combined with gene expression studies.
The study protocol was approved by the Institutional Review Board (IRB) of National Human Genome Research Institute (NHGRI) and National Heart, Lung, and Blood Institute (NHLBI) at the National Institutes of Health. The use of the Atherosclerosis Risk in Communities study (ARIC) replication sample was approved by the NHGRI IRB and the National Center for Biotechnology Information (NCBI) database of Genotypes and Phenotypes (dbGaP) Data Access Committee (DAC). All study participants gave written informed consent prior to inclusion in the studies. Each study complies with the tenets of the Declaration of Helsinki.
Subjects for discovery GWAS
The REGARDS cohort comprised 30,239 self-identified African-American and white individuals, aged 45 and older at enrollment in 2003–2007. Fifty-six percent of the sample was from the eight southeastern United States comprising the ‘stroke belt’ with the remainder from the other 40 contiguous states; 42% are African-American, and 55% women . A baseline telephone interview assessed cardiovascular risk factors (including smoking history). An in-home physical assessment conducted 3–4 weeks after the telephone interview obtained blood samples.
The subset selected for this BEN discovery GWAS were self-identified African-Americans. Variation in the neutrophil count was directly reflected in WBC or total leukocyte count. Therefore, WBC, a parameter in the complete blood count testing, was a very good surrogate marker for ANC. An extreme phenotype study design was used, with 600 participants with WBC counts in the 1st–8th percentile at one extreme (low WBC or “LW”) and 600 participants with counts in the 75th–99th percentile at the other extreme (high WBC or “HW”). Participants with WBC below 1st or over 99th percentile were excluded to reduce the influence of pathologic alterations in WBC counts.
Genotyping and quality control
Genomic DNA was extracted from banked WBC using the Puregene DNA purification kit. DNA quantification was conducted using the PicoGreen method. Genotyping was conducted at the Center for Inherited Disease Research (CIDR) on 1,260 REGARDS samples (including technical replicates and related samples) using the Illumina HumanOmniExpress-12v1 array. A total of 1,247 (99.0%) of the attempted 1,260 samples were successfully assayed, representing 1,178 unrelated individuals after exclusion of technical replicates and cryptic relatives (kinship coefficient > 0.125). After application of technical filters, CIDR released 730,525 SNPs. Characteristics of the subjects in the discovery GWAS are shown in Table A in S1 File.
Stringent quality control (QC) measures were instituted at various stages of the project. First, QC samples, including 26 HapMap samples, as well as related individuals and duplicate samples, were genotyped along with the LW GWAS samples. This facilitated identification of poorly performing SNPs showing duplicate inconsistencies and/or Mendelian inconsistencies. Second, the resulting genotypes were passed through a set of filters, including departures from Hardy-Weinberg equilibrium, locus missingness, sample missingness, minor allele frequency, sex differences in allelic frequency, extreme heterozygosity (for autosomal SNPs), non-observance of founder genotypes, and observance of heterozygous haploid genotypes. Regions with large chromosomal anomalies were filtered out. A total of 22,327 SNPs of the original 730,525 SNPs were filtered out and 656,747 SNPs were carried forward for analysis. Third, population genetic characteristics of the study sample were assessed by computing principal components (PCs) of the genotypes and evaluating cluster patterns for outliers. Study sample PCs were also computed and compared to the HapMap population reference panels to verify ancestry. The study sample is African-American, an admixed African-European group. Therefore, we expect the genotypes of our sample to cluster on a continuum between West African and European reference genotypes. Fourth, an evaluation of the sample for potential population stratification was conducted as described in the next section.
We evaluated clustering of the samples and other characteristics associated with population structure by computing the PCs of the genotyped SNPs. SNPs for this analysis were obtained by linkage disequilibrium-based pruning of the dataset (parameters: variance inflation factor (VIF) of 1.1 for a 50 SNP window with a slide of 5 SNPs). The VIF is equivalent to 1/(1-R2), in which R2 represents the multiple correlation coefficient for a SNP regressed on all other SNPs in the window. Using this process, 94,360 SNPs were extracted for assessment of population structure in the REGARDS sample. We evaluated the number of significant PCs using the Minimum Partial Average (MAP) Test . This test is known to perform better than the Tracy-Widom test in identifying the number of significant principal components in admixed populations. As shown in Figure A in S1 File, the REGARDS participants clustered tightly together as a single cluster with no significant outliers. Projection of the REGARDS participants on to four HapMap/1000 Genomes samples (YRI, CEU, JPT, and CHB) demonstrated a distribution along a line spanning between the African (YRI) and European (CEU) reference samples (Figure B in S1 File), a pattern consistent with African-European admixture. Only the first PC explained a significant proportion of variance of the genotypes (Figure C in S1 File), confirmed by the MAP test. Therefore, the first PC was included as a covariate in the association analyses.
To improve genomic coverage, in silico imputation was conducted using the 1000 Genomes reference data (http://www.1000genomes.org). Imputation of non-genotyped SNPs was conducted at the University of Washington, Genetics Coordinating Center (GCC) using IMPUTE2 (https://mathgen.stats.ox.ac.uk/impute/impute_v2.html), following pre-phasing with SHAPEIT2 version 2 (https://mathgen.stats.ox.ac.uk/genetics_software/shapeit/shapeit.html). The 1000 Genomes Project’s worldwide reference panel Phase 1 Version 3 (ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/phase1/) comprising all 11 samples was used to impute non-genotyped SNPs. Inclusion in imputation processing required there be at least two copies of the minor allele in the African or European samples. Additionally, only SNPs were imputed (approximately 1.5 million indels and structural variants were excluded from imputation). Imputed SNPs with an “info” score <0.3 were considered low quality and filtered out from the dataset.
Logistic regression association analysis was performed under an additive genetic model with the imputed dosages using PLINK v1.07 (http://pngu.mgh.harvard.edu/~purcell/plink/), including age, sex, smoking status, and the first PCs as covariates in the model. Annotation of significant hits was conducted using the R package NCBI2R (http://cran.r-project.org/web/packages/NCBI2R/NCBI2R.pdf). In silico lookup of previously reported genome-wide significant SNPs with WBC and/or neutrophil counts was done with the aid of the Catalog of Published Genome Wide Association Studies (http://www.ebi.ac.uk/gwas).
Replication in the Atherosclerosis Risk in Communities (ARIC) study
Replication was conducted using African-American participants from ARIC. The phenotype definitions were the same as in the discovery GWAS. Characteristics of the study subjects are shown in Table B in S1 File. These samples were genotyped on the Affymetrix Genome-Wide Human SNP Array 6.0 and imputed into the 1000 Genomes phase 1 version 3 reference panel. Details regarding genotyping, data cleaning, imputation and analyses are available in dbGaP and mirror those implemented in the discovery GWAS in the REGARDS study as described above. Phenotypes and genotypes (including imputed genotypes) were obtained from dbGAP under controlled access. Association analysis was performed under an additive genetic model, using the same covariates as in the discovery GWAS.
Gene expression analysis
Participants for the gene expression study were enrolled at the NIH Clinical Center, Bethesda, Maryland, protocol 03-H-0168 (clinicaltrials.gov, NCT00059423). This study consisted of seven individuals with absolute neutrophil count <1.5x109 cells/L and five non-BEN individuals with absolute neutrophil count >4.0x109 cells/L. All subjects were self-identified African-Americans and nonsmokers. Demographic characteristics of the subjects are shown in Table C in S1 File. Whole blood was centrifuged using density gradient separation (Ficoll) to obtain granulocytes. Approximately 107 granulocytes were mixed with 1 mL RNA Stat-60 solution. RNA was extracted using chloroform, precipitated with isopropanol, and rehydrated with DEPC-treated water.
Affymetrix Human Gene 2.0 ST arrays were used to evaluate gene and exon expression. This array has ~48,000 gene-level probe sets and ~418,000 exon-level probe sets. The array has multiple probes for each exon of each transcript and has a median of 21 unique probes for each transcript. This design permits the analysis of expression at both the gene and exon levels and facilitates the study of transcript variants and alternative splicing events. Array processing, including hybridization, scanning and washing, was done following the manufacturer’s instructions. Data were deposited in NCBI Gene Expression Omnibus (GEO), series record GSE108894, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE108894. Data quality control and analysis were done using Partek Genomics 6.6 (Partek, Inc., St. Louis, MO). Affymetrix CEL files were imported into Partek, followed by probe summarization and normalization using the RMA (Robust Muti-Chip Average) algorithm. Based on PC analysis, one outlier was removed from further analysis. Probe sets with low expression (log2 value < 3.0) in all samples were excluded.
Gene-level differential expression between BEN and non-BEN was assessed by analysis of covariance, controlling for the effects of age, sex, and batch after mean summarization of exon-level probes to genes. Exon-level differential expression between BEN and non-BEN was assessed by analysis of covariance, controlling for the effects of age, sex, and batch. For both gene-level and exon-level analysis, we considered a false discovery rate (FDR) < 0.05 as significant. The Partek alternative splicing workflow ANOVA model was used to detect alternatively spliced variants, adjusting for age, sex, and batch. Since there is no consensus in the field on how to appropriately correct for multiple comparisons in alternative splicing, we considered two thresholds: an unadjusted p-value <0.05 and an FDR <0.05. To declare an alternative splicing event, we filtered for genes showing differential expression between BEN and non-BEN.
We used Thomson Reuters’ GeneGo MetaCore™ (https://portal.genego.com/) to identify pathways represented greater than expected by chance from gene lists of interest, i.e., differentially expressed and/or alternatively spliced transcripts. MetaCore is based on a high-quality, manually-curated database of transcription factors, receptors, ligands, kinases, drugs, and endogenous metabolites as well as species-specific directional interactions between protein-protein, protein-DNA, and protein-RNA, drug targeting, and bioactive molecules and their effects. We used MetaCore to search for canonical networks for which our gene sets of interest are enriched.
DARC gene expression
Total neutrophil RNA from participants enrolled at the NIH Clinical Center was transcribed to cDNA (SuperScript III First–Strand Synthesis SuperMix; Invitrogen, Carlsbad, CA). Quantitative real-time PCR was performed (SYBR Green chemistry, CFX96 Real-Time PCR Detection System; Bio-Rad, Hercules, CA, USA) to assay the two known DARC gene transcripts (NM_001122951.2, short transcript of 1258 bp and NM_002036.3, long transcript of 2024 bp). The relative expression of the transcripts was calculated using the comparative CT (2−ΔΔCT) method  with GAPDH as an internal control gene for normalization. Samples with GAPDH expression of ≤ 23 CT were included in the analysis. A Caucasian sample with the Fy(a+b+) phenotype was used as calibrator and set at 100%. All samples were tested in triplicate, and the assay was performed three times.
Trait values for cytokines previously reported to be associated with DARC or Duffy expression–namely C-reactive protein (CRP), interleukin 2 (IL-2), IL-6, IL-10, matrix metallopeptidase 3 (MMP-3) and 9 were retrieved from participants in the Howard University Family Study (HUFS, N = 1623) and Multi-Ethnic Study in Atherosclerosis (MESA, N = 1344). CRP, IL-6, and IL-10 were available from HUFS; CRP, IL-2, IL-6, MMP-3, and MMP-9 in MESA. Cytokine values were log-transformed under a recessive model and analysis adjusted for age, sex, body mass index and type 2 diabetes.
Discovery GWAS in REGARDS
The discovery GWAS included 592 LW and 586 HW individuals (Table A in S1 File). Both groups were similar in age but differed in the prevalence of smoking and CRP levels. The mean individual admixture proportion was 80.7% African ancestry. Individual admixture proportion was significantly correlated with LW, with risk increasing with increasing proportion of African ancestry (odds ratio, OR = 12.6, p = 9.84×10−7).
The Manhattan plot for the association analysis is shown in Fig 1. The top locus was a broad region on chromosome 1, with the leading SNP being DARC SNP rs2814778 (OR = 0.0641, p = 4.09×10−53). This was the SNP that has been most frequently reported in previous studies. Notably, the two top hits in DARC (SNPs rs2814778 and rs12075) defined the Duffy blood group system and its two principal antigens (Fya and Fyb) of the Duffy blood group system. The top non-DARC significant SNP was rs856046, an intronic variant located in the gene interferon gamma-inducible protein 16 (IFI16) (OR = 9.6175, p = 2.89×10−40). The leading variants associated with LW and their genic annotation are listed in Table 1. A regional plot around the index SNP is shown in Fig 2 and an annotation of the genes in this region is shown in Table D in S1 File. The only genome-wide significant hit outside of chromosome 1 was rs36076607 (OR = 1.61, p = 4.67×10−8) in the ephrin receptor A3 (EPHA3) gene.
Conditional analyses on the leading DARC variant.
Given that the association on chromosome 1 encompasses a broad region of natural selection in African-ancestry populations (Figure D in S1 File), post hoc analyses were conducted to condition on the leading variant in DARC (rs2814778). Finding genome-wide significant signal in this analysis would help resolve the issue of whether there are other loci in this region apart from DARC that influence risk of low WBC. Therefore, the GWAS was repeated with conditioning on rs2814778. The results of the conditional analysis showed no other genome-wide significant signal in the chromosome 1 region. There were also no other genome-wide significant SNPs across the genome; the best SNP was rs77998448 (p = 1.4×10−7, Table E in S1 File).
Analysis of DARC SNPs, rs12075 and rs2814778.
Two SNPs defined the DARC phenotype. Rs12075*A (FY*B) is ancestral, from which rs12075*G (FY*A) is derived. FY*01N.01, or DARC null, is independently derived from FY*B and is defined by rs2814778, with the ancestral allele T and the derived allele C. The ancestral haplotype carried allele A at rs12075 with allele T at rs2814778; two derived haplotypes were G with T (FY*A) and A with C (FY*01N.01), respectively. The predicted DARC phenotype frequency computed from the allele frequencies is shown in Table F in S1 File. As expected, the predicted frequencies of Fy(a+b+), Fy(a-b+), and Fy(a+b-) phenotypes were all higher in HW, whereas the Fy(a-b-) phenotype was predominantly found in LW. Interestingly, 37% of the HW group also carried the Fy(a-b-) phenotype.
Replication analysis in ARIC.
Replication analysis was performed in ARIC using the same methods and models as in the discovery GWAS. Characteristics of the ARIC participants are shown in Table B in S1 File. Association analysis showed a similar chromosome 1 peak, with the same leading variant: DARC SNP rs2814778 (p = 2.84×10−21) and with the same direction of effect. The second leading variant in the replication study was rs2570916 (OR = 10.7, p = 3.11×10−18), an IFI16 intronic variant. Most of the genome-wide significant chromosome 1 variants identified in the discovery sample were replicated in the ARIC sample but at different levels of significance (Table G in S1 File). However, the only non-chromosome 1 genome-wide significant hit in the discovery study—rs36076607 (EPHA3)—was not replicated in the ARIC sample (p = 0.871, Table H in S1 File).
In silico lookup of previous GWAS findings for WBC and/or neutrophil counts.
We extracted from the GWAS Catalog (accessed 7/10/2014) genome-wide significant SNPs that were associated with WBC and/or neutrophil counts. The SNPs that were significant in the present study are shown in Table 2.
Gene expression analysis
For gene-level expression, no transcript showed significantly different expression between BEN and non-BEN samples at an FDR <0.05. This finding suggested that neutrophils from BEN individuals would be functionally similar to neutrophils from non-BEN individuals. Based on nominal p-values, the most significantly differentiated genes between the two groups (Fig 3) included CRX (p = 1.04×10−6, fold change BEN cases relative to controls -1.35), LCP1 (p = 6.54×10−5, fold change -1.27), CEP95 (p = 8.30×10−5, fold change 1.65), HECTD4 (p = 9.37×10−5, fold change 1.14), and RGS20 (p = 9.59×10−5, fold change 1.21). The list of genes with unadjusted p<5×10−4 (absolute fold change between BEN cases and controls of 1.2–1.8) is shown in Table I in S1 File. Enrichment analyses of this gene set in hematologic and immune systems showed that the top scoring canonical pathway maps were: Development_EGFR signaling pathway (p = 0.009), Development_Role of proteases in hematopoietic stem cell mobilization (p = 0.041), Development_Role of G-CSF in hematopoietic stem cell mobilization (p = 0.047), and Development_Role of HGF in hematopoietic stem cell mobilization (p = 0.047, Table J and Figure P in S1 File). Repeating this analysis in all tissues showed that these four pathway maps remained among the top six scoring pathway maps (Table K in S1 File). Gene ontology (GO) enrichment scoring of hematologic and immune system processes showed that leukocyte migration had the highest enrichment score (ES 3.69, Figure Q in S1 File). For exon-level expression, no exon was differentially expressed at an FDR of 0.05 or at a Bonferroni-adjusted p-value of 1.42×10−7 (Table L in S1 File).
A. Heat plot of most differentially expressed transcripts in granulocytes of BEN compared to non-BEN individuals B. Dot plot of gene expression of DARC (the gene most significant on GWAS) and MMP9 (the gene with the largest absolute differential expression between BEN and non-BEN individuals).
An analysis of differential alternative splicing in BEN compared to non-BEN showed that, at an unadjusted p-value of 0.05, 1,945 genes had a high probability of differential alternative splicing. Adjusting for multiple comparisons using the FDR, 16 genes were significant at an FDR <0.05 (Table M in S1 File). A total of 56 genes were alternatively spliced but not differentially expressed (alternative splicing p-value < 0.05, differential expression p-value >0.95, Table N in S1 File).
We sought to identify differentially expressed transcripts that were coded for by genes within the chromosome 1 genome-wide significant association region. At p<0.05, there were seven such transcripts, including DARC (Fig 4, Table 3). Therefore, the locus that displayed the greatest evidence for statistical genetic association with BEN also showed differential expression between BEN and non-BEN controls. We then confirmed with real-time PCR that DARC expression was indeed absent in the same BEN individuals as in the gene expression cohort (Table 4).
Given the postulated role of DARC as a ‘cytokine sink’, which binds to inflammatory cytokines and attracts leukocytes/neutrophils into peripheral blood, we tested the DARC rs2814778 C allele in the homozygote state (i.e. DARC null) for association with cytokines in two cohorts: HUFS and MESA. The findings showed that rs2814778 CC was significantly associated with higher CRP levels in both HUFS (β = 0.099 (SE 0.026), p < 0.0001) and MESA (β = 0.148 (SE 0.061), p = 0.015) (Table O in S1 File). While the direction of effect was also positive for IL-2, IL-6 and IL-10, the p-values were not significant. Additionally, we also tested for metalloproteases that are commonly associated with neutrophil function. MMP-9 was associated with the Duffy null state with a negative direction of effect—β = -0.439 (SE 0.084), p < 0.001 (Table O in S1 File). This finding is consistent with our gene expression analysis which showed that MMP-9 was significantly down-regulated in BEN cases when compared with controls (Table 3, Fig 3). IL-8 and MCP1 were not available for analyses in HUFS or MESA cohorts.
We report a genome-wide analysis for LW (BEN) that combined a GWAS approach and whole genome gene expression analysis in African-Americans. BEN is disproportionately observed in specific ethnic groups, including people with African or Middle East ancestry. While BEN is not associated with infection, immunodeficiency, or any other clinical pathology, BEN may have clinical implications for chemotherapy, immunosuppressant therapy, organ transplantation, definitions for clinical conditions such as sepsis, and treatment with psychotropic medications [14, 15]. Specifically, if the etiology of LW (BEN) was benign, this would provide reassurance to continue treatment. If the etiology was otherwise, withholding treatment would be logical. Our strongest association was on chromosome 1 and centered on DARC, a similar finding to the association reported in GWAS of WBC and/or neutrophil count in African-Americans [16, 17].
However, WBC and neutrophil counts in the general population are influenced by both genetic and non-genetic factors (such as age, smoking, and use of specific medications). Loci on chromosomes 6, 12, 17, and 20 have been associated with WBC in European and/or Japanese populations, while loci on chromosome 1 (DARC), 4, and 16 have been associated with WBC and/or neutrophil count in African-Americans (Table 4 and Table P in S1 File) [16–20]. The reported chromosome 4 locus was in the gene CXCL2 and has been identified as being associated with WBC in Hispanic, Japanese, and European-ancestry individuals . DARC is an example of a genetic factor that is influenced by geography and ancestry, reflecting selective pressures on the genome. The DARC variant rs2814778 was the top ranking variant in our sample, similar to the findings of the COGENT consortium GWAS for WBC in African-Americans  and other studies . DARC resides in a region under very strong positive selection and is associated with resistance to Plasmodium vivax. Individuals who are homozygous for the C allele do not express the FYA and FYB antigens, rendering them resistant to P vivax malaria. This is possibly through inactivation of the Duffy binding protein II, via blocking IL-8 cleavage to the antigen which is required for the malaria parasite to enter red blood cells [22, 23]. Thus, as expected in our study cohort of all African-Americans, the Fy(a+b+) phenotype was rare. The Fy(a+b-) and Fy(a-b+) phenotypes were observed more commonly in HW, and the Fy(a-b-) phenotype overwhelmingly in LW. Surprisingly, the Fy(a-b-) phenotype was also observed in more than one-third of HW, suggesting the genetic influence on neutrophil/leukocyte count is not simply from the DARC null state.
The mechanism by which DARC affects neutrophil counts has progressed since the initial identification of DARC. Multiple groups have explored the cytokine sink hypothesis, which purported the presence of DARC was necessary to mediate blood pro-inflammatory chemokines and moderate leukocyte/neutrophil counts. Large cohort analyses of two such chemokines showed that rs12075 Asp42Gly (DARC positive) was associated with higher serum IL-8 levels , and higher serum monocyte chemoattractant protein 1 (MCP-1) [25, 26]. These higher blood cytokine levels partially explained the relative higher WBC counts; rs2814778 (DARC null red cells) correlated with lower chemokines levels and lower WBC counts. However, this explanation did not account for the abundant DARC expression on endothelial cells in the vasculature [27, 28]. We investigated other relevant cytokines in the cytokine sink hypothesis, and showed the levels of IL-2, 6, and 10, were similar between LW and HW individuals in the HUFS and MESA cohorts. These data suggested that this hypothesis played a minor role. Additional factors are likely present to influence WBC counts.
Recently separate investigations of DARC on HSC homeostasis and quiescence in a murine model were performed. DARC on bone marrow macrophages directly interacted with CD82 on HSCs and maintained HSC quiescence. The DARC null state, whether induced by loss of marrow macrophages through chemotherapy or by DARC knockout, led to the loss of CD82, and brought the HSCs out of quiescence into differentiation . In another study, the marrow of DARC knockout mice contained more committed myeloid progenitors, and neutrophils were activated with FcRγ and CD45 by gene expression, both important in host defense .
However, DARC null and HSC differentiation should lead to higher numbers of leukocytes (and its subsets), which is opposite of what we currently observe in BEN. Darc-/- mice have similar numbers of marrow HSCs and peripheral blood leukocyte and neutrophil counts. Transplantation of bone marrow from Darc-/- mice into DARC wildtype mice recapitulated the Darc-/- phenotype of differentiated marrow and activated neutrophils in peripheral blood. Interestingly, activated Darc-/- neutrophils preferentially egressed to the spleen, leading to relative neutropenia . In contrast, the reverse direction of transplantation, Darc wildtype marrow into Darc-/- mice, yielded a normal distribution and number of neutrophils. Thus, DARC null red cells, DARC null but activated neutrophils, and presence of DARC on the endothelium were all necessary for the neutropenia phenotype in steady state.
Duffy antigen has been known to be expressed primarily on red blood cells and endothelial cells, but not on neutrophils or other mature white blood cell subtypes. Our gene expression studies showed that DARC expression is detectable in neutrophils, and DARC was differentially expressed between BEN and non-BEN neutrophils. Our results added to a growing body of literature, demonstrating DARC expression on neutrophils, macrophages, and lymphocytes. For example, DARC was expressed at low levels in several white cell lineages, including CD19+ B cells, CD8+ T cells, CD4+ T cells, CD33+ myeloid cells, and CD14+ monocytes (http://biogps.org/#goto=genereport&id=2532, Figure Q in S1 File). In addition, Gene Expression Omnibus (GEO) profiles from several datasets (GDS2808, GDS2214, GDS3073, GDS2255; Figure R in S1 File) showed DARC expression in neutrophils at baseline and after exposure to various stimuli, as well as in case-control scenarios.
Several important themes emerged from our gene expression studies. Since gene expression in neutrophils from BEN and non-BEN individuals was similar, it is a reasonable expectation that neutrophils from BEN individuals would function similarly to neutrophils from other individuals. This finding supported the clinical observation that individuals with BEN do not have an increased incidence of infection and would mount an appropriate immunologic response to infection . This provided clinically important reassurance to BEN individuals, in addition to them having normal marrow morphology. Secondly, the differentially expressed genes were in pathways related to HSC mobilization and leukocyte migration, supporting the notion that activated neutrophils egress to the spleen (and possibly other organs), leading to relative neutropenia. In our earlier report of individuals of African or Caucasian descent undergoing bone marrow harvest to donate steady state hematopoietic cells for allogeneic transplantation, we showed that the total nucleated and CD34+ cells (a marker of HSC) were lower in those of African ancestry . Despite the lower number of these cells, they responded appropriately to G-CSF. Individuals of African ancestry receiving G-CSF increased WBC and neutrophil counts to higher numbers than Caucasians, and achieved similar numbers of hematopoietic progenitor cells, despite starting from lower baseline blood counts [32, 33].
In conclusion, our present work substantially advanced the understanding of BEN. The analysis of GWAS with extreme phenotype design, careful conditional analyses, and a large replication cohort confirmed that rs2814778 in DARC was associated with BEN. Our analysis of inflammatory cytokines, which showed similar levels in individuals homozygous for the rs2814778 C allele compared to others, indicated that the cytokine sink hypothesis played a minor role in leukocyte/neutrophil homeostasis. Whole genome expression profiling suggested that neutrophils in BEN individuals functioned similarly to neutrophils from non-BEN individuals. The subtle but activated BEN neutrophils in leukocyte migration and HSC mobilization pathways further supported the recent murine finding that the relative neutropenia in BEN individuals resulted from DARC null progenitors preferentially differentiating to myeloid cells, leading to activated neutrophils egressing from circulation to the spleen. Collectively, these data integrated with prior findings to explain the mechanism of DARC null red cells and neutrophils causing BEN and provided a biologic basis that BEN is clinically benign.
REGARDS research project is supported by a cooperative agreement U01 NS041588 from the National Institute of Neurological Disorders and Stroke, National Institutes of Health, Department of Health and Human Service.
Grants and fellowships supporting the writing of the paper: CIDR Contract # HHSN268201100011I, Center for Inherited Disease Research (CIDR) High Throughput Genotyping Resource Access, National Institute for Diabetes and Digestive and Kidney Diseases (NIDDK); Intramural Research Program of the Center for Research on Genomics and Global Health supported by the National Human Genome Research Institute (NHGRI), the National Institute of Diabetes and Digestive and Kidney Diseases, the Center for Information Technology, the Office of the Director at the National Institutes of Health (Z01HG200362); and the NHGRI Health Disparity Postdoctoral Fellowship. The Atherosclerosis Risk in Communities Study is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute contracts (HHSN268201100005C, HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, and HHSN268201100012C).
The authors thank other investigators, staff, and participants of the REGARDS study for their valuable contributions. A full list of participating REGARDS investigators and institutions can be found at http://www.regardsstudy.org. The authors also thank the staff and participants of the ARIC study for their important contributions.
Genotypic/Genomic Dataset Acknowledgment
ARIC Gene Environment Association Studies (GENEVA): Funding for GENEVA was provided by National Human Genome Research Institute grant U01HG004402 (E. Boerwinkle).
DARC expression analysis was supported by the National Institutes of Health Intramural Research Program (Project ID z99 CL999999) of the NIH Clinical Center.
- 1. Weingarten MA, Pottick-Schwartz EA, Brauner A. The epidemiology of benign leukopenia in Yemenite Jews Isr J Med Sci. 1993;29(5):297–9. pmid:8314691
- 2. Bain B. Ethnic and sex differences in the total and differential white cell count and platelet count. I Clin Pathol 1996;49:664–6.
- 3. Grann VR, Ziv E, Joseph CK, Neugut AI, Wei Y, Jacobson JS, et al. Duffy (Fy), DARC, and neutropenia among women from the United States, Europe and the Caribbean. Br J Haematol. 2008;143(2):288–93. pmid:18710383.
- 4. Hsieh MM, Everhart JE, Byrd-Holt DD, Tisdale JF, Rodgers GP. Prevalence of Neutropenia in the U.S. Population: Age, Sex, Smoking Status, and Ethnic Differences. Annals of Internal Medicine. 2007;146(7):486–92. pmid:17404350
- 5. Paz Z, Nails M, Ziv E. The genetics of benign neutropenia. The Israel Medical Association journal: IMAJ. 2011;13(10):625–9. Epub 2011/11/22. pmid:22097233.
- 6. Evans DM, Frazer IH, Martin NG. Genetic and environmental causes of variation in basal levels of blood cells. Twin research: the official journal of the International Society for Twin Studies. 1999;2(4):250–7. Epub 2000/03/21. pmid:10723803.
- 7. Garner C, Tatu T, Reittie JE, Littlewood T, Darley J, Cervino S, et al. Genetic influences on F cells and other hematologic variables: a twin heritability study. Blood. 2000;95(1):342–6. Epub 1999/12/23. pmid:10607722.
- 8. Pilia G, Chen W-M, Scuteri A, Orrú M, Albai G, Dei M, et al. Heritability of Cardiovascular and Personality Traits in 6,148 Sardinians. PLoS Genet. 2006;2(8):e132. pmid:16934002
- 9. Reich D, Nalls MA, Kao WHL, Akylbekova EL, Tandon A, Patterson N, et al. Reduced Neutrophil Count in People of African Descent Is Due To a Regulatory Variant in the Duffy Antigen Receptor for Chemokines Gene. PLoS Genet. 2009;5(1):e1000360. pmid:19180233
- 10. Shriner D, Bentley AR, Doumatey AP, Chen G, Zhou J, Adeyemo A, et al. Phenotypic variance explained by local ancestry in admixed African Americans. Front Genet. 2015;6:324. pmid:26579196; PubMed Central PMCID: PMCPMC4625172.
- 11. Howard VJ, Cushman M, Pulley L, Gomez CR, Go RC, Prineas RJ, et al. The reasons for geographic and racial differences in stroke study: objectives and design. Neuroepidemiology. 2005;25(3):135–43. Epub 2005/07/02. doi: 86678 [pii] pmid:15990444.
- 12. Shriner D. Investigating population stratification and admixture using eigenanalysis of dense genotypes. Heredity. 2011;107(5):413–20. Epub 2011/03/31. pmid:21448230; PubMed Central PMCID: PMC3128175.
- 13. Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods. 2001;25(4):402–8. Epub 2002/02/16. pmid:11846609.
- 14. Ortiz MV, Meier ER, Hsieh MM. Identification and Clinical Characterization of Children With Benign Ethnic Neutropenia. J Pediatr Hematol Oncol. 2016;38(3):e140–3. pmid:26925714; PubMed Central PMCID: PMCPMC5102334.
- 15. Hsieh MM, Tisdale JF, Rodgers GP, Young NS, Trimble EL, Little RF. Neutrophil count in African Americans: lowering the target cutoff to initiate or resume chemotherapy? J Clin Oncol. 2010;28(10):1633–7. pmid:20194862; PubMed Central PMCID: PMCPMC2849762.
- 16. Reiner AP, Lettre G, Nalls MA, Ganesh SK, Mathias R, Austin MA, et al. Genome-Wide Association Study of White Blood Cell Count in 16,388 African Americans: the Continental Origins and Genetic Epidemiology Network (COGENT). PLoS Genet. 2011;7(6):e1002108. pmid:21738479
- 17. Li J, Glessner JT, Zhang H, Hou C, Wei Z, Bradfield JP, et al. GWAS of blood cell traits identifies novel associated loci and epistatic interactions in Caucasian and African-American children. Human Molecular Genetics. 2013;22(7):1457–64. pmid:23263863
- 18. Crosslin D, McDavid A, Weston N, Nelson S, Zheng X, Hart E, et al. Genetic variants associated with the white blood cell count in 13,923 subjects in the eMERGE Network. Human Genetics. 2012;131(4):639–52. pmid:22037903
- 19. Okada Y, Kamatani Y, Takahashi A, Matsuda K, Hosono N, Ohmiya H, et al. Common variations in PSMD3-CSF3 and PLCB4 are associated with neutrophil count. Hum Mol Genet. 2010;19(10):2079–85. Epub 2010/02/23. [pii]. pmid:20172861.
- 20. Kamatani Y, Matsuda K, Okada Y, Kubo M, Hosono N, Daigo Y, et al. Genome-wide association study of hematological and biochemical traits in a Japanese population. Nat Genet. 2010;42(3):210–5. Epub 2010/02/09. [pii]. pmid:20139978.
- 21. Reiner AP, Lettre G, Nalls MA, Ganesh SK, Mathias R, Austin MA, et al. Genome-wide association study of white blood cell count in 16,388 African Americans: the continental origins and genetic epidemiology network (COGENT). PLoS Genet. 2011;7(6):e1002108. Epub 2011/07/09. PGENETICS-D-11-00053 [pii]. pmid:21738479; PubMed Central PMCID: PMC3128101.
- 22. Horuk R, Chitnis CE, Darbonne WC, Colby TJ, Rybicki A, Hadley TJ, et al. A receptor for the malarial parasite Plasmodium vivax: the erythrocyte chemokine receptor. Science. 1993;261(5125):1182–4. Epub 1993/08/27. pmid:7689250.
- 23. de Carvalho GB. Duffy Blood Group System and the malaria adaptation process in humans. Revista brasileira de hematologia e hemoterapia. 2011;33(1):55–64. Epub 2011/01/01. pmid:23284245; PubMed Central PMCID: PMC3521437.
- 24. Moreno Velasquez I, Kumar J, Bjorkbacka H, Nilsson J, Silveira A, Leander K, et al. Duffy antigen receptor genetic variant and the association with Interleukin 8 levels. Cytokine. 2015;72(2):178–84. pmid:25647274.
- 25. Voruganti VS, Laston S, Haack K, Mehta NR, Smith CW, Cole SA, et al. Genome-wide association replicates the association of Duffy antigen receptor for chemokines (DARC) polymorphisms with serum monocyte chemoattractant protein-1 (MCP-1) levels in Hispanic children. Cytokine. 2012;60(3):634–8. pmid:23017229; PubMed Central PMCID: PMCPMC3501981.
- 26. Schnabel RB, Baumert J, Barbalic M, Dupuis J, Ellinor PT, Durda P, et al. Duffy antigen receptor for chemokines (Darc) polymorphism regulates circulating concentrations of monocyte chemoattractant protein-1 and other inflammatory mediators. Blood. 2010;115(26):5289–99. pmid:20040767; PubMed Central PMCID: PMCPMC2902130.
- 27. Lee JS, Wurfel MM, Matute-Bello G, Frevert CW, Rosengart MR, Ranganathan M, et al. The Duffy antigen modifies systemic and local tissue chemokine responses following lipopolysaccharide stimulation. J Immunol. 2006;177(11):8086–94. pmid:17114483; PubMed Central PMCID: PMCPMC2665269.
- 28. Lee JS, Frevert CW, Wurfel MM, Peiper SC, Wong VA, Ballman KK, et al. Duffy antigen facilitates movement of chemokine across the endothelium in vitro and promotes neutrophil transmigration in vitro and in vivo. J Immunol. 2003;170(10):5244–51. pmid:12734373; PubMed Central PMCID: PMCPMC4357319.
- 29. Hur J, Choi JI, Lee H, Nham P, Kim TW, Chae CW, et al. CD82/KAI1 Maintains the Dormancy of Long-Term Hematopoietic Stem Cells through Interaction with DARC-Expressing Macrophages. Cell Stem Cell. 2016;18(4):508–21. pmid:26996598.
- 30. Duchene J, Novitzky-Basso I, Thiriot A, Casanova-Acebes M, Bianchini M, Etheridge SL, et al. Atypical chemokine receptor 1 on nucleated erythroid cells regulates hematopoiesis. Nat Immunol. 2017;18(7):753–61. pmid:28553950; PubMed Central PMCID: PMCPMC5480598.
- 31. Shoenfeld Y, Alkan ML, Asaly A, Carmeli Y, Katz M. Benign familial leukopenia and neutropenia in different ethnic groups. European journal of haematology. 1988;41(3):273–7. Epub 1988/09/01. pmid:3181399.
- 32. Carilli AR, Sugrue MW, Rosenau EH, Chang M, Fisk D, Medei-Hill M, et al. African American adult apheresis donors respond to granulocyte-colony-stimulating factor with neutrophil and progenitor cell yields comparable to those of Caucasian and Hispanic donors. Transfusion. 2012;52(1):166–72. pmid:21790625.
- 33. Panch SR, Yau YY, Fitzhugh CD, Hsieh MM, Tisdale JF, Leitman SF. Hematopoietic progenitor cell mobilization is more robust in healthy African American compared to Caucasian donors and is not affected by the presence of sickle cell trait. Transfusion. 2016;56(5):1058–65. pmid:27167356; PubMed Central PMCID: PMCPMC5500229.