Analysis of a large single institution cohort of related donors fails to detect a relation between SDF1/CXCR4 or VCAM/VLA4 genetic polymorphisms and the level of hematopoietic progenitor cell mobilization in response to G-CSF

We studied a cohort of 367 healthy related donors who volunteered to donate their hematopoietic stem cells for allogeneic transplantation. All donors were homogeneously cared for at a single institution and received rhG-CSF as a mobilization treatment prior to undergoing apheresis. Peripheral blood CD34+ cell counts were used as the main surrogate marker for rhG-CSF-induced mobilization. We investigated whether inter-individual variations in known genetic polymorphisms, located in genes whose products are functionally important for mobilization, could affect the extent of CD34+ cell mobilization, either individually or in combination. We found little or no influence of individual SNPs or haplotypes for the SDF1, CXCR4, VCAM and VLA4 genes, whether CD34+ cell counts were used as a continuous or a categorical variable. Simple clinical characteristics of donors, such as body mass index, age and possibly sex, are more potent predictors of stem cell mobilization. The size of our cohort remains relatively small for genetic analyses, but compares favorably with cohorts analyzed in previously published reports suggesting associations between genetic traits and response to rhG-CSF. Notwithstanding this limitation, our data do not support the use of genetic analyses when a choice exists between several potential donors for a given patient.

Introduction
Allogeneic stem-cell transplantation (ASCT) is a curative treatment for patients with severe hematological malignancies. Several sources of stem cells can be used, including bone marrow (BM), peripheral blood (PB) and umbilical cord blood. PB cell collection presents several advantages: leukapheresis is a moderately invasive and semi-automated procedure that can be performed on an outpatient basis and requires neither access to the operating room nor general anesthesia; infusion of PB cells to the recipient is associated with rapid engraftment and hospital discharge, after both myeloablative and reduced-intensity conditioning regimens. [1,2] Although the relationship between the number of infused CD34+ cells and recipient engraftment and outcome remains controversial [3,4], most transplant programs expect that the collected cell graft will contain at least a defined minimal number of progenitors, and some will possibly cap the number of infused CD34+ cells. These objectives can be reached only when donors' CD34+ cells are appropriately mobilized through the administration of granulocyte colony-stimulating factor (G-CSF). The number of CD34+ cells mobilized into PB varies significantly among donors, mainly depending on clinical factors such as age, gender or weight. [5,6] Genetic susceptibility may also influence the quality of mobilization. Indeed, several BM proteins are involved in stem-cell homing and G-CSF-induced mobilization, including CXCR4 and its ligand SDF1 (CXCL12) [7,8] and VCAM1 and its ligand VLA4. [9,10] Interactions of these two receptor-ligand pairs are disrupted following G-CSF treatment, and structural or functional variations in these molecules may influence response to G-CSF. [11][12][13] Some of these inter-individual variations may be captured through single nucleotide polymorphisms (SNPs). In contexts other than HSCT, many studies have shown a relation between SNPs and the variability in response to drug administration. [14,15] There is limited evidence suggesting that similar patterns may exist for CD34+ cell mobilization in response to rhG-CSF, either for patients or for healthy donors who are preparing to undergo apheresis prior to autologous or allogeneic hematopoietic cell transplantation, respectively. [11][12][13] We conducted a retrospective analysis of the association between SDF1/CXCR4 and VCAM/VLA4 genetic polymorphisms and CD34+ cell mobilization in healthy related donors.

Donor selection and care
The study includes three hundred and sixty-seven adult healthy donors who donated their blood mononuclear cells to related patients who received allogeneic transplantation between 1997 and 2016 and were cared for at a single institution: Institut Paoli-Calmettes, the comprehensive cancer center in Marseille (see Table 1). All donors were identified, screened and collected in accordance with national or international regulations, institutional policies, EFI and EBMT/FACT-JACIE prescriptions, including transparent information on HLA typing, donation and their rights to consent. The project was approved by the "Comité d'Orientation Stratégique" (COS; Internal Review Board) at the Direction de la Recherche Clinique et de l'Innovation (DRCI), Institut Paoli-Calmettes. All patients and donors involved in the adult transplantation program at Institut Paoli-Calmettes provided informed written consent for the use of their personal data as per EBMT and FACT-JACIE requirements, in compliance with national and European regulations.
In preparation for apheresis, donors received daily subcutaneous (SQ) injections of rhG-CSF in the evening as per institutional procedures. Prior to September 2009, all donors received a 600 μg daily dose independently of their weight. Starting in September 2009, the G-CSF daily dose was adjusted to the donor's weight to approximately match the 10 μg/kg/day dosing recommended in the rhG-CSF label (daily doses were 480, 600, 780 or 900 μg). Counseling was provided on expected side effects, particularly bone pain, and prophylactic paracetamol was prescribed to reduce pain. Circulating CD34+ cells were first counted at day 5, after 4 evening injections of rhG-CSF, using a single-platform flow-cytometry-based technique as previously described. [4] No infectious or inflammatory manifestations were reported in donors in the 8 days preceding donation. All C-reactive protein (CRP) measurements were negative (<6 mg/l).
VCAM1-rs1041163 (T>C), VLA4-rs1449263 (A>G) and CXCR4-rs2680880 (A>T) were analyzed by direct sequencing after PCR amplification. PCR amplification was performed on 50 ng of gDNA in a final volume of 25 μL containing 1x PCR buffer, 1.5 mM MgCl2, 0.2 mM of dNTPs, 0.1 unit of Taq DNA-polymerase (Invitrogen, France) and 0.16 μM of each primer. Primers were designed using the Primer 3 program (http://bioinfo.ut.ee/primer3-0.4.0/primer3/). The VCAM, VLA4 and CXCR4 PCR primer sequences were respectively 5'ATTGGCCATTGTCTTTGAGC3' and 5'GATGCTGTTCTAGGGTGTGG3'; 5'TGCCCACTATATGCCAA  Table 2). Amplification was carried out as follows: 1 cycle at 95˚C for 15 min; 30 cycles at 95˚C for 30 sec, 57˚C for 45 sec, and 72˚C for 60 sec; and 1 cycle at 72˚C for 10 min. After control on agarose gel, 5 μL of PCR product was incubated with 0.5 units of thermosensitive alkaline phosphatase and 1 unit of exonuclease-I (Euromedex; France) for 15 min at 37˚C followed by 15 min at 80˚C to remove unincorporated primers and dNTPs. The second step was a multiplex extension reaction performed using the SNaPshot kit (Invitrogen) according to the manufacturer's protocol in a final volume of 10 μL containing 3 μL of the PCR product, 5 μL of SNaPshot mix, and extension primers (Table 3). The reaction program was 25 cycles at 95˚C for 10 seconds, 50˚C for 5 seconds, and 60˚C for 30 seconds. SNaPshot extension primer data were analyzed using GeneMapper v4.0 with specific detection parameters as previously described. [17]

Statistical analyses
Donor-associated data, categorized into biological and clinical data including age, height, weight, BMI, sex, G-CSF total dose, G-CSF dose/kg and peripheral blood CD34+ cell counts, are described in Table 1. Allelic frequencies and haplotype estimation. Missing data at a locus led to the exclusion of the sample concerned from further analyses at that locus. No multiple imputation was used. Allelic and two-or-more-loci haplotype frequencies were estimated using an EM algorithm implemented in the Gene[rate] computer tools. [12] Deviations from Hardy-Weinberg equilibrium (HWE) were tested using a nested likelihood model. [13] Haplotype frequencies based on the genotypes of all SNPs of a same gene, i.e. SDF1 and CXCR4, were estimated with the Gene[rate] computer tool package without prior assumptions. For allelic and two-or-more-loci frequency estimations, all putative homozygotes were considered either true homozygotes or heterozygotes for the observed allele and an undefined or undetectable ('blank') allele, as previously described. [16] Based on this haplotype estimation, the main haplotypes, with a cumulated frequency higher than 98%, were encoded a priori for each gene, and genotype data were reanalyzed according to this new nomenclature. Using an in-house computer program, data output files (.txt) were formatted into files readable by the "Phenotype" application of the Gene[rate] computer tool package.
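As an illustration of the allele-frequency and Hardy-Weinberg steps mentioned above, the following is a minimal sketch for a single biallelic SNP. It is not the nested likelihood model implemented in Gene[rate]: it swaps in the textbook Pearson chi-square test with one degree of freedom, ignores 'blank' alleles, and uses hypothetical genotype counts. The function name `hwe_chi_square` is an assumption introduced for this example.

```python
# Sketch only: gene-counting allele frequency and a Pearson chi-square
# HWE statistic for one biallelic SNP (not the Gene[rate] likelihood model).

CHI2_CRITICAL_1DF_05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (frequency of allele A, chi-square statistic against HWE)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples at this locus")
    p = (2 * n_aa + n_ab) / (2 * n)  # allele A frequency by gene counting
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2


# Hypothetical counts (AA, AB, BB), not data from the study.
freq_a, chi2 = hwe_chi_square(25, 50, 25)
deviates_from_hwe = chi2 > CHI2_CRITICAL_1DF_05
```

With these hypothetical counts the observed genotypes match the expected proportions exactly, so the statistic is zero and no deviation is flagged.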
Biological parameters statistical testing. Statistical analyses were performed using SPSS software (SPSS 19.0 for Windows; SPSS Inc., Chicago, IL) and the R software version 3.0.3 with the randomForestSRC package [18]. The primary endpoint was the influence of SNPs on peripheral blood CD34+ cell count/mL on day 5 of G-CSF treatment; CD34+ cell counts were considered as continuous and categorical variables in separate analyses. For continuous variables, median and extreme values are presented. Differences between groups were analyzed with the t-test for comparisons of two independent samples in univariate analyses, or with one-way ANOVA for multiple comparisons. t-tests were considered significant when two-tailed p-values were < 0.05, except for clinical or molecular factors that had already been associated with altered mobilization in previous studies (age, gender, BMI and the VCAM1-rs1041163 CC homozygous variant). [14] In these cases, given the expected influence on mobilization, we used a one-tailed p-value < 0.05. No adjustment for multiple tests was performed.
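The two-sample comparison described above can be sketched as follows. The text does not state whether the pooled-variance or the unequal-variance form of the t-test was used; this sketch assumes Welch's unequal-variance form and returns only the statistic and its approximate degrees of freedom (a p-value would require a t-distribution routine, such as the ones in SPSS or R). The function name `welch_t` and the CD34+ counts are hypothetical.

```python
# Sketch only: Welch's t statistic and Welch-Satterthwaite degrees of
# freedom for two independent samples, using the standard library.
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances (n - 1)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical CD34+ counts (cells/µL) for two genotype groups.
group_cc = [40, 50, 60, 70, 80]
group_other = [50, 60, 70, 80, 90]
t_stat, dof = welch_t(group_cc, group_other)
```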
Table 3. Extension primer sequences and concentrations used to genotype CXCR4-rs12691874 A>G, CXCR4-rs16832740 T>C, CXCR4-rs2228014 C>T, SDF1-rs1413519 G>C, SDF1-rs1801157 G>A, SDF1-rs2297630 G>A, SDF1-rs266085 C>T, SDF1-rs266087 G>A. Columns: SNP; primer sequence, forward (F) and reverse (R).

Linear regression. Two random forests for linear regression [19] were used to evaluate the importance of clinical and molecular variables for a higher CD34+ cell count. The first model considered SNP classification as molecular variables and the second used haplotype classification. The prediction accuracy of the random forest models was measured by the mean squared error (MSE). Variable importance (VIMP) was determined using the permutation importance measure for random forests, based on the out-of-bag (OOB) estimate of prediction error: for a given variable, the values of the OOB cases (the original data left out of the bootstrap sample used to grow the tree; approximately 1/3 of the original sample) are randomly permuted for this variable and the prediction error is recorded. The VIMP of this variable is defined as the difference between the perturbed and unperturbed error rates, averaged over all trees. The larger this difference, the more predictive the variable.
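The permutation principle behind VIMP can be sketched in a few lines. This is a simplified illustration, not the randomForestSRC implementation: it uses a single held-out set and a fixed, hypothetical predictor instead of per-tree OOB samples, and the names `mse`, `permutation_vimp` and the toy data are assumptions introduced here.

```python
# Sketch only: permutation importance as the mean increase in MSE after
# shuffling one column of held-out data (simplified, no per-tree OOB sets).
import random


def mse(predict, rows, targets):
    """Mean squared error of `predict` over the given rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_vimp(predict, rows, targets, column, n_repeats=50, seed=0):
    """Average rise in MSE when `column` is randomly permuted across rows."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, targets)
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:column] + (v,) + r[column + 1:]
                    for r, v in zip(rows, shuffled)]
        increases.append(mse(predict, permuted, targets) - baseline)
    return sum(increases) / n_repeats


# Hypothetical held-out donors: (age in years, SNP genotype coded 0/1/2).
rows = [(25, 0), (32, 1), (41, 2), (48, 0), (55, 1), (63, 2), (70, 0), (36, 1)]
targets = [100 - age for age, _ in rows]  # toy outcome driven by age only


def predict(row):
    """Hypothetical fitted model that uses age and ignores the SNP."""
    return 100 - row[0]


vimp_age = permutation_vimp(predict, rows, targets, column=0)
vimp_snp = permutation_vimp(predict, rows, targets, column=1)
```

In this toy setting, permuting age degrades the predictions while permuting the SNP column leaves them unchanged, so only age receives a positive importance.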

Impact of VCAM/VLA4 and SDF1/CXCR4 SNPs on CD34 positive cell mobilization
We next searched, in univariate analyses, for a relation between each SNP and PB CD34+ cell counts. Allelic frequencies are described in Table 5 Table 6). We did not find any other significant association between SNPs and mobilization (Table 6). We also used the CD34+ cell count as a categorical variable, taking 30 CD34+ cells/μl as a cut-off on the assumption that it is clinically relevant for pinpointing poor mobilizers (<30 CD34+ cells/μl). This analysis failed to find a significant association between VCAM/VLA4 or SDF1/CXCR4 polymorphisms and mobilization (Table 7).
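The categorical analysis described above amounts to dichotomizing donors at the 30 CD34+ cells/μl cut-off and cross-tabulating the result against genotype. The sketch below shows only that tabulation step, with hypothetical donors; the helper names `mobilizer_status` and `contingency_table` are assumptions, and the association test applied to the resulting table is not shown.

```python
# Sketch only: classify donors as poor/good mobilizers at the stated
# cut-off and count them per genotype (input to an association test).
from collections import Counter

POOR_MOBILIZER_CUTOFF = 30.0  # CD34+ cells/µL, cut-off given in the text


def mobilizer_status(cd34_per_ul, cutoff=POOR_MOBILIZER_CUTOFF):
    """Return 'poor' below the cut-off, 'good' otherwise."""
    return "poor" if cd34_per_ul < cutoff else "good"


def contingency_table(donors):
    """Count donors per (genotype, status); donors are (genotype, count) pairs."""
    return Counter((genotype, mobilizer_status(count))
                   for genotype, count in donors)


# Hypothetical donors: (VCAM1-rs1041163 genotype, day-5 CD34+ cells/µL).
donors = [("TT", 45.0), ("TC", 29.9), ("CC", 12.0), ("TT", 30.0)]
table = contingency_table(donors)
```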

Impact of VCAM/VLA4 and SDF1/CXCR4 haplotypes on CD34+ cell mobilization
Allelic frequencies and haplotype estimation. Haplotypes were also investigated to identify better mobilizers. A haplotype is a group of gene variants that are inherited together from  Tables 8 and 9. Nine and eight haplotypes were estimated for SDF1 and CXCR4, respectively. For each gene, five haplotypes, with cumulated frequencies of 99.3% and 98.5% respectively, were encoded SDF1-A to SDF1-E and CXCR4-A to CXCR4-E. Genotype data reanalyzed according to this a priori nomenclature are described in Tables 10 and 11. With this new coding, the blank haplotype represented 5.8% and 6.1% for SDF1 and CXCR4, respectively.
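The selection of main haplotypes by cumulated frequency can be illustrated with a short sketch. It assumes that haplotypes are ranked by decreasing estimated frequency and retained until their cumulated frequency exceeds the 98% threshold; the exact rule used with Gene[rate] is not detailed in the text, and the function name `main_haplotypes` and the frequencies below are hypothetical.

```python
# Sketch only: keep the most frequent haplotypes until their cumulated
# frequency exceeds a threshold (98% here, as stated in the methods).


def main_haplotypes(freqs, threshold=0.98):
    """Return (names of retained haplotypes, their cumulated frequency)."""
    ranked = sorted(freqs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulated = [], 0.0
    for name, freq in ranked:
        if cumulated > threshold:
            break  # threshold already exceeded, remaining haplotypes are rare
        kept.append(name)
        cumulated += freq
    return kept, cumulated


# Hypothetical estimated frequencies for seven haplotypes of one gene.
estimated = {"h1": 0.50, "h2": 0.25, "h3": 0.15, "h4": 0.06,
             "h5": 0.03, "h6": 0.007, "h7": 0.003}
kept, cumulated = main_haplotypes(estimated)
```

Haplotypes that are not retained would then be pooled or treated as 'blank' in the recoded genotype data, which is one possible reading of the recoding step described above.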

Clinical versus biological factors in predicting CD34+ cells mobilization
Two random forest regression models were built to assess the importance of clinical and molecular variables for CD34+ cell mobilization [20,21]. As shown in Fig 2, clinical factors were strongly associated with CD34+ cell mobilization, in contrast to molecular data, whether SNPs or haplotypes were taken into account. Thus, we concluded that age and BMI data alone are sufficient to predict CD34+ cell mobilization in the context of ASCT.

Discussion
ASCT is used for the treatment of patients with a variety of severe malignant or non-malignant hematological disorders, either constitutional or acquired. A main challenge in ASCT is to rapidly identify and select a suitable donor from whom to collect sufficient numbers of hematopoietic cells and progenitors. When apheresis is used to collect allogeneic peripheral blood stem cells, most transplant programs have defined a minimal number of CD34+ cells to procure as a means to ensure rapid engraftment and establish hematopoietic chimerism in the recipient. The extent of CD34+ cell mobilization varies significantly among donors, mainly depending on age, gender or weight, and possibly also on genetic variations. [11][12][13] However, few published studies on CD34+ cell mobilization included a multivariate analysis simultaneously considering biological and genetic variables. Here, we analyzed biological and genetic parameters reported to influence CD34+ cell mobilization after G-CSF administration in 367 consecutive volunteer healthy donors who were homogeneously cared for at a single institution. Our results suggest that age, BMI, and possibly sex are the main influences on the response to G-CSF.
Larger registry studies conducted in unrelated, and thus younger, donors have already identified the influence of these factors on stem cell mobilization; our single-institution cohort of related donors offers the advantage of harmonized mobilization and collection procedures. In addition, published studies on unrelated donors do not include the analysis of genetic variants. In our cohort, genetic variations in genes whose products are known to play an important role in stem cell egress out of the bone marrow have little effect on the results of the mobilization and collection procedures used in the clinical context of ASCT. Such associations were reported in a much smaller cohort of 112 donors in a previously published report [11]; donors in that study were, however, younger than in our cohort (38 years old vs 50 years old), which may affect the results of such studies. While the number of analyzed individuals may appear small, the cohort was large enough to confirm the predictive value of age and BMI, variables whose predictive value for mobilization was already demonstrated in other contexts. The random forest analysis suggests that the addition of genetic factors will not add to the predictive value of the model, and that increasing the size of the cohort is unlikely to change our conclusions. Given these uncertainties, it is unlikely that screening donors for these individual SNPs could produce relevant information to guide clinical practice. In addition, we analyzed whether haplotypes could be associated with different levels of response to rhG-CSF, but failed to find such a relation.
Nevertheless, we found that the VCAM1-rs1041163 CC homozygous variant is associated with lower mobilization, in line with the study by Martin-Antonio et al. [11], who found in a cohort of 112 healthy individuals that this variant was associated with lower PB CD34+ cell mobilization after G-CSF treatment. By contrast, in a recent study on a smaller cohort of 46 patients, the frequency of this VCAM1 CC variant was higher in the good mobilizer group. [13] Our study also confirms previous negative findings, in two other cohorts of 463 and 515 donors, on the influence of the SDF1-rs1801157 polymorphism on CD34+ cell mobilization. [15,22] CD44, another gene that encodes a molecule involved in adhesive and chemotactic interactions of CD34+ cells within the bone marrow niche [14], and DGKB, a crucial regulator of glycerolipid metabolism [13], are also involved in stem cell retention and mobilization, and deserve further exploration.
In conclusion, our study provides additional evidence supporting a relation between clinical donor characteristics such as BMI, age, and possibly sex and the biological response to rhG-CSF used as a CD34+ cell mobilization agent for cell procurement in ASCT. Together with previously published work, it does not support a strong relationship between this response and genetic polymorphisms in the sequences coding for functionally important molecules, and pharmacological targets, involved in hematopoietic progenitor cell trafficking. Since interactions of stem cells with the bone marrow niches involve multiple molecular actors, it is possible that a more comprehensive exploration such as GWAS could identify genetic patterns associated with a more or less profound response to rhG-CSF. Existing evidence, however, does not support genetic exploration of donors for clinical applications. To ensure rapid hematopoietic recovery after HSCT, optimization of CD34+ cell collection during apheresis mostly relies on tailoring procedural parameters to donor characteristics, including immediate pre-apheresis measurement of CD34+ cell numbers in the peripheral blood.