Attenuated Monocyte Apoptosis, a New Mechanism for Osteoporosis Suggested by a Transcriptome-Wide Expression Study of Monocytes

Background Osteoporosis is caused by excessive bone resorption (by osteoclasts) over bone formation (by osteoblasts). Monocytes are important to osteoporosis by serving as progenitors of osteoclasts and produce cytokines for osteoclastogenesis. Aim To identify osteoporosis-related genes, we performed microarray analyses of monocytes using Affymetrix 1.0 ST arrays in 42 (including 16 pre- and 26 postmenopausal) high hip BMD (bone mineral density) vs. 31 (including 15 pre- and 16 postmenopausal) low hip BMD Caucasian female subjects. Here, high vs. low BMD is defined as belonging to top vs. bottom 30% of BMD values in population. Method Differential gene expression analysis in high vs. low BMD subjects was conducted in the total cohort as well as pre- and post-menopausal subjects. Focusing on the top differentially expressed genes identified in the total, the pre- and the postmenopausal subjects (with a p <5E-03), we performed replication of the findings in 3 independent datasets of microarray analyses of monocytes (total N = 125). Results We identified (in the 73 subjects) and successfully replicated in all the 3 independent datasets 2 genes, DAXX and PLK3. Interestingly, both genes are apoptosis induction genes and both down-regulated in the low BMD subjects. Moreover, using the top 200 genes identified in the meta-analysis across all of the 4 microarray datasets, GO term enrichment analysis identified a number of terms related to induction of apoptosis, for which the majority of component genes are also down-regulated in the low BMD subjects. Overall, our result may suggest that there might be a decreased apoptosis activity of monocytes in the low BMD subjects. Conclusion Our study for the first time suggested a decreased apoptosis rate (hence an increased survival) of monocytes, an important osteoclastogenic cell, as a novel mechanism for osteoporosis.


Introduction
Osteoporosis is a prevalent bone metabolic disease characterized by bone fragility and an increased risk of low trauma fractures, especially in elderly females. As a key pathophysiological mechanism, the disease is caused by excessive bone resorption (by osteoclasts) over bone formation (by osteoblasts). Here, a low trauma fracture is defined as a fracture resulting from a fall from a standing position or lower [1], which is in contrast to a high trauma fracture defined as a fracture due to motor vehicle accidents or falls from greater than a standing height [2].
Genetics plays an important role in development of osteoporosis, as evidenced by a high heritability (~60%) of bone mineral density (BMD) [3][4][5], a key phenotype for osteoporosis. Over the last decade, a large number of genes have been identified for osteoporosis through a variety of genetics and genomics approaches [6].
Peripheral blood monocytes (PBMs) are a nice model for osteoporosis study. They are osteoclast progenitor cells [7][8][9][10] and produce cytokines for osteoclastogenesis and bone resorption [11,12]. Through PBMs gene/protein expression studies, a number of novel genes/proteins and new pathophysiological mechanisms have been identified for osteoporosis [13][14][15][16][17], which expanded our understanding of this complex disease. These studies demonstrated that genomic analyses of PBMs may reveal genes for osteoporosis at the osteoclast progenitor stage.
Here with a sample of 73 Caucasian females, stratified by hip BMD and menopause status, we performed a new microarray-based transcriptome-wide gene expression study of osteoporosis using the latest Affymetrix Human Exon 1.0 ST arrays. Our working hypothesis is that those genes differentially expressed in PBMs in high vs. low BMD subjects may underlie BMD variation. To replicate the findings from the 73 samples, we performed in silico replication using 3 existing monocyte microarray datasets for osteoporosis study [13,15,16]. We also performed meta-analysis across all the four microarray datasets and selected the top 200 genes with the most significant meta-analysis p values for GO (gene ontology) term enrichment analysis. In addition, we conducted verification of the identified genes using osteoporosis GWAS datasets, including the largest osteoporosis GWAS meta-analysis dataset, the GEFOS [18].

Subjects for the microarray study
The study was approved by Institutional Review Board or Research Administration of the involved institutions. Signed informed-consent documents were obtained from all study participants before entering the study. Exclusion criteria were used to exclude the comorbidities of osteoporosis from this study, e.g., severe diabetes (that can only be controlled with insulin).
Our discovery cohort (discovery dataset) contains 73 Caucasian females, with the age from 47 to 56. The sample is stratified by hip BMD and menopausal status. Their detailed characteristics are presented in Table 1. In brief, the high BMD group contains 42 subjects, with 16 pre-and 26 postmenopausal subjects. The low BMD group contains 31 subjects, with 15 pre-and 16 postmenopausal subjects. Menopause is defined as cessation of regular menses for at least one year. The raw microarray data for this cohort were submitted to GEO (Gene Expression Omnibus) under the accession number GSE56814.
For replication purpose, 3 independent cohorts were involved in this study, which were used in several previous monocyte microarray studies for osteoporosis [13,15,16]. Replication cohort I contains 80 Caucasian females, including 40 high (20 pre-and 20 postmenopausal) and 40 low hip BMD (20 pre-and 20 postmenopausal) subjects [16]. Replication cohort II contains 19 Caucasian females, including 10 high (5 pre-and 5 postmenopausal) and 9 low hip BMD (4 pre-and 5 postmenopausal) subjects [13]. Replication cohort III contains 26 Chinese females, all premenopausal and including 14 high and 12 low hip BMD subjects [15]. Definition of menopause for replication cohorts I and II is the same as for the discovery cohort. The basic characteristics for the 3 replication cohorts are also presented in Table 1. The raw microarray data for Replication cohorts I to III are archived in GEO under accession numbers GSE56815, GDS1287 and GSE7158, respectively.

BMD measurements
For our discovery cohort, hip BMD (g/cm 2 ) was measured using a Hologic dual energy x-ray absorptiometer (DXA) scanner (Hologic Corp., Waltham, MA). The machine was calibrated daily. The coefficient of variation of the dual energy x-ray absorptiometer measurements for BMD was 0.9%. None of our subjects had any known conditions that might artificially increase BMD values, such as osteophytes and facet sclerosis. The obtained BMD value was then transformed into a Z score, which is the units of standard deviations above or below the mean of a healthy, ethnic-, gender-, and age-matched reference population. We used the same method for BMD measurement for the 3 replication cohorts. The detailed description can be found in the published studies using the three cohorts [13,15,16].

Experimental procedures
The experimental procedures for our discovery cohort samples are briefly described as follows. The corresponding procedures for our replication cohort samples were similar and are described in the published studies using the samples [13,15,16].
Monocyte isolation. Monocyte isolation from whole blood was performed with a monocyte-negative isolation kit (Miltenyi Biotec Inc, Auburn, CA) following manufacturer's recommendation.
Total RNA extraction. Total RNA from monocytes was extracted using Qiagen RNeasy Mini kit (Qiagen, Inc., Valencia, CA). We used Agilent Bioanalyzer (Agilent, Santa Clara, CA) to control the RNA quality before each array experiment, where RNA integrity number (RIN) should be no less than 7.0 [19,20].
Array experiments. Preparation of cDNA, hybridization, and scanning of the GeneChip Human Exon 1.0 ST Array was performed according to the manufacturer's protocol for the Exon 1.0 Array (Affymetrix, Santa Clara, CA), which was published online via Affymetrix's website: http://www.affymetrix.com/estore/catalog/131452/AFFY/Human+Exon+ST+Array#1_3.

Microarray data analysis
On both the discovery cohort and replication cohort datasets, we used RMA (robust multiarray average) method [21] to normalize the array signals through the Bioconductor's Oligo package [22]. After normalization of a certain dataset, we performed differential expression analysis using t test through the Bioconductor's LIMMA (linear models for microarray data) package [23][24][25].

GO term enrichment analysis using DAVID
We performed GO (gene ontology) term enrichment analysis using DAVID (Database for Annotation, Visualization and Integrated Discovery) v6.7 software package (http://david.abcc. ncifcrf.gov/) [26]. Our specific focus is on apoptosis-induction-related GO terms to see if these terms are significantly enriched.
To select those genes that are submitted to DAVID for enrichment analysis, we performed meta-analysis across all the four microarray datasets. We first selected those genes that have the same direction of regulation (i.e., down-or up-regulation in low BMD subjects) across all the four datasets. On those selected genes we then performed meta-analysis using Fisher's method [27,28] to summarize the p values across all the four datasets. We then ranked the meta-analysis p values and selected the top 200 genes with the most significant p values. These 200 genes were then submitted to DAVID software for enrichment analysis.
The above analyses (gene selection, meta-analysis and enrichment analysis) were also performed in pre-and postmenopausal subjects separately across all the four microarray datasets.

GWAS datasets confirmation
We took advantage of 7 GWAS datasets, either from our own group or selected from dbGAP (the Database of Genotype and Phenotype, http://www.ncbi.nlm.nih.gov/gap/, to confirm several genes (DAXX, PLK3, PDCD5 and VDAC1) for association with hip BMD at the population level. The related information is detailed as follows.
GWAS datasets. Seven GWAS datasets were involved in the analysis, of which 3 were from our own studies and 4 were selected from the dbGAP. All the studies related to the datasets were approved by the respective institutional ethics review boards and all participants provided written informed consent.
Our own GWAS datasets include: 1) OOS (Omaha osteoporosis study) [29] with 987 unrelated Caucasians; 2) KCOS (Kansas City Osteoporosis Study) with 2,250 unrelated Caucasians; and 3) COS (Chinese Osteoporosis Study) with 1,547 unrelated Chinese of Han ethnicity. The basic characteristics of the 3 cohorts are shown in Table 2.
The 4 dbGAP datasets include: 1) FHS (Framingham Heart Study), a longitudinal and prospective cohort comprising over 16,000 individuals of European ancestry spanning three generations [30]. Focusing on the first two generations, we identified 3,747 phenotyped individuals. 2) IFS (Indiana Fragility Study), a cross-sectional cohort comprising 1,493 premenopausal sister pairs of European ancestry [18]. After quality control, 1,488 subjects were selected for analysis. 3) WHI-AA (Women's Health Initiative, African American ancestry): WHI is an observational study of a partial factorial randomized and longitudinal cohort with over 12,000 genotyped women aging 50-79 years, of African-American or Hispanic ancestry [31]. The WHI-AA dataset contains 712 phenotyped individuals of African-American ancestry. 4) WHI-HIS (WHI, Hispanic ancestry): containing 409 phenotyped individuals of Hispanic ancestry from the WHI cohort. The basic characteristics of the above 4 cohorts are shown in Table 2.
Phenotype measurements and modeling. Hip BMD was measured with DXA scanners following the manufacturer protocols; either Lunar scanners (Lunar Corp., Madison, WI, USA) or Hologic scanners (Hologic Inc., Bedford, MA, USA) were used for measuring individual cohorts. Covariates were screened among gender, age, age square, weight, and height, with the step-wise linear regression model. Raw hip BMD measurements were adjusted by significant covariates. To adjust for potential population stratification, the first five principal components derived from genome-wide genotype data were also included as covariates [32]. Residual phenotypes after adjustment were normalized by inverse quantile of the standard normal distribution to impose a normal distribution on phenotypes that were analyzed subsequently [33].
Genotyping and quality control (QC). All the GWAS cohorts were genotyped by highthroughput SNP genotyping arrays (Affymetrix Inc., Santa Clara, CA; or Illumina Inc., San Diego, CA, USA for individual cohorts) following the manufacturer's protocols. QC of genotype data were implemented with Plink [34], with the following criteria applied: individual missingness < 5%, SNP call rate > 95%, and Hardy-Weinberg equilibrium (HWE) p-value > E-05. For familial samples (FHS and IFS), all genotypes with Mendel error were set to missing. Association testing. Each GWAS cohort was tested for association between hip BMD and SNPs under an additive mode of inheritance. For unrelated samples, association was examined by the linear regression model with MACH2QTL [35], in which allele dosage was taken as the predictor for the phenotype. For familial samples (FHS and IFS), a mixed linear model was used in which the effect of genetic relatedness within each pedigree was also taken into account [36]. Genomic control inflation factor was estimated for each individual GWAS [37].
Meta-analysis. Summary statistics of associations from each GWAS were combined to perform weighted fixed-effect meta-analysis with METAL [38], in which weights were proportional to the square-root of each sample size. Cochran's Q statistic and I 2 were calculated as measures of between-study heterogeneity [39].

GEFOS replication
We checked the SNPs of the four apoptosis genes (DAXX, PLK3, PDCD5 and VDAC1) for their association signals in the GEFOS (Genetic factors for osteoporosis consortium) GWAS datasets, which were downloaded from www.gefos.org. The datasets include GWAS meta-analysis p values for >2 million SNPs for association with femoral neck (in male, female and all subjects) and lumbar spine BMD (for male, female and all subjects).

Real-time PCR confirmation
We performed real-time RT-PCR in a subset of the discovery cohort samples (n = 51). The samples are made up of 30 high BMD subjects and 21 low BMD subjects, among whom, 13 are premenopausal high BMD subjects, 17 postmenopausal high BMD subjects, 10 premenopausal low BMD subjects and 11 postmenopausal low BMD subjects. To confirm differential expression for the VDAC1 gene, the experiment was performed in the total 51 subjects since the differential expression for this gene was discovered by microarray in the total discovery cohort. To confirm differential expression for the DAXX, PLK3 and PDCD5 genes, the experiment was performed in the 13 high vs. 10 low BMD premenopausal subjects since the differential expression for the three genes was discovered by microarray in the premenopausal subjects of the discovery cohort. For the experiment, we used TaqMan Assays and StepOne Plus Real-Time PCR system from Life Technologies (Grand Island, NY). We used GAPDH as the house-keeping gene. The experimental data were analyzed using the StepOne Software v.2.2.2, which produced a ΔCT value as a quantitative measure for the target gene transcript abundance for each sample. We used the ΔCT value from each sample and performed student t test to compare the expression levels in samples from high vs. low BMD subjects.

Other analyses
Using the NIH SNP Function Prediction database (http://snpinfo.niehs.nih.gov/snpinfo/ snpfunc.htm), we checked the potential functions for the SNPs nearby the two genes, which were replicated in the 7 GWAS datasets for association with hip BMD or associated with BMD in the GEFOS dataset.
Using SPSS (IBM, Armonk, NY), we performed normality test for the ΔCT values from the real-time PCR experiment used for the t test.
A classical pathway for osteoblast precipitated osteoclast activity is through RANKL-RANK and RANKL-OPG interactions [40]. Using UniHI7, a database for human molecular interaction networks [41], we retrieved the potential interacting partners of RANKL, RANK and OPG.

Results
In the discovery cohort (n = 73) microarray dataset (GSE56814), we identified a large number of genes differentially expressed (with p < 0.05) in high (n = 42) vs. low BMD (n = 31) subjects, in premenopausal high (n = 16) vs. low BMD (n = 15) subjects, and in postmenopausal high (n = 26) vs. low (n = 16) BMD subjects. To avoid false positive findings, we focused only on the top well-annotated differentially expressed genes (which have a p value of < 5E-03) identified in the total group as well as in the subgroup (pre-or postmenopausal groups) analyses and performed replication analysis to validate these genes' differential expression in terms of BMD in our replication cohort datasets. By successful replication, we mean that a gene must achieve a p value </ = 0.10 in a replication cohort dataset with the same direction of effect in both the discovery and replication cohort datasets (e.g., upregulated in low BMD subjects in both the discovery and replication cohorts).
Using replication cohorts I (n = 80) (GSE56815), II (n = 19) (GDS1287) and III (n = 26) (GSE7158), we successfully replicated 2 genes (DAXX and PLK3). They are both apoptosis induction genes [42,43] and both down-regulated in the low BMD subjects. The p values and direction of regulation for the 2 genes in our discovery and replication cohorts are listed in Table 3. We also used Fisher's method [27,28] for meta-analysis to combine the p values achieved in the discovery and replication cohort datasets and the meta-analysis p values are also shown in the table.
In Table 3, we also include 2 other apoptosis induction genes, PDCD5 [44,45] and VDAC1 [46,47], which are replicated in the largest replication cohort, replication cohort I (n = 80). In total, there are 98 genes replicated in replication cohort I, and PDCD5 and VDAC1 are ranked at 7 th and 17 th , respectively, according to their meta-analysis p values. The detailed information of these 98 replicated genes is included in S1 Table. We selected those genes that have the same direction of regulation (i.e., down-or up-regulation in low BMD subjects) across all the four microarray datasets and performed meta-analysis 3. Comparison group means that the differential expression analysis was performed in the total or a subgroup (e.g., premenopausal group). For replication cohorts II and III, the differential expression analysis was performed in the total group. 4. NS: non-significant. across the four datasets on the selected genes using Fisher's method [27,28]. Those top 200 genes with the most significant meta-analysis p values were submitted to DAVID (Database for Annotation, Visualization and Integrated Discovery) v6.7 software package (http://david. abcc.ncifcrf.gov/) [26] for GO term enrichment analysis. The above analyses (gene selection, meta-analysis and enrichment analysis) were performed in both the total subjects as well as pre-and postmenopausal subjects separately across all the four microarray datasets.
Through the analysis, we identified five significantly enriched GO terms (enrichment p values ranging from 6.90E-04 to 6.40E-03) that are related to induction of apoptosis in the premenopausal subjects, i.e., GO:0006917~induction of apoptosis (p = 6.90E-04), GO:0012502~induction of programmed cell death (p = 7.09E-04), GO:0043065~positive regulation of apoptosis (p = 1.01E-03), GO:0043068~positive regulation of programmed cell death (p = 1.08E-03) and GO:0010942~positive regulation of cell death (p = 1.12E-03). More detailed information for these GO terms is presented in Table 4. Interestingly, among the 15 component genes for these GO terms, the majority (9 genes) were down-regulated in the low BMD subjects, suggesting an overall trend of down-regulation of apoptosis-induction in the low BMD subjects, which is consistent with the pattern as suggested in the four individual apoptosis genes as discussed above (DAXX, PLK3, PDCD5 and VDAC1).
We also submitted to DAVID for enrichment analysis the 98 differentially expressed genes that were identified in the discovery cohort and replicated in the replication cohort I (S1 Table). Through DAVID analysis of these genes, we identified a GO term "cell cycle arrest (GO:0007050)", which achieved a p value of 2.60E-04. The term contains six genes (HBP1, CUL1, CDKN2D, GAS2L1, PKD1, PPP1R15A), among which, five genes (HBP1, CDKN2D, GAS2L1, PKD1, PPP1R15A) are down-regulated in low vs. high BMD subjects (S1 Table), suggesting an overall decreased cell cycle arrest (i.e., increased proliferation) in low vs. high BMD subjects. Therefore, our data also supported "increased monocyte hyperplasia" in low BMD vs. high BMD subjects.
Using the 7 GWAS datasets, we assessed for association with hip BMD at the population level for the 4 apoptosis genes, among which, 3 genes, VDAC1, DAXX and PLK3, appear to be associated with hip BMD. The 3 genes have SNPs achieving p values of < 0.05 according to meta-analysis over the 7 GWAS datasets. The detailed information of the SNPs is presented in Table 4. GO term enrichment analysis using DAVID.  In the GEFOS dataset, the SNPs from the DAXX, PLK3 and VDAC1 genes also achieved signals of association with BMD phenotypes. The results are summarized in Table 6 Using the NIH SNP Function Prediction database (http://snpinfo.niehs.nih.gov/snpinfo/ snpfunc.htm), we checked the potential functions for the SNPs nearby the two genes DAXX and PLK3, including rs11873, rs1059231, rs2073524, rs2073525, and rs2239839 for DAXX, and rs17883049, rs6676749, rs11211036, and rs11584440 for PLK3, which were replicated in the 7 GWAS datasets for association with hip BMD (Table 5) or associated with BMD in the GEFOS dataset (Table 6).
A classical pathway for osteoblast precipitated osteoclast activity is through RANKL-RANK and RANKL-OPG interactions [40]. Using UniHI7 [41], a database for human molecular interaction networks, we retrieved the potential interacting partners of RANKL, RANK and OPG. Among these retrieved interactions, two interacting partners of RANKL, i.e., CALCR and NFKBIA, were found differentially expressed in our discovery cohort and their differential expression was also replicated in the replication cohort I (S1 Table). The first gene, CALCR, is upregulated in low vs. BMD subjects and the 2 nd gene, NFKBIA, is down-regulated in low vs. high BMD subjects (S1 Table). These two genes' potential interaction with RANKL was supported by previous studies [49,50] as well as additional bioinformatics analyses at gene co-expression and GO co-annotation levels [41]. Through real-time PCR and using a subset of the samples from our discovery cohort, we successfully confirmed the differential expression of the PLK3 gene in our premenopausal subjects' samples (p = 0.016) based on the t test. The result is plotted in Fig. 1. As shown in Fig. 1, the higher mean delta CT value in low vs. high BMD subjects indicates a lower average expression level of the PLK3 gene in low vs. high BMD subjects.
Using SPSS (IBM, Armonk, NY), we performed normality test for the ΔCT values from the real-time PCR experiment used for the t test. For example, for the PLK3 gene, for which we performed t test and found statistical significance, the normality test based on the Shapiro-Wilk test did not reach significance (p = 0.280), suggesting that the ΔCT data do not deviate from normal distribution. Therefore, the t test should be appropriate for analysis of our realtime PCR data (the ΔCT values) since the data appear to follow a normal distribution.
We have not confirmed with real-time PCR the differential expression of DAXX, PDCD5 and VDAC1. It might be due to the fact that only a subset of the samples was used for real-time PCR confirmation, which decreased the power for detecting differential expression as in our microarray experiment. The reason that we did not use all the samples as those used in our microarray experiment is that some samples did not have sufficient total RNA for real-time PCR reactions. However, the differential expression of DAXX, PDCD5 and VDAC1 was supported across-cohort by other independent microarray datasets as shown in Table 3. Indeed, acrosscohort replication using independent datasets may represent more convincing evidence than real-time PCR confirmation, which was performed only within-cohort.

Discussion
We here report a transcriptome-wide gene expression study of monocytes for osteoporosis in 73 Caucasian females (the discovery cohort) with extremely high or low hip BMD. Focusing on the top differentially expressed genes (p < 5E-03) in high vs. low BMD subjects in the total, pre-and postmenopausal groups, we attempted to replicate the genes in 3 existing independent monocyte microarray datasets (the replication cohorts I, II and III) for osteoporosis study [13,15,16]. Two genes, DAXX and PLK3, were successfully replicated across all the 3 replication cohorts, with significant (p <0.05) or marginally significant (p <0.10) p values as well as the same direction of regulation as in our discovery cohort (Table 3). Impressively, the replication was achieved not only across cohort (in replication cohorts I and II, which are made up of Caucasians), but also across "ethnicity" (in replication cohort III that is made up of Chinese subjects). The result is significant since among~20,000 human genes genome-wide under study only 2 stood out based on our stringent criteria for discovery and replication. Importantly, the two replicated genes, DAXX and PLK3, are both apoptosis induction genes [42,43] and both down-regulated in the low BMD subjects. DAXX is a well-known gene for apoptosis. By binding to Fas death domain, it can enhance Fas-mediated apoptosis [42]. DAXX is also key to the apoptosis cascade initiated by other factors and pathways, such as TGF-beta [51] and ZIP kinase [52]. PLK3 is a member of the "polo" family of serine/threonine kinases [53]. Overexpression of PLK3 can lead to incomplete cytokinesis and apoptosis [54], which might be due to PLK3's perturbation of microtubule integrity [43] and its function of centrosome localization [55] during cytokinesis.
This pattern of down-regulation of apoptosis induction genes in the low BMD subjects is also shown in those genes that were replicated in the largest replication cohort containing 80 subjects (S1 Table). Among the top 20 replicated genes (as ranked by meta-analysis p values), the 7 th and 17 th genes (PDCD5 and VDAC1, respectively) were also apoptosis genes, which were also down-regulated in the low BMD subjects. PDCD5 was found to be an early marker for apoptosis as its expression level was observed to be significantly increased in apoptotic cells and its nuclear translocation preceded cellular morphological changes of apoptosis, such as chromosome fragmentation [45]. VDAC1 plays an essential role in apoptosis in mammalian cells, in particular, the apoptosis signaling initiated by Bax-induced cytochrome c release from mitochondria [46,47]. When microinjected into cells, anti-VDAC antibodies were found to inhibit apoptosis of various forms [47].
Interestingly, through GO term enrichment analysis of the top 200 genes selected through meta-analysis of the 4 microarray datasets, we identified 5 GO terms related to induction of apoptosis ( Table 4). The pattern of down-regulation of apoptosis induction in the low BMD subjects was again supported by the fact that among the 15 component genes for these 5 GO terms the majority (9 genes) were down-regulated in the low BMD subjects. In addition, through GO term enrichment analysis of the 98 genes identified in the discovery cohort and replicated the replication cohort I, we also identified a GO term "cell cycle arrest" (p = 2.60E-04) with the majority of the component genes (5 out of 6) down-regulated in low vs. high BMD subjects. Therefore, our data may also suggest "decreased cell cycle arrest" and hence "increased monocyte hyperplasia" in low BMD vs. high BMD subjects, which provided additional support to our primary findings of decreased monocyte apoptosis in low vs. high BMD subjects. Taken together, our data suggest a decreased monocyte apoptosis level as a novel mechanism for female osteoporosis.
Apoptosis of bone-related cells, including osteoclasts, osteoblasts and osteocytes, has been recognized as a key working mechanism for bone mass regulation and osteoporosis. For example, the well-known drug for osteoporosis, bisphosphonates, can inhibit pathologic bone resorption by inducing osteoclast apoptosis [56]. Another important osteoporosis-related agent, glucocorticoids, may cause bone loss by inhibiting osteoclast apoptosis [57] and stimulating osteoblast and osteocyte apoptosis [58]. Estrogen's beneficial influence on bone may also be due to its pro-apoptotic effect on osteoclasts [59] and anti-apoptotic effect on osteoblasts [60]. To the best of our knowledge, our study is the first one to provide strong evidence for "reduced monocyte apoptosis" as another potential mechanism for osteoporosis. Given the known functions of monocytes in osteoclastogenesis [7][8][9][10] and inflammatory bone resorption [11,12], a decreased level of monocyte apoptosis (and a corresponding higher level of the cell's survival), as suggested by our gene expression data in the low BMD subjects, may contribute to more active osteoclastogenesis and bone resorption, and therefore increase osteoporosis risk. Although the mechanism suggested by our study needs further validation, its soundness is supported by previous apoptosis studies on other types of bone cells and the well-established functions of monocytes.
Among the 4 microarray datasets we used in this study, the "most significant" genes are all different. This is not unexpected since in a genomic study, where a large number of genes are tested in parallel, the strongest signal (effect) is detected quite often due to overestimation of the effect size based on the sample itself rather than the population, a common phenomenon often referred as "winner's curse" [61]. Because of that, the strongest hits in genomic studies are very difficult to replicate as those hits are quite often false positives. Due to the above consideration, our replication effort in this study does not focused on those strongest signals.
The four apoptosis genes as identified in our microarray studies are not among the list of osteoporosis genes as summarized in a Nature Reviews article [62] that reviewed the major GWAS of osteoporosis. Apart from the difficulty to replicate across studies due to an insufficient power to identify all of the causal genes at the same time in a study [63], the non-replication may also be due to the different approach used in this study (i.e., analysis at the gene expression level) from GWAS (i.e., analysis at the gene polymorphism level). Nevertheless, using the 7 GWAS datasets (Table 5) and the GEFOS dataset, we did confirm in silico association signals for the DAXX, PLK3 and VDAC1 genes, suggesting that the genes identified in this study may also contribute to BMD variation at the broad population level. In particular, the GEFOS is the largest dataset for osteoporosis GWAS meta-analysis [18]. It contains 57 distinct cohorts (http://www.gefos.org/?q=studies) for osteoporosis GWAS from Europe, Australia and Asia, with a total sample size of~33,000 [18]. Therefore, the GEFOS data used in our in silico replication here are robust and may even represent a benchmark data for SNP association signals for osteoporosis.
We used BH (Benjamini Hochberg) method [64] to control FDR (false discovery rate) of the differentially expressed genes identified in the discovery cohort. Specifically, the four DEx genes (DAXX, PLK3, PDCD5 and VDAC1) replicated in several replication cohorts (Table 3) all achieved FDR values > 0.10 in our discovery cohort, which are not significant in terms of FDR. However, our major aim here is not to look at a transcript's statistical significance in an individual dataset, but to take advantage of multiple datasets and check a transcript's consistent signals across all the datasets. In terms of the latter aspect, these four genes are indeed significant as all of them achieved nominally significant (p < 0.05) or marginally significant (p < 0.10) p values in our discovery and replication datasets. In our study, we did not use FDR for selecting genes for replication since we would like to increase the likelihood (i.e., the power) for identifying genes that achieve consistent signals in multiple datasets, as the four genes shown in Table 3.
A classical pathway for osteoblast precipitated osteoclast activity is through RANKL-RANK and RANKL-OPG interactions [40]. Using UniHI7 [41], a database for human molecular interaction networks, we retrieved the potential interacting partners of RANKL, RANK and OPG. Interestingly, among these retrieved interactions, two interacting partners of RANKL, i.e., CALCR and NFKBIA, were found differentially expressed in our discovery cohort and their differential expression was also replicated in the replication cohort I (S1 Table). However, due to the descriptive nature of our data and the limited previous research in this field, the overall effects of these two genes' (CALCR and NFKBIA) interaction with RANKL to osteoblast precipitated osteoclastogenesis cannot be fully determined in this study.
In summary, we have performed a systemic investigation of monocytes for osteoporosis risk. Our study, involving 4 independent microarray datasets for monocyte transcriptomewide gene expression analyses, provides convincing evidence for the four apoptosis induction genes' (DAXX, PLK3, VDAC1 and PDCD5) importance to BMD variation. So far, in the field of monocyte genomic analyses for osteoporosis, our study has the largest sample size and is the first one to perform replication across independent microarray datasets and also across ethnicity to support the discovery in our primary dataset, and hence the findings are robust. Moreover, the apoptosis genes identified were also confirmed by the GWAS signals from GEFOS [18], which further verified the genes' relevance to osteoporosis risk. Above all, our findings suggested a novel mechanism for osteoporosis, which is attenuation of monocyte apoptosis, as supported by the 4 apoptosis genes' (DAXX, PLK3, VDAC1 and PDCD5) down-regulation in the low BMD subjects. Our study highlights the functional importance of monocyte in bone metabolism and contributes to the novel insights into the genetic basis and pathophysiological mechanism of osteoporosis.
Supporting Information S1 Table. Genes identified in the discovery cohort (n = 73) and replicated in the replication cohort I (n = 80). (DOCX) 1. Direction of regulation is the up-or down-regulation of a gene in the low BMD subjects.
2. Comparison group indicates the total, pre-or postmenopausal subgroups, where the differential expression is detected and replicated.
3. Meta-analysis p value was calculated by Fisher's method [27,28] to combine the p values achieved in discovery cohort and replication cohort I. The genes in the table, 102 in total, are ranked by the meta-analysis p value.
5. VDAC1 is ranked at 17 th and genotype cleaning, as well as with general study coordination, was provided by the NIA Division of Geriatrics and Clinical Gerontology and the NIA Division of Aging Biology. Support for the collection of datasets and samples were provided by the parent grant, Genetic Determinants of Bone Fragility (P01-AG018397). Funding support for the genotyping which was performed at the Johns Hopkins University Center for Inherited Diseases Research was provided by the NIH NIA. The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts N01WH22110, 24152, 32100-2, 32105-6, 32108-9, 32111-13, 32115, 32118-32119, 32122, 42107-26, 42129-32, and 44221. This manuscript was not prepared in collaboration with investigators of the WHI, has not been reviewed and/or approved by the Women's Health Initiative (WHI), and does not necessarily reflect the opinions of the WHI investigators or the NHLBI.WHI PAGE is funded through the NHGRI Population Architecture Using Genomics and Epidemiology (PAGE) network (Grant Number U01 HG004790). Assistance with phenotype harmonization, SNP selection, data cleaning, meta-analyses, data management and dissemination, and general study coordination, was provided by the PAGE Coordinating Center (U01HG004801-01).The datasets used for the analyses described in this manuscript were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap through dbGaP accession phs000200.v6.p2.