Free Testosterone Drives Cancer Aggressiveness: Evidence from US Population Studies

Cancer incidence and mortality are higher in males than in females, suggesting that gender-related factors underlie this difference. To analyze this phenomenon, cancer survival data for the US population were obtained from the most recent Surveillance, Epidemiology and End Results (SEER) database. Patients with gender-specific cancers or with incomplete information were excluded, which limited the sample size to 1,194,490 patients. NHANES III provided the distribution of physiologic variables in the US population (n = 29,314). The Cox model and the Kaplan-Meier method were used to test the impact of gender on survival across age, and to calculate the gender-specific hazard ratio of dying from cancer within five years of diagnosis. The distribution of the hazard ratio across age was then compared with the distribution of 65 physiological variables assessed in NHANES III. The Spearman and Kolmogorov-Smirnov tests were used to assess the homology. Cancer survival was lower in males than in females in the age range 17 to 61 years. The risk of death from cancer in males was about 30% higher than that of females of the same age. This effect was present only in sarcomas and epithelial solid tumors with distant disease, and it was more prominent in African-Americans than in Caucasians. When compared to the variables assessed in the NHANES III study, the hazard ratio almost exactly matched the distribution of free testosterone in males; none of the other analyzed variables exhibited a similar homology. Our findings suggest that male sex hormones give rise to cancer aggressiveness in patients younger than 61 years.


Introduction
In the human species, females have longer life expectancies than males. The most recent US census data (http://www.census.gov) report that males have a life expectancy of 75.5 years and females 80.5 years. Throughout this manuscript, we will refer to this phenomenon as the "gender effect". This gender effect was masked in the past by high rates of maternal death in childbirth [1]. The effect is now clearly visible throughout the developed world; the only exceptions are underdeveloped countries, where health systems are unable to limit maternal deaths and life expectancy is still that observed in developed countries one century ago [1].
Over the last two decades, the Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute has collected information on cancer incidence, prevalence and survival in the United States. The SEER database is freely accessible and comprises geographic areas representing 28 percent of the US population. In our opinion, this database represents a useful source to address the gender effect in cancer. An analysis using the SEER database by Cook et al. in 2009 focused on gender differences in the incidence of cancer [2]. This study clearly demonstrated that the risk of malignancy is higher in males, relative to females, for a majority of cancers at most ages. A second study by Cook et al. addressed cancer mortality rate and noted a trend toward worse survival in men for a number of cancers. The authors noted that this trend tended to reflect the previously described pattern in cancer incidence [3]. One limitation of these studies is that the authors considered the gender effect as constant throughout life. In fact, at birth the differences by gender are minimal. At puberty, however, with the acquisition of sexual maturity, gender differences start to appear and ultimately peak during young adulthood. These differences begin to decrease in middle to advanced adulthood, with the decrease in gonadal sex hormone production.
The National Health and Nutrition Examination Survey (NHANES) is a survey research program conducted by the National Center for Health Statistics to assess the health and nutritional status of the US population. The survey combines interviews and physical examinations, including medical, dental, and physiological measurements, as well as laboratory tests administered by medical personnel, thus providing a snapshot of the health status of the US population.
We sought to address this gap in the field by determining the relevance of the gender effect on survival analysis with respect to age as a continuous variable and possible relation to physiological variables assessed in the NHANES III population study.

SEER database
The April 2012 release of the 1973-2009 SEER-18 Research Data in SEER*Stat version 7.0.9 was used for this study. Information from 3,133,120 patients was initially collected and used to analyze five-year cause-specific survival for all cancer sites defined in the database. Case selection was defined as actively followed cases in the research database with malignant behavior and age at diagnosis of 1 to 84 years. Cases with death certificate only or autopsy only, cases based on multiple primaries and cases alive with no survival time were excluded (n = 189,718). Only patients for whom information was available about race, tumor stage, tumor type, gender and age at diagnosis were included. The analysis excluded gender-specific sites (ovary, endometrial, vaginal, testis and prostate cancer). Breast cancer was not included because of its disproportionate frequency by gender. Details of the ICD codes of the excluded diseases are provided in Table 1. These exclusions limited the sample size to 1,194,490 patients. The SEER cause-specific death classification was set as the definition of cause of death. The primary endpoint was cause-specific survival of each patient's originally diagnosed cancer site. Cause-specific survival was censored at the last follow-up, December 31, 2009, or

NHANES III dataset
NHANES III is the seventh in a series of surveys that began in 1960 to examine the health of the US population. NHANES III sampled approximately 40,000 individuals from 1988 through 1994. One hundred thirty variables related to human health were included in the analysis. All the variables were computed as provided in the dataset. The variables taken into consideration are included in the lab file available at http://www.cdc.gov/nchs/nhanes/nh3data.htm. These variables are reported on two different scales, one in the native scale and the other after conversion to the International System of Units, thus representing 65 independent variables in two different measurement units. In particular, the free testosterone index (FTI) was calculated according to the formula FTI = (TT/SHBG) * 100, as suggested by the documentation file attached to the dataset (ftp://ftp.cdc.gov/pub/health_statistics/nchs/nhanes/nhanes3/25a/sshormon.pdf).
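As an illustration of the FTI formula quoted above, the following minimal Python sketch computes the index for one subject. The function name and the input values are hypothetical and are not taken from the NHANES III files; the sketch assumes that TT and SHBG are supplied in the units the dataset documentation expects.

```python
def free_testosterone_index(total_testosterone, shbg):
    """Return FTI = (TT / SHBG) * 100, the formula quoted in the text.

    Both arguments are assumed to be in the units expected by the
    dataset documentation; no unit conversion is attempted here.
    """
    if shbg <= 0:
        raise ValueError("SHBG must be positive")
    return (total_testosterone / shbg) * 100.0


# Hypothetical values, for illustration only.
example_fti = free_testosterone_index(total_testosterone=20.0, shbg=40.0)
```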

Statistical analysis
Cox proportional hazards models were used to estimate the male to female hazard of cause-specific mortality, defined here as the cause of death being the specific cancer originally diagnosed and death being within five years of cancer diagnosis. A Hazard Ratio (HR) value of 1 means no difference compared to the reference, while a value lower or higher than 1 means decreased or increased risk, respectively. The multivariate analysis model included the following variables: age at diagnosis, tumor stage, cancer type (sarcoma, solid tumor or hematologic malignancy), race and gender.
Figure 1. HR is not significant in the age range 1-17 years. In the age range 18-61 years it is consistently higher than 1.126, while after 62 years it is lower. At age 74 it is no longer significant, and it becomes significant again after 83 years with a value lower than 1. doi:10.1371/journal.pone.0061955.g001
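To make the meaning of the male to female HR concrete, the sketch below fits a Cox model with a single binary covariate (male = 1, female = 0) by Newton-Raphson on the partial likelihood. It is a simplified illustration, not the multivariate model used in this study: it ignores age, stage, tumor type and race, handles tied event times with the Breslow convention, and uses hypothetical toy data. All names are assumptions introduced for the example.

```python
import math


def cox_hazard_ratio(times, events, male, max_iter=25, tol=1e-10):
    """Estimate exp(beta) for one binary covariate in a Cox model.

    times  : follow-up time of each subject
    events : 1 if the death of interest was observed, 0 if censored
    male   : 1 for males, 0 for females (females are the reference)
    """
    data = list(zip(times, events, male))
    beta = 0.0
    for _ in range(max_iter):
        gradient = 0.0
        information = 0.0
        for t_i, e_i, x_i in data:
            if not e_i:
                continue  # censored subjects contribute only to risk sets
            weight_sum = 0.0
            male_weight_sum = 0.0
            for t_j, _e_j, x_j in data:
                if t_j >= t_i:  # subject j is still at risk at time t_i
                    w = math.exp(beta * x_j)
                    weight_sum += w
                    male_weight_sum += x_j * w
            p = male_weight_sum / weight_sum  # weighted share of males at risk
            gradient += x_i - p
            information += p * (1.0 - p)  # valid because the covariate is 0/1
        if information == 0.0:
            break
        step = gradient / information
        beta += step
        if abs(step) < tol:
            break
    return math.exp(beta)


# Hypothetical toy data: males tend to die earlier, so HR > 1.
toy_hr = cox_hazard_ratio(times=[1, 2, 3, 4], events=[1, 1, 1, 1], male=[1, 0, 1, 0])
```

With females as the reference, a returned value above 1 indicates a higher hazard in males, matching the interpretation of the HR given above. A real analysis would rely on a dedicated survival package and include the other covariates.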
Overall Survival (OS) was calculated from the date of diagnosis to the date of death or five years after diagnosis. Medians and life tables were computed using the product-limit estimate by the Kaplan-Meier method, and the Log-Rank test was employed to assess statistical significance. Analysis was performed using the same variables described above.
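The product-limit estimate mentioned above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the software used for the analysis: the helper names are hypothetical, follow-up is administratively censored at a chosen horizon (five years in this study), and the toy inputs are invented.

```python
from collections import Counter


def censor_at(times, events, horizon):
    """Cap follow-up at `horizon`; events after the horizon become censored."""
    capped_times = [min(t, horizon) for t in times]
    capped_events = [e if t <= horizon else 0 for t, e in zip(times, events)]
    return capped_times, capped_events


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death."""
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # deaths plus censored subjects at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk  # product-limit step
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up in years; 1 = death observed, 0 = censored.
t, e = censor_at([1, 2, 2, 3, 7], [1, 1, 0, 1, 1], horizon=5)
curve = kaplan_meier(t, e)
```

Computing one such curve per gender at each age of diagnosis gives the pair of curves that the Log-Rank test then compares.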
To assess the homology between the distribution of the HR across age and gender, the distribution of each parameter analyzed in the NHANES III dataset was computed across the available age range, and the Spearman correlation test was used to detect the presence of a statistically significant correlation. To further assess the homology between the variables, an additional analysis was conducted. A two-sample Kolmogorov-Smirnov (KS) test assessed the homology between the HR distribution and the distribution of a given variable in the NHANES III dataset. The null distribution of this statistic was calculated under the null hypothesis that the samples were drawn from the same distribution. Since the HR and the NHANES III variables have different scales, the z-score was computed for each variable according to the equation $z = (x - \mu)/\sigma$, where $\mu$ and $\sigma$ are the mean and standard deviation of the whole population, respectively. Because of the difference in size between the two databases used for this study (SEER n = 1,194,490; NHANES III n = 29,314), we used bootstrapping (n = 10,000) to sample from the SEER database a number of patients matching, at each age, the size of the NHANES III database, using the R function censboot [5]. For each bootstrap sample, a KS test was performed, and the results are expressed as % of homology, which was the % of KS tests indicating that the two samples came from the same distribution. In all cases the level of significance was set at a p value < 0.05.
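The z-score standardization and the bootstrap-based "% of homology" described above can be sketched as follows. This is a simplified stand-in and not the procedure actually run: it uses plain resampling with replacement instead of the R function censboot, it decides each KS test with the large-sample critical value rather than an exact p value, and every function name is hypothetical.

```python
import math
import random
from bisect import bisect_right
from statistics import mean, pstdev


def z_scores(values):
    """Standardize with the population mean and standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def ks_statistic(a, b):
    """Largest gap between the two empirical distribution functions."""
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        gap = max(gap, abs(fa - fb))
    return gap


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value of the two-sample KS statistic."""
    c = math.sqrt(-math.log(alpha / 2.0) / 2.0)
    return c * math.sqrt((n + m) / (n * m))


def percent_homology(large, small, n_boot=1000, alpha=0.05, seed=0):
    """Percent of bootstrap KS tests that do not reject equality."""
    rng = random.Random(seed)
    critical = ks_critical_value(len(small), len(small), alpha)
    not_rejected = 0
    for _ in range(n_boot):
        # Draw from the larger sample a resample as large as the smaller one.
        resample = rng.choices(large, k=len(small))
        if ks_statistic(resample, small) <= critical:
            not_rejected += 1
    return 100.0 * not_rejected / n_boot
```

In this sketch both inputs would be passed through `z_scores` first, so that the HR values and the physiological variable are compared on a common scale. Note that `z_scores` fails for a constant input, because the standard deviation is then zero.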

Results
A multivariate Cox proportional hazard model was generated with gender, race, stage, tumor type as categorical variables and age as a continuous variable. The outcome variable was five-year survival. After excluding for gender-specific cancers (  Thereafter, we adopted the same Cox proportional hazard model and calculated the HR over the entire age range (0 to 84 years). As depicted in Fig. 1, we used females as reference (HR = 1). No significant effects were noticed in the age range 0-17 years. From 18 to 41 years, the HR increased to average at about 1.5 and began to decrease thereafter. At the age of 61 years, the HR was below 1.13, not significant at the age of 74 years and significantly less than 1 at the age of 83 years. This led us to stratify patients in two age ranges: 17-61 years (Table 3) and 62-84 years (Table 4). We applied the same model and again found that all variables were highly significant at p < 0.00001. The gender effect was more prominent in the age range 17-61 years, with a difference of about 30% in terms of the HR compared with patients over the age of 62 years. To further investigate this phenomenon, Kaplan-Meier analysis was conducted with the same dataset. Differences in survival between females and males were computed at each age and the Log-Rank test was used to assess whether the variation was significant at a p value < 0.05 (Figure 2, Video S1). Until the age of 17 years, no significant changes were noticed. Starting from 18 years of age, an increasing and statistically significant difference was found. This effect peaked around 27 years and slightly decreased thereafter, remaining significant until the age of 63 years. After 70 years, the opposite phenomenon was noticed, with males having a slight but significant survival advantage. In terms of racial groups, the gender effect was more prominent in African-Americans than in Caucasians (Fig. 3A).
In terms of tumor type, the gender effect was equally represented in sarcomas and epithelial solid tumors, but not in hematopoietic malignancies (Fig. 3B). In terms of staging, the gender effect was maximal in the most aggressive tumors with metastatic disease, slightly displayed in tumors with regional involvement and inverted in patients with localized disease, with males barely outliving females (Fig. 3C).
To identify potential biological causes of the gender effect, we analyzed a series of physiologic variables assessed in the NHANES     Table 5). The strongest correlation was noticed for the Free Testosterone Index (FTI) in males, with an R value of 0.9 (Fig. 4). The homology between the HR and the NHANES III variables was then computed for all the available variables across the available age range. The KS test assessed the hypothesis that the HR and a given NHANES III parameter followed in whole or in part the same distribution. This analysis was performed independently for each gender (Table 6). A striking full homology (100%) was observed for FTI in males, as the two distributions did not differ significantly across the entire age range (Fig. 5A). None of the other variables exhibited a similar degree of concordance with the HR distribution. The second strongest homology was observed for hemoglobin in males (11.9%, Fig. 5B), where the homology was mostly confined to the age range 17-27 years. Notably, hemoglobin in females did not show a comparable homology with the HR (Fig. 5B).
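For readers unfamiliar with the Spearman statistic reported above, the following sketch shows how a rank correlation between two age-indexed series could be computed. It is illustrative only: the function names are hypothetical, tied values receive average ranks, and no study data are reproduced here.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Applied to the HR at each age and the mean value of a physiological variable at the same ages, a coefficient near 1 indicates that the two series rise and fall together.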

Discussion
In this era of personalized medicine, research is now focused on identifying specific biomarkers to tailor the therapeutic approach to both the disease and the unique genetic makeup of the patient. This concept is rapidly advancing in oncology, where differences in the genetic composition of a single tumor may be exploited to select individual targeted therapies. Until now, the search for personalized therapeutic strategies has not taken the impact of gender into consideration. Our study emphasizes that gender may be responsible for significant differences in cancer outcome, predominantly in patients 17-61 years of age. These differences have been underestimated in previous studies that did not consider the significance of age for the gender effect. To our knowledge, this is the first study that systematically investigates the gender effect stratified over age as a continuous variable using US population data. The only other study that has investigated the gender effect with reference to age was conducted in Europe and found a 5% gender-based survival difference [6], as compared to the 30% effect reported here. This discrepancy could be explained by the heterogeneous European database, overrepresentation of older patients or the study design, in which age ranges were chosen arbitrarily [6]. Our initial hypothesis was that, if present, the gender effect would be influenced by age, since the hormonal differences between males and females are maximal in the fertile years; similarly, we expected the gender effect to decrease in influence following those years. This hypothesis was confirmed by our analysis, since the gender effect peaked during the fertile years, when hormonal differences by gender are maximal. What determines the gender effect? So far, the concept of hormone-dependent disease has been confined to prostate and breast cancer, where anti-hormone strategies are principal modalities of therapy.
Our findings suggest that sex hormones, more generally, could be key drivers of a malignancy's aggressiveness, particularly when cancer is developed at a young age, and may thus be exploited to increase cancer survival rates.
Another important finding in our study is that, out of 65 physiologic variables [7], free testosterone displayed the strongest homology to the HR. The gender effect has traditionally been explained as resulting from differences in estrogen levels in the female population, with a focus on a female's pre- or postmenopausal status. Our study strongly suggests that androgens could also be involved in driving the gender effect. Indeed, the amount of free testosterone in males is not constant throughout life [8]. Rather, levels increase at around 17 years, peak in the mid-twenties and gradually decrease thereafter until returning to pre-puberty levels at the age of 61 years.
In our study, the gender effect was more prominent in African-Americans than in Caucasians. In the US population, young African-Americans exhibit higher bone density and muscle mass [9], both of which have been related to increased androgen levels [10]. At the same time, there is also an increased risk of prostate cancer in African-Americans, which has been correlated with higher levels of androgens [11]. For these reasons, African-Americans may benefit more from a therapeutic manipulation of hormonal levels aimed at increasing the effects of metastatic cancer treatments. In addition to free testosterone, we noticed that the amount of hemoglobin also displayed a significant correlation with the gender effect in males but not in females. Hemoglobin levels are known to depend on free testosterone levels in males [12], thus strengthening the biological link between the HR trend in males and circulating androgen levels.
How are androgens involved in cancer mortality? Here we reported that the gender effect is not visible in all patients, but only when the disease is solid (epithelial tumors and sarcomas but not hematological malignancies) and at an advanced stage, which would require additional treatments. This fact suggests a relationship between the gender effect and response/resistance to the treatments used for metastatic cancers. Recently, androgens have been reported to activate a prosurvival pathway in colorectal cancer through the overexpression of class III and V β-tubulin isotypes [13]. Class III β-tubulin is part of an adaptive survival pathway to a harsh microenvironment characterized by hypoxia [14] and poor nutrient supply [15]. In this context, androgens could activate a survival pathway regardless of exposure to such a microenvironment, making a cancer more aggressive and resistant to anoikis, which occurs in the setting of low oxygen and nutrient supply. This would enable cancer cells to metastasize locally and distantly and to escape from cancer treatments. These processes could establish a biological basis for an androgen-dependent gender effect.
But do estrogen levels exert some protective effect on cancer survival? This hypothesis, proposed by Adami et al. in 1990 [16], cannot be directly excluded in our study, as the NHANES III dataset did not analyze estrogen levels in its female population. However, other female-specific sex hormones whose expression is directly related to estrogen levels, such as FSH and LH [17], were investigated in the NHANES III population. None of these hormones exhibited a direct relationship with the HR distribution with either the Spearman or the KS test. Moreover, estrogen production in females peaks at around 12 years of age [18] and decreases at around 50-51 years of age [19], which is earlier than the pattern of free testosterone in males. Such physiological observations suggest that the curve of estrogen production does not match the HR distribution over age reported here.
The major limitation of our study is that all the results are derived from population studies and that the SEER database and NHANES III include cancer patients and healthy subjects, respectively. Therefore, our analysis drew on two independent datasets, and the data did not come from the same patients. However, this risk is partially mitigated by the size of the studied populations and by the fact that they were drawn from across the US, which decreases the risk of bias from specific treatments delivered in a single institution.
Nevertheless, our data emphasize the new hypothesis that androgens, rather than estrogens, could be drivers of the gender effect. An array of antiandrogen therapies has been developed for the management of prostate cancer, including drugs that also decrease tissue production of androgens [20]. Our population study supports the need for prospective clinical trials to test whether young male cancer patients (aged less than 61 years) with metastatic disease could benefit from therapeutic modulation of male hormone levels.

Supporting Information
Video S1 Survival analysis from age 0 to 84. The video is generated by Kaplan-Meier analysis of the data presented in the manuscript from age 1 to 84, and the animation is obtained by overlaying the 84 images. For each age, blue and red lines indicate the survival curves for males and females, respectively. (MP4)