Browse Subject Areas

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Reclassification of breast cancer: Towards improved diagnosis and outcome

  • Alexander P. Landry ,

    Contributed equally to this work with: Alexander P. Landry, Zsolt Zador

    Roles Data curation, Formal analysis, Investigation, Writing – original draft, Writing – review & editing

    Affiliations Department of Surgery, St. Michael’s Hospital, Toronto, ON, Canada, Faculty of Medicine, University of Toronto, Toronto, ON, Canada

  • Zsolt Zador ,

    Contributed equally to this work with: Alexander P. Landry, Zsolt Zador

    Roles Conceptualization, Formal analysis, Methodology, Project administration, Supervision, Writing – review & editing

    Affiliation Department of Surgery, St. Michael’s Hospital, Toronto, ON, Canada

  • Rashida Haq,

    Roles Supervision, Writing – review & editing

    Affiliation Division of Hematology/Oncology, St. Michael’s Hospital, Toronto, ON, Canada

  • Michael D. Cusimano

    Roles Conceptualization, Project administration, Supervision, Writing – review & editing

    Affiliations Department of Surgery, St. Michael’s Hospital, Toronto, ON, Canada, Faculty of Medicine, University of Toronto, Toronto, ON, Canada, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada

Reclassification of breast cancer: Towards improved diagnosis and outcome

  • Alexander P. Landry, 
  • Zsolt Zador, 
  • Rashida Haq, 
  • Michael D. Cusimano



The subtyping of breast cancer based on features of tumour biology such as hormonal receptor and HER2 status has led to increasingly patient-specific treatment and thus improved outcomes. However, such subgroups may not be sufficiently informed to best predict outcome and/or treatment response. The incorporation of multi-modal data may identify unexpected and actionable subgroups to enhance disease understanding and improve outcomes.


This retrospective cross-sectional study used the cancer registry Surveillance, Epidemiology and End Results (SEER), which represents 28% of the U.S. population. We included adult female patients diagnosed with breast cancer in 2010. Latent class analysis (LCA), a data-driven technique, was used to identify clinically homogeneous subgroups (“endophenotypes”) of breast cancer from receptor status (hormonal receptor and HER2), clinical, and demographic data and each subgroup was explored using Bayesian networks.


Included were 44,346 patients, 1257 (3%) of whom had distant organ metastases at diagnosis. Four endophenotypes were identified with LCA: 1) “Favourable biology” had entirely local disease with favourable biology, 2) “HGHR-” had the highest incidence of HR- receptor status and highest grade but few metastases and relatively good outcomes, 3) “HR+ bone” had isolated bone metastases and uniform receptor status (HR+/HER2-), and 4) “Distant organ spread” had high metastatic burden and poor survival. Bayesian networks revealed clinically intuitive interactions between patient and disease features.


We have identified four distinct subgroups of breast cancer using LCA, including one unexpected group with good outcomes despite having the highest average histologic grade and rate of HR- tumours. Deeper understanding of subgroup characteristics can allow us to 1) identify actionable group properties relating to disease biology and patient features and 2) develop group-specific diagnostics and treatments.


Significant research efforts have been undertaken to identify markers of breast cancer biology and prognosis. Gene expression profiling has elucidated five “intrinsic” subgroups of the disease: luminal A, luminal B, HER2-overexpressing, basal-like, and normal-like[1,2]. Conveniently, hormone receptors (HR; estrogen and progesterone) and human epidermal growth factor receptor 2 (HER2) function as classification biomarkers; their status can approximate intrinsic subtypes without necessitating cumbersome genetic profiles (luminal and normal-like are HR+, HER2-overexpressing is HR-/HER2+, and basal-like is HR-/HER2-)[3,4]. The HR-/HER2- subtype, also called “triple-negative” (TN) disease since it lacks all three receptors, is associated with the worst prognosis[2,5]. Importantly, the molecular classification of breast cancer has led to the development of targeted therapies which have improved survival in primary disease[6]. However, while HER2-targeted treatments also show promise in the setting of metastasis[7], outcomes in metastatic disease have not improved significantly over the last several years[7,8] and only about 30% of stage 4 breast cancers are HER2+[4] and thus amenable to this therapy. Further, the effectiveness of HER2 treatment in the setting of brain metastasis is limited by blood brain barrier (BBB) impermeability[9]. Novel classification systems have the potential to improve our understanding of metastatic breast cancer and further inform the clinical management of this challenging disease state.

The clinical course and prognosis of breast cancer are highly variable[10,11]. Identifying mutually exclusive subgroups of patients within a heterogeneous patient population is called endophenotype discovery[1214], and has translated into practice-changing recommendations in the management of asthma[12], ARDS[13] and sepsis[14]. Latent class analysis (LCA) has been used for this purpose in the clinical setting[13,15,16], and is preferred over conventional clustering[17]. This technique is based on the assumption that there exists an unobserved categorical variable, inferable from a set of observed variables, which groups subjects into “latent” classes based on individual similarities[18]. The technique of endophenotype discovery can ultimately inform subgroup-specific treatment plans (e.g. identifying tumour-specific drugs), trial design (e.g. testing prophylactic irradiation strategies for high risk groups) and prognostication. Bayesian Network Analysis (BNA), a machine learning based method for computing and visualizing probabilistic associations within a dataset[19], is useful for further data exploration and hypothesis generation, and has been applied to hepatocellular carcinoma[20], lung cancer[21], traumatic brain injury[19], and subarachnoid hemorrhage[22]. Network output is visualized with a directed acyclic graph (DAG), wherein probabilistic associations between nodes (parameters) are indicated by edges (arrows). The utility of these novel approaches in breast cancer has not been adequately explored, and it may provide further insight into this heterogeneous disease.

The purpose of this work is to identify novel subgroups of breast cancer using a large cancer repository, and explore groups for whom current classification systems are inadequate. This will serve as an important guide for future work aimed at constructing deeper, more informative endophenotypes of this disease.


Data collection

Data was extracted from the 2016 edition of the open-source cancer registry Surveillance, Epidemiology and End Results (SEER). SEER encodes data on individual cancer cases in the USA from 18 population-based registries and represents approximately 28% of the country’s population[23]. Sites of distant metastases and hormone receptor/HER2 statuses of primary breast tumours have been recorded since 2010. Consequently, we limited our study to cases diagnosed in 2010 to maximize follow up and minimize time-related cohort heterogeneity. Demographic information for each case included age (<50 vs. 50–69 vs. ≥70 years old)[4], race (white vs. black vs. other), marital status (married vs. unmarried) and health insurance status (insured vs. uninsured). Disease biology at diagnosis was described by estrogen and progesterone receptor status (present vs. absent vs. borderline), HER2 status (present vs. absent), and grade (1–4) of the primary tumour. Primary tumour size (<2 cm vs. 2–4.9 cm vs. ≥5 cm)[4] as well as lymph node and distant organ (brain, liver, lung, bone) metastases (present vs. absent) served to describe disease burden at the time of diagnosis. Finally, we assessed overall survival based on the most recent documented follow-up (dead vs. alive). Following a recent SEER-based study from Gong et al[4], estrogen receptor (ER) and progesterone receptor (PR) status coded as “borderline” in the database were defined as positive in our study.

Inclusion/Exclusion criteria

This study included female patients from the 2016 SEER registry who were diagnosed with breast cancer during the year 2010. All included patients were at least 18 years old at the time of diagnosis. We required cases to have been diagnosed by histology and excluded those with a diagnosis made at autopsy or based on a death certificate alone[4]. We also excluded cases if any included parameters (defined in the previous subsection) were coded as “unknown” in the database. Similarly, cases with “borderline” HER2 status were not included in the analysis[4].

Endophenotype discovery and hypothesis generation

Comparison was first made between patients with and without distant organ metastases at diagnosis using contingency tables for each parameter and applying a chi-squared test of independence. Building and testing an LCA model then served as the primary tool for endophenotype discovery. Since these models are best suited for binary classification variables, we used the status of receptors (HR, HER2) and metastases (lymph nodes, brain, lung, liver, and bone) for this purpose. The optimal number of classes was determined by computing the Bayesian Information Criterion (BIC) for models with between 2 and 7 classes and selecting the configuration which minimized the BIC. Contingency tables were generated for each parameter and a chi-square test used to probe inter-class differences. For descriptive purposes, the LCA was repeated 10 times to observe the degree of reproducibility.

Bayesian networks were used to explore variable interactions within each endophenotype and thus further probe the characteristics of each class. A hill-climbing algorithm was used to train the networks[19], and BIC values were used to represent association strengths (with an inverse proportionality). In the DAG representation, arrows with tails at the “survival” node were omitted, as a distal outcome cannot logically influence a patient’s state at diagnosis. Results from the BNA were validated using structural equation modelling (SEM), a statistical tool capable of assessing the fit of a theoretical model on a dataset using inter-related multiple regression models. Root mean squared error of approximation (RMSEA) and standardized root mean squared residual (SRMR), both common measures of absolute model fit (wherein the former includes a penalty for model complexity and the latter does not) were used to assess the Bayesian model configurations. We consider fits to be adequate when both RMSEA and SRMR are < 0.05 (based on recommended cutoffs of 0.06 and 0.08, respectively[24]).

Notably, both LCA and BNA were used to explore potential metastatic “drivers” (factors which predispose to metastasis) and “organ-selectors” (factors which influence the location of a metastasis, without necessarily affecting the probability of one actually occurring). A p-value ≤ 0.001 was considered significant in this study.

Computational platform

All analysis and modelling used R, an open-source platform for statistical computation and graphics[25]. The packages “poLCA”[26], “bnlearn”[27], and “lavaan”[28] were used for LCA, BNA, and SEM, respectively.


Patient characteristics

An overview of patient characteristics is presented in Table 1. The majority are white (81.1%), insured (98.4%) and married (72.1%). Most primary tumours are <2 cm (57.2%) and HR+/HER2- (73.3%). About one third of patients have lymph node metastases at diagnosis (32%) but distant organ metastases are rare, with bone being the most common site (2.1%) and brain being the least common (0.2%). Most patients are alive at the most recent documented follow up (86.3%). In the comparison between those with and without organ metastases at diagnosis, all parameters except age and marital status are significantly different (p<0.001, chi square). Patients presenting with metastasis are more likely to be black, uninsured, and unmarried but these differences are small in absolute terms. The metastasis group also has larger and higher grade primary tumours which are more likely to be HR-, and be associated with greater lymph node burden and comparatively poor survival.

Endophenotype discovery with latent class analysis

A four-latent-class model was selected based on BIC values, which demonstrate good reproducibility (Table 2). LCA outcome probabilities are shown in Fig 1, and a frequency table of class compositions is presented in Table 3. We observe unambiguous HR/HER2 statuses in classes 1 and 4 (high likelihood of HR+/HER2-) yet greater heterogeneity in classes 2 and 3, a high probability of metastases in group 3, and varying probabilities of lymph node metastasis between groups (Fig 1). In terms of case distribution (Table 3), class 4 accounts for the majority of cases, and is associated with the best survival. Patients are more likely to be older and white and none have lymph node or distant organ metastases at diagnosis. This group also has the smallest, lowest grade primary tumours which are mostly HR+/HER2-. Class 2 is the youngest, with higher grade tumours which are more likely to be HR-/HER2+ or TN. Despite higher grade and HR/HER2 status, tumour size and lymph node involvement are moderate, there are very few distant organ metastases (none to bone; no patients have metastases to more than one site), and survival is quite good. Class 1, which is entirely HR+/HER2-, has an especially high lymph node burden and contains approximately half the total number of bone metastases (without any metastases to other sites). Class 3 has the worst prognosis and accounts for the vast majority of metastatic burden at diagnosis (all patients have distant organ involvement); patients are also more likely to have larger primary tumours and to be black, uninsured, and unmarried. Interestingly, this group has fewer lymph node metastases than class 1 and tumours are lower grade and are less likely to be HR-/HER2+ or TN compared to class 2. Based on these observations, we will refer to these endophenotypes as follows: favourable biology (class 4), HGHR- (class 2), HR+ bone (class 1), and distant organ spread (class 3). In Table 3, there are highly significant differences (p<0.001, chi square) between classes for all parameters. Notably, similar LCA outputs were observed in 9/10 repetitions.

Fig 1. LCA model output.

A: Probabilistic distribution of patients by latent class. Note that this represents the probability of a patient being classified in each subgroup, and is not equivalent to the distribution of cases in this study (Table 3). B: Outcome probabilities for each class. Each bar represents the probability of a parameter being positive in a particular class. Bars are colour coded based on the following groupings: HR/HER2 subtype (blue), distant organ metastases (orange), and lymph node metastases (green).

Table 2. LCA model optimization.

Note that BIC is minimized with 4 classes.

Bayesian network analysis

We present the DAG outputs of the BNA in Fig 2. In class 1, age and race are proximal features which are associated with disease-specific parameters (grade, spread) and survival. Class 2 is notable for having tumour size and spread (nodes, liver, lung) as proximal variables influencing HR/HER2 subtype, demographic features and survival. Age and HER2 status are directly associated with survival, though the hormone receptor status has no significant effect. In class 3, demographic features don’t appear to have a significant effect on disease or survival despite the increased proportion of patients who are black, uninsured, and unmarried. Receptor subtype affects survival and HR status influences site of metastasis (bone/lung), though there is no significant association between site of metastasis and mortality. Finally, age directly influences all variables (except HR status) in class 4, and survival is also a direct function of size and HR status.

Fig 2. DAG plots for each latent class.

Arrow thickness represents the strength of association, which is inversely proportional to BIC. Dotted lines represent associations which are ≤5% of the strength of the strongest association. Node colours are grouped as follows: host features (blue), tumour biology (orange), disease burden (green), and overall survival (grey). Red boxes indicate interactions which were not validated by the SEM regression (p > 0.001). RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual; “surv” = survival; “insur” = insurance.

Structural equation modelling

A single discrepancy is noted between BNA and SEM in classes 2 and 4 (Fig 2). However, each Bayesian model achieved an adequate fit (with the discrepant associations removed) based on the RMSEA and SRMR values, indicating they are plausible constructs with which to describe this data.


Overview of key results

Latent class analysis identified four endophenotypes of breast cancer: favourable biology, with biologically favourable and locally contained tumours; HGHR- (“high grade HR-“), with the highest grade tumours on average and largely HR- receptor subtypes, yet limited disease spread and good survival; HR+ bone, described by HR+/HER2- subtype, lymph node and isolated lung metastases, and good survival; distant organ spread, which represents the vast majority of metastatic burden and death despite less aggressive tumour grade and HR/HER2 status than the HGHR- cohort. Bayesian networks identified differences in variable interactions among classes as well.

The “HGHR-” class has unexpected features

The HGHR- group is of particular importance due to its unexpected features. As the cohort with the highest grade tumours which are predominantly HR-/HER2+ and TN, its histopathology would classically be considered unfavourable. It is therefore surprising that this is not the group with the highest metastatic burden and lowest survival. It seems that while HR/HER2 status may be an “organ-selector” of metastasis it may not be a key “driver” of them; i.e. cannot be used to reliably predict whether or not a metastasis will occur in the first place. This postulate is based on the fact that this endophenotype has so few metastases and that HR status does not significantly drive metastases in this cohort’s BNA. Alternatively, these patients may still develop metastases with time but have prolonged survival nevertheless.

The relationship between hormone receptor and HER2 subtype and preferential site of metastasis is not new, and it is known that HR+ tumours are associated with spread to bone while HR- tumours primarily distribute to liver, lung, or brain[4]. This is corroborated by observing differences between HR+ bone (HR+ only, bone is the only site of metastasis) and HGHR- (mostly HR-, no bone metastases) and by noting the influence of HR status on site of metastasis in the distant organ spread BNA (Fig 2; since all patients have metastases, influence is purely on location). Nevertheless, there is considerable overlap in sites of tumour spread[4] and predictions are largely limited to “bone vs. non-bone”. These observations indicate that HR/HER2 subtype alone is not a sensitive nor specific predictor of metastasis, and clearly suggest a need for further exploration of prediction models.

Social factors may play a role in metastasis and outcome

There appears to be a relationship between socioeconomic status and disease burden/outcome as observed in the distant organ spread endophenotype (Table 3). Additionally, race directly influences tumour grade in the HR+ bone BNA (Fig 2) and the relationship between race and HR (which, in turn, is related to tumour grade) has been suggested to be mediated by socioeconomic factors rather than genetics alone[29]. The relevance of host features is not surprising as it has previously been shown that indicators of socioeconomic status including age, marital status, and occupational status are predictors of breast cancer mortality[4,30,31] and stage at diagnosis[30]. This may be mediated by a lack of access to healthcare, differences in exposures/behaviours, and/or underlying biology. For example, better access to care may have played a role in the HGHR- group’s favourable outcomes. Addressing barriers of care will be critical in this subset of patients in order to prevent early disease spread and consequent poor outcomes.


Several limitations of this study must be considered. This study is cross sectional in nature. SEER does not record data beyond the time of diagnosis, making it impossible to track disease evolution (i.e. the development of new metastases with time). This hindered our ability to assess metastatic risk based on a patient’s state at diagnosis. The nature of the dataset, with the rarity of distant organ metastases at diagnosis, may also have limited the statistical power of our results, particularly in the BNA. This limited our abilities to draw conclusions, particularly relating to brain and liver metastases which are associated with particularly dismal outcomes and are therefore very important. In addition, molecular classification available in SEER was based on HER2 and HR status alone, which is not fully concordant with gene expression-derived intrinsic subtypes[2]. Similarly, we were not able to include potentially important parameters such as gene expression data and more extensive host information due to limited data availability. This also precluded the analysis of metastasis biology, allowing the possibility of receptor conversion from the primary tumour[9,32]. Finally, despite its size, SEER may not be fully representative of the US breast cancer population and does not consider populations outside the US. Nevertheless, the purpose of this work was to generate hypotheses using data covering significant segments of the population in order to guide the construction of deeper endophenotypes using other, more complete data sources.

Future directions

Validation of our results by independently detecting similar subgroups based on prospectively gathered clinical, demographic, and molecular data will be an important next step. Subsequently, deriving endophenotypes of breast cancer from comprehensive gene expression profiles, in tandem with richer host phenotypes (i.e. patient characteristics), may lead to more holistic and informative classification models. Inclusion of all previously identified genes of significance in these models should be attempted in order to ensure completeness. Consequent identification of the precise phenotypes which are associated with metastases on a longitudinal basis may improve our understanding of this disease state, allow for better prevention and screening, and ultimately lead to more effective treatments.


Data-driven techniques using receptor (HR and HER2) subtypes, host factors and metastatic potential from the SEER dataset identified novel unexpected endophenotypes of breast cancer and suggest the need for better classification using deeper models. Future work to expand on the endophenotypes we have described using more complete data and longitudinal designs will lead to a better ability to understand, predict, and treat metastases and ultimately improve outcomes in these patients.


  1. 1. Perou CM, Sørile T, Eisen MB, Van De Rijn M, Jeffrey SS, Ress CA, et al. Molecular portraits of human breast tumours. Nature. 2000;406: 747–752. pmid:10963602
  2. 2. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A. 2001;98: 10869–10874. pmid:11553815
  3. 3. Dai X, Li T, Bai Z, Yang Y, Liu X, Zhan J, et al. Breast cancer intrinsic subtype classification, clinical use and future trends. Am J Cancer Res. 2015;5: 2929–2943. pmid:26693050
  4. 4. Gong Y, Liu Y-R, Ji P, Hu X, Shao Z-M. Impact of molecular subtypes on metastatic breast cancer patients: a SEER population-based study. Sci Rep. 2017;7: 45411. pmid:28345619
  5. 5. Fan C, Oh DS, Wessels L, Weigelt B, Nuyten DSA, Nobel AB, et al. Concordance among Gene-Expression–Based Predictors for Breast Cancer. N Engl J Med. 2006;355: 560–569. pmid:16899776
  6. 6. Lim YJ, Lee SW, Choi N, Kwon J, Eom KY, Kang E, et al. Failure patterns according to molecular subtype in patients with invasive breast cancer following postoperative adjuvant radiotherapy: long-term outcomes in contemporary clinical practice. Breast Cancer Res Treat. 2017;163: 555–563. pmid:28315066
  7. 7. Gobbini E, Ezzalfani M, Dieras V, Bachelot T, Brain E, Debled M, et al. Time trends of overall survival among metastatic breast cancer patients in the real-life ESME cohort. Eur J Cancer. 2018;96: 17–24. pmid:29660596
  8. 8. Dawood S, Broglio K, Gonzalez-Angulo AM, Buzdar AU, Hortobagyi GN, Giordano SH. Trends in survival over the past two decades among white and black patients with newly diagnosed stage IV breast cancer. J Clin Oncol. 2008;26: 4891–8. pmid:18725649
  9. 9. Shao MM, Liu J, Vong JS, Niu Y, Germin B, Tang P, et al. A subset of breast cancer predisposes to brain metastasis. Med Mol Morphol. 2011;44: 15–20. pmid:21424932
  10. 10. Molnár IA, Molnár BÁ, Vízkeleti L, Fekete K, Tamás J, Deák P, et al. Breast carcinoma subtypes show different patterns of metastatic behavior. Virchows Arch. 2017;470: 275–283. pmid:28101678
  11. 11. Nieder C, Spanne O, Mehta MP, Grosu AL, Geinitz H. Presentation, patterns of care, and survival in patients with brain metastases: What has changed in the last 20 years? Cancer. 2011;117: 2505–2512. pmid:24048799
  12. 12. Anderson GP. Endotyping asthma: new insights into key pathogenic mechanisms in a complex, heterogeneous disease. Lancet. 2008;372: 1107–1119. pmid:18805339
  13. 13. Calfee CS, Delucchi K, Parsons PE, Taylor B, Ware LB, Matthay MA. Latent Class Analysis of ARDS Subphenotypes: Analysis of Data From Two Randomized Controlled Trials. Lancet Respir Med. 2014;2: 611–620. pmid:24853585
  14. 14. Scicluna BP, van Vught LA, Zwinderman AH, Wiewel MA, Davenport EE, Burnham KL, et al. Classification of patients with sepsis according to blood genomic endotype: a prospective cohort study. Lancet Respir Med. 2017;5: 816–826. pmid:28864056
  15. 15. Depner M, Fuchs O, Genuneit J, Karvonen AM, Hyvärinen A, Kaulek V, et al. Clinical and epidemiologic phenotypes of childhood asthma. Am J Respir Crit Care Med. 2014;189: 129–138. pmid:24283801
  16. 16. Ezzedine K, Le Thuaut A, Jouary T, Ballanger F, Taieb A, Bastuji-Garin S. Latent class analysis of a series of 717 patients with vitiligo allows the identification of two clinical subtypes. Pigment Cell Melanoma Res. 2014;27: 134–139. pmid:24127636
  17. 17. Schreiber JB, Pekarik AJ. Technical Note: Using Latent Class Analysis versus K-means or Hierarchical Clustering to Understand Museum Visitors. Curator Museum J. 2014;57: 45–59.
  18. 18. Lanza ST, Rhoades BL. Latent Class Analysis: An Alternative Perspective on Subgroup Analysis in Prevention and Treatment. Prev Sci. 2013;14: 157–168. pmid:21318625
  19. 19. Zador Z, Sperrin M, King AT. Predictors of outcome in traumatic brain injury: New insight using receiver operating curve indices and Bayesian network analysis. PLoS One. 2016;11: 1–18. pmid:27388421
  20. 20. Cai Z, Si S, Chen C, Zhao Y, Ma Y, Wang L, et al. Analysis of Prognostic Factors for Survival after Hepatectomy for Hepatocellular Carcinoma Based on a Bayesian Network. PLoS One. 2015;10: e0120805. pmid:25826337
  21. 21. Berkan Sesen M, Nicholson AE, Banares-Alcantara R, Kadir T, Brady M. Bayesian networks for clinical decision support in lung cancer care. PLoS One. 2013;8. pmid:24324773
  22. 22. Zador Z, Huang W, Sperrin M, Lawton MT. Multivariable and Bayesian Network Analysis of Outcome Predictors in Acute Aneurysmal Subarachnoid Hemorrhage: Review of a Pure Surgical Series in the Postinternational Subarachnoid Aneurysm Trial Era. Oper Neurosurg. 2017; pmid:28973260
  23. 23. Nih . Surveillance epidemiology and end results. 2012;2: 2–4. Available:
  24. 24. Hu LT, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Struct Equ Model. 1999;6: 1–55.
  25. 25. R Development Core Team. R: A Language and Environment for Statistical Computing. R Found Stat Comput Vienna Austria. 2016;0: {ISBN} 3-900051-07-0.
  26. 26. Linzer DA, Lewis JB. poLCA: An R Package for Polytomous Variable Latent Class Analysis. J Stat Softw. 2011;42: 1–29.
  27. 27. Scutari M. Learning Bayesian Networks with the bnlearn R Package. J Stat Softw. 2010;35: 1–22.
  28. 28. Rosseel Y. lavaan: An R Package for Structural Equation Modeling. J Stat Softw. 2012;48: 1–36.
  29. 29. Gordon NH. Socioeconomic factors and breast cancer in black and white Americans. Cancer Metastasis Rev. 2003;22: 55–65. pmid:12716037
  30. 30. Gentil-Brevet J, Colonna M, Danzon A, Grosclaude P, Chaplain G, Velten M, et al. The influence of socio-economic and surveillance characteristics on breast cancer survival: A French population-based study. Br J Cancer. 2008;98: 217–224. pmid:18182980
  31. 31. Ren Z, Li Y, Shen T, Hameed O, Siegal GP, Wei S. Prognostic factors in advanced breast cancer: Race and receptor status are significant after development of metastasis. Pathol Res Pract. Elsevier GmbH.; 2016;212: 24–30. pmid:26616114
  32. 32. Hoefnagel LDC, van de Vijver MJ, van Slooten HJ, Wesseling P, Wesseling J, Westenend PJ, et al. Receptor conversion in distant breast cancer metastases. Breast Cancer Res. BioMed Central Ltd; 2010;12: R75. pmid:20863372