Systems epidemiology offers a more comprehensive and holistic approach to studies of cancer in populations by considering high-dimensional measures from multiple domains, assessing the inter-relationships among risk factors, and considering changes over time. These approaches offer a framework to account for the complexity of cancer and contribute to a broader understanding of the disease. Therefore, NCI sponsored a workshop in February 2019 to facilitate discussion about the opportunities and challenges of applying systems epidemiology approaches to cancer research. Eight key themes emerged from the discussion: transdisciplinary collaboration and a problem-based approach; methods and modeling considerations; interpretation, validation, and evaluation of models; data needs and opportunities; sharing of data and models; enhanced training practices; dissemination of systems models; and building a systems epidemiology community. This manuscript summarizes these themes, highlights opportunities for cancer systems epidemiology research, outlines ways to foster this research area, and introduces a collection of papers, “Cancer System Epidemiology Insights and Future Opportunities”, that highlight findings based on systems epidemiology approaches.
Citation: Barajas R, Hair B, Lai G, Rotunno M, Shams-White MM, Gillanders EM, et al. (2021) Facilitating cancer systems epidemiology research. PLoS ONE 16(12): e0255328. https://doi.org/10.1371/journal.pone.0255328
Editor: Yi Jiang, Georgia State University, UNITED STATES
Published: December 31, 2021
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This research was supported by the Division of Cancer Control and Population Sciences at the National Cancer Institute (NCI), National Institutes of Health (NIH). The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the National Cancer Institute.
Competing interests: The authors have declared that no competing interests exist.
Epidemiology research has been successful at identifying many risk factors for complex diseases such as cancer, but much of the etiology remains unexplained. This may be due, in part, to the limited focus of many studies on a small number of risk factors or contributors to disease within specific domains or measures. Moreover, many studies do not evaluate how multiple risk factors relate to one another or how they jointly affect study outcomes. Each individual risk factor, such as a single dietary component or genetic polymorphism, occurs in a broader biological or societal context that may modulate its effect on cancer. Many risk factors for cancer are also highly correlated, with possible interactive, additive, synergistic, or attenuating effects. Additionally, many risk factors are dynamic and time-varying: changes over the life course and the timing of exposure may modify cancer risk [1, 2].
Several groups have advocated for a more holistic or comprehensive analytic approach to the study of disease in populations [3–10]. This type of approach may lead to a better understanding of the mechanisms of disease and has been described by investigators using terms such as eco-epidemiology, populomics, globolomics, systems medicine, and systems epidemiology [13, 15–18]. The term “systems epidemiology” conceptually borrows from fields such as systems biology, considering epidemiology research in a systems framework. For the purposes of this paper, we define systems epidemiology as an approach to study risk and outcomes that incorporates high-dimensional measurements from multiple domains, assesses the inter-relationships between risk factors, and considers changes over time. Our definition was adapted from Dammann et al. (2014); however, we emphasize the importance of dynamism within a systems approach. Systems epidemiology research may leverage advanced computational simulation and modeling techniques to assess these complex networks and perform comprehensive analyses. Importantly, systems approaches are not limited to any single analytical method, but constitute a framework to account for complexity and understand the broader context of disease [7, 17, 19, 20].
Due to the complexity of cancer, a systems epidemiology approach may complement more traditional methods and lead to insights into disease etiology. To facilitate discussion about the application of systems modeling approaches for cancer epidemiology, NCI sponsored a workshop in February 2019 (https://epi.grants.cancer.gov/events/systems-epidemiology/) with presentations and discussions by experts in diverse fields to gain broader perspectives (S1 Table). In this manuscript, we summarize eight major themes from the workshop that will facilitate systems epidemiology research (Table 1) and discuss opportunities for this approach as exemplified by the accompanying papers in this PLOS Collection, “Cancer System Epidemiology Insights and Future Opportunities”.
Themes to facilitate systems epidemiology research
1. Transdisciplinary collaboration and a problem-based approach
To more holistically study cancer, collaboration across disciplines is required. Traditionally, there has been a tendency when studying complex diseases for researchers to focus on data from individual disciplines. Focusing on a problem-based approach could bridge scientists across disciplines and integrate unique perspectives to improve understanding [21, 22]. Specifically, systems epidemiology would benefit from building linkages between disease content experts and computational modelers and informaticists who can build informed computational models.
There are several examples of how transdisciplinary collaborations and a problem-based approach can lead to scientific insights. In one transdisciplinary collaboration, researchers developed improved methods of differentiating between benign and aggressive cancer lesions. Transdisciplinary collaboration also allows for methods to be developed and shared across fields. For example, dynamic agent-based modeling was developed for infectious disease modeling but now is commonly used in other complex disease analyses in public health [2, 24].
Several mechanisms, including previous NIH funding initiatives [21, 25–30], have encouraged transdisciplinary collaborations and problem-based approaches, which may serve as models to support this type of work. Another opportunity to foster transdisciplinary collaborations is to bring researchers together prior to applying for research funding. This type of process was used by the Cancer Research UK-NCI “Sandpit” workshop and National Science Foundation ideas labs.
Challenges to be considered in developing transdisciplinary collaborations include sustainability and lack of a shared language among different scientific fields. Individuals with familiarity or training in multiple disciplines can serve as “translators” or “connectors” and facilitate interactions between distinct fields.
2. Methods and modeling considerations
Two complementary strategies were discussed to support systems epidemiology research: hypothesis-driven and data-driven strategies. In a hypothesis-driven strategy, researchers focus on the data necessary to address a specified scientific hypothesis and analyze the data to test that hypothesis. In a data-driven strategy, the most likely hypothesis is identified based on a more agnostic, algorithm-based data exploration of several hypotheses. If the goal is to understand the overall mechanism, a hypothesis-driven strategy may be preferable. At the same time, given that a hypothesis-driven strategy is limited by current knowledge and assumptions, a data-driven strategy may gain new knowledge by finding unexpected relationships through a more agnostic approach.
The application of a systems approach to epidemiology questions should be considered an iterative process involving several steps, including identifying the problem, determining the model to test, obtaining the data, analyzing results, refining the model, and repeating as necessary based on the results. At times, these steps can occur concurrently. An example of how modeling was used to inform data collection was demonstrated by the Cancer Intervention and Surveillance Modeling Network (CISNET) breast consortium. Using simulation modeling, the CISNET teams examined the need for radiotherapy in women assessed as low risk based on genomic testing. These results were useful for informing the design of clinical trials by identifying those populations where data from the trial would be most informative.
Applying a systems approach to epidemiology research may be supported by several types of existing methods including systems dynamics, network analysis, agent-based modeling, and others [19, 24, 34–41] (S2 Table). Regardless of the specific analytical method, the unique aspect of a systems epidemiology approach is accounting for the complexity of the system by considering multiple domains, inter-relationships between risk factors, and dynamism. Needs for additional methods were noted, in particular for models incorporating time and space, dealing with the unknown contribution of chance, and bridging multiple scales (e.g., protein, cell, tissue, individual, neighborhood, community, ecosystem). It is critical for researchers to understand the underlying assumptions in methods and the strengths or weaknesses of particular models for different situations and questions. The variety of methods makes it challenging to interpret and compare results obtained using different approaches. Therefore, an important component of applying systems approaches is the validation and evaluation of models.
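To make the idea of an agent-based model concrete, the following is a minimal toy sketch and not a model discussed at the workshop. It assumes a hypothetical population of agents arranged on a ring, a binary health behavior, and invented parameters (`peer_effect`, `quit_prob`); it only illustrates how inter-relationships between individuals and change over time can be represented in code.

```python
import random


def simulate_behavior_spread(n_agents=200, n_steps=50, initial_prevalence=0.1,
                             peer_effect=0.3, quit_prob=0.05, seed=0):
    """Toy agent-based model: a binary behavior spreads between ring neighbors.

    All parameter names and values are illustrative assumptions.
    Returns the behavior prevalence at each time step (including step 0).
    """
    rng = random.Random(seed)
    # Each agent either has the behavior (True) or does not (False).
    state = [rng.random() < initial_prevalence for _ in range(n_agents)]
    history = [sum(state) / n_agents]
    for _ in range(n_steps):
        new_state = []
        for i, has_behavior in enumerate(state):
            left = state[(i - 1) % n_agents]
            right = state[(i + 1) % n_agents]
            if has_behavior:
                # Agents with the behavior stop with a fixed probability.
                new_state.append(rng.random() >= quit_prob)
            else:
                # Adoption probability grows with the share of neighbors who have it.
                neighbor_share = (left + right) / 2
                new_state.append(rng.random() < peer_effect * neighbor_share)
        state = new_state
        history.append(sum(state) / n_agents)
    return history


if __name__ == "__main__":
    prevalence = simulate_behavior_spread()
    print("start:", prevalence[0], "end:", prevalence[-1])
```

In a real application the ring would be replaced by an empirically informed contact or neighborhood structure, and the transition rules would be calibrated against observed data.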
3. Interpretation, validation, and evaluation of models
As the complexity of a model increases with additional variables, the sparsity of data increases, thereby reducing the ability to make predictions or classifications. The large number of attributes in a model can also result in overfitting, which leads to biases in a model and makes it difficult to generalize or apply the model to another population. These issues with complex modeling are often referred to as the curse of dimensionality problem. Furthermore, deep epidemiology data, including repeated measurements over time and assessments of multiple domains, is usually only available on smaller populations, limiting generalizability to other populations. Therefore, to adequately interpret the data and assess causality using these models, an iterative process is needed that includes well-designed validation and evaluation steps which will lead to model refinement and attenuation of these issues.
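The sparsity point above can be illustrated with a short simulation. This is a sketch under stated assumptions: a hypothetical study of 1,000 subjects, each variable categorized into three levels, and values drawn uniformly at random. The function name `occupied_fraction` is introduced here for illustration only.

```python
import random


def occupied_fraction(n_subjects, n_variables, n_levels=3, seed=0):
    """Fraction of all possible covariate patterns observed at least once.

    Subjects are simulated with independent, uniformly distributed
    categorical variables (an illustrative assumption).
    """
    rng = random.Random(seed)
    observed = set()
    for _ in range(n_subjects):
        pattern = tuple(rng.randrange(n_levels) for _ in range(n_variables))
        observed.add(pattern)
    # There are n_levels ** n_variables possible patterns in total.
    return len(observed) / n_levels ** n_variables


if __name__ == "__main__":
    for n_variables in (1, 2, 4, 6, 8, 10):
        share = occupied_fraction(1000, n_variables)
        print(f"{n_variables:2d} variables: {share:.4f} of patterns observed")
```

With the sample size held fixed, the number of possible patterns grows exponentially with the number of variables, so most combinations of risk factors are never observed and any estimate for them rests on model assumptions rather than data.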
Model validation is the process of checking whether all technical aspects (e.g., parameter definitions, coding, etc.) are done adequately or need refining, and it is preferably performed by an independent party. Model evaluation is the process of assessing the performance and reproducibility of a complete model to gauge how it is likely to perform in real-world conditions (e.g., training and testing, cross-validation, etc.). To implement the validation and evaluation of their models, CISNET uses a comparative modeling approach where multiple research groups examine the same research questions using different models and identical predictors, and evaluate the results against real data trends. The consistency across models provides support for model predictions. When comparing results from complex models, care must be taken in the selection of the evaluation metric or the control used for comparison purposes. The dataset selected as a control could be biased in favor of the model being evaluated based on assumptions inherent in the model and the control dataset. Evaluating these complex models using common control datasets could reduce this potential issue.
Given the complexity of methods for systems approaches, it may be challenging to reproduce all aspects of the analysis. One possible solution suggested at the workshop was to develop a reproducibility pipeline, or clear documentation for other researchers to apply the models on other populations. Notably, lack of reproducibility may also be due to intrinsic differences in the studied populations (e.g., by racial/ethnic or exposure distributions). Fortunately, there is guidance for validating and evaluating complex models [43, 47]. Further emphasis on the best practices for application of systems models to epidemiology research may help advance the use of these models in epidemiology studies.
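As one concrete instance of the evaluation step, the sketch below shows k-fold cross-validation for a simple one-predictor linear model. It is a generic illustration and not the procedure used by CISNET or any group named here; the helper names `fit_line` and `k_fold_mse` and the simulated data are assumptions made for the example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_mse(xs, ys, k=5, seed=0):
    """Mean squared prediction error averaged over k held-out folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Fit only on the training folds, then score on the held-out fold.
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        errors = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in held_out]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


if __name__ == "__main__":
    rng = random.Random(1)
    xs = [float(i) for i in range(40)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 3.0) for x in xs]
    print("5-fold cross-validated MSE:", k_fold_mse(xs, ys, k=5))
```

Because each observation is predicted by a model that never saw it during fitting, the cross-validated error is a less optimistic estimate of real-world performance than the error on the training data.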
4. Data needs and opportunities
Sufficient data (real and simulated) is required to effectively characterize a system. Though workshop participants identified several potential data resources which could be used to support systems epidemiology research (S3 Table), gaps remain. One critical gap is the lack of inclusion of understudied groups, including racial and ethnic, socioeconomic, and geographic diversity and sexual/gender minorities [48, 49]. Insufficient racial and ethnic diversity is also apparent in genomics research [50, 51] and genomic catalogues [52, 53].
Other needs and opportunities identified by participants were for quality information about health behaviors, the built environment, and health care provider data. Systems epidemiology research could be enhanced by improving access to or utilization of data sources such as wearable devices (e.g., Fitbits), electronic health records, and large initiatives such as the All of Us cohort and the Environmental Influences on Child Health Outcomes (ECHO) Program. As the understanding of a system develops and new hypotheses emerge, data needs may change. Collecting broad and multiple data types may enable the examination of multiple hypotheses without going back to data collection, which is particularly challenging in population-based studies. Such a strategy was used by the Community of Mine study. Moreover, biobanks linked to medical record data provide another potential resource for systems epidemiology research and could be leveraged to estimate risk factor or biomarker distributions in a target population missing that information [59, 60].
Characterizing the system requires combining or integrating several sources of data, such as measures from different domains (e.g., genetic and behavioral) or from different spatial (e.g., cell to tissue) and temporal (e.g., day vs. year) scales. Often data is formatted uniquely, stored with different levels of metadata, or located in diverse databases. In fact, it was suggested that the resources required for integrating diverse, large-scale data types surpass the resources required for generating these data. For multi-omic data, several software frameworks have been developed to address some of these challenges, including Galaxy, Taverna, KNIME, and bioKepler. Additional work is needed in this area.
5. Sharing of data and models
Improved methods for data linkages and model sharing across disciplines can facilitate systems epidemiology research by enabling a) analyses incorporating information from multiple domains; b) validation and evaluation of models and results; and c) efficiency by avoiding duplication of efforts. Effective sharing and reuse of data and models requires adequate documentation (including metadata and descriptors) and mechanisms to assess quality and findability, which can be costly. The NIH has worked to provide additional funding support for data and model sharing [64–67]. Moreover, according to the NIH Genomic Data Sharing Policy (GDS), costs for sharing of data should be included in the project budget. Implementing carefully curated datasets or model resources and standardizing data quality indicators can increase confidence in methods and aid reproducibility and reusability. To address difficulty in finding the appropriate data set or method (i.e., findability), databases or resources that list and describe models are needed, such as the NCI Genetic Simulation Resources. Several platforms and infrastructures were discussed that support sharing data and/or analytical models (Table 2).
6. Enhanced training practices
Participants noted that encouraging a systems framework for epidemiology research will require improvements in training, including more opportunities focused on systems science such as the Systems Science for Social Impact program. Meeting participants suggested changes to the current epidemiology academic curriculum to incorporate systems training with more emphasis on complexity, transdisciplinary research, computational modeling, and informatics throughout the training continuum for epidemiologists. The current academic infrastructure is designed to develop scientists that are experts in specific fields/disciplines [22, 71], whereas systems approaches to epidemiology research require breadth of training across disciplines. Several training programs have supported this multidisciplinary model [72–74]. Another important training need is in the areas of data science, informatics, and computational modeling, particularly for population scientists. Training the next generation of data scientists and integrating these researchers into biomedical and public health fields is a priority within the NIH strategic plan for data science. Finally, continuing education programs for epidemiologists, along the lines of Continuing Medical Education (CME) course work for physicians, could also broaden use of systems methods. Topics suggested for this type of training included advanced modeling techniques and managing and interpreting uncertainty in models.
7. Dissemination of systems models
For systems epidemiology modeling to be useful for research and policy, models and results using these methods need to be disseminated and accepted.
Stakeholders (e.g., patients, providers, payers, policy makers) should be involved early in model development to inform parameters and priorities. Incorporating stakeholder feedback can improve model quality by better defining the system and increase stakeholders’ adoption of such models. Obtaining feedback on models as they are being developed through early publication may also lead to better models. However, journals may be reluctant to publish conceptual models in the absence of application results. A venue allowing for publication of early conceptual models could promote feedback (e.g., the preprint server bioRxiv https://www.biorxiv.org/).
Another key component of dissemination is effectively communicating models to researchers, clinicians, policymakers, and the general public. Making the results interpretable regardless of model complexity would build confidence in the model and results. Moreover, it is important to explain that uncertainty in the results remains even though systems models are sophisticated. Effective communication could be enhanced by encouraging media training for scientists.
8. Building a systems epidemiology community
Growth in the application of systems approaches to epidemiology research will require building a community of systems epidemiology researchers. Workshop participants noted that the current workshop was unique and expressed enthusiasm for bringing together researchers from disparate fields to address complex problems. In addition to periodic in-person meetings or workshops such as the one described by this paper, one strategy to build this type of community is to establish organizations, interest groups, or social platforms that can bring many different scientists together to share ideas and discuss and compare models, such as the Interdisciplinary Association for Population Health Science (IAPHS). Other opportunities to promote cross-fertilization of ideas include a systems epidemiology-focused journal or a journal collection such as this one, where researchers from different disciplines can publish papers in this arena.
Another strategy to build a systems epidemiology community is through tailored grant reviews and specialized funding opportunities. Some meeting participants suggested that the non-linear design and the multidisciplinarity underlying complex modeling and systems approaches do not easily fit into the traditional three-aim structure of R01 applications, making it more challenging for this type of research to compete for funding. Special funding opportunities that support more complex projects and are amenable to non-linear aims, feedback loops and iterative processes may be helpful for this field. Special review panels for systems epidemiology applications could include reviewers from different disciplines, with at least one reviewer with computational modeling expertise, assigned to review each application.
Opportunities for systems epidemiology research
In addition to the eight themes highlighted which would facilitate systems epidemiology, several research opportunities that may be addressed using a systems epidemiology approach were discussed by workshop participants (Table 3).
One research opportunity that received substantial attention was to use systems approaches to help understand and alleviate health disparities. Complex social, behavioral, environmental, biological, and ecological contributions to disparities vary by context, impact multiple scales, and involve nonlinear and multidirectional associations (or feedback loops). The systems nature of health disparities may explain their persistence across different diseases. A systems approach may thus provide valuable insights into the etiology of disparities to highlight sources of inequities, identify data needs, and improve interventions.
These research opportunities and the papers in this “Cancer System Epidemiology Insights and Future Opportunities” collection illustrate the promise of systems epidemiology approaches. However, a portfolio analysis by Shams-White et al. found that despite specific systems and computational modeling funding announcements, the representation of systems epidemiology grants in cancer research remains low. Together, the above examples and these results suggest that many cancer-related research questions addressable using a systems approach may benefit from tailored funding opportunities.
Conclusions and next steps
At the outset of the workshop, several participants expressed uncertainty about the definition of systems epidemiology. Nevertheless, there was overall agreement about the need for the general approach. Some participants suggested that it was important to emphasize the time element, or dynamism, within the definition as changes over time are critical to consider and are often missing in traditional studies. Others underlined the importance of data as the availability of high throughput data can help support more systems-based approaches.
To conclude, workshop participants supported a more comprehensive approach to population-based research studies, identified several themes for facilitating systems epidemiology research, and exemplified research opportunities. These themes included: transdisciplinary collaboration and a problem-based approach; methods and modeling considerations; interpretation, validation, and evaluation of models; data needs and opportunities; sharing of data and models; enhanced training practices; dissemination of systems models; and building a systems epidemiology community. As a first step to continue the conversation, several researchers participated in this collection of papers, outlining research opportunities and findings using systems epidemiology approaches. Our intent is that this collection will further spark discussion and foster continued research in this area.
S1 Table. Expertise of workshop participants.
S2 Table. Example methods applicable for systems epidemiology.
- 1. Colditz GA, Wei EK. Preventability of cancer: the relative contributions of biologic and social and physical environmental determinants of cancer mortality. Annu Rev Public Health. 2012;33:137–56. Epub 2012/01/10. pmid:22224878; PubMed Central PMCID: PMC3631776.
- 2. Hiatt RA, Porco TC, Liu F, Balke K, Balmain A, Barlow J, et al. A multilevel model of postmenopausal breast cancer incidence. Cancer Epidemiol Biomarkers Prev. 2014;23(10):2078–92. Epub 2014/07/16. pmid:25017248.
- 3. Burke TA, Cascio WE, Costa DL, Deener K, Fontaine TD, Fulk FA, et al. Rethinking Environmental Protection: Meeting the Challenges of a Changing World. Environ Health Perspect. 2017;125(3):A43–a9. Epub 2017/03/02. pmid:28248180; PubMed Central PMCID: PMC5332174.
- 4. Diez Roux AV. Complex systems thinking and current impasses in health disparities research. Am J Public Health. 2011;101(9):1627–34. Epub 2011/07/23. pmid:21778505; PubMed Central PMCID: PMC3154209.
- 5. Hu FB. Metabolic profiling of diabetes: from black-box epidemiology to systems epidemiology. Clin Chem. 2011;57(9):1224–6. Epub 2011/06/22. pmid:21690202.
- 6. Lee BY, Bartsch SM, Mui Y, Haidari LA, Spiker ML, Gittelsohn J. A systems approach to obesity. Nutr Rev. 2017;75(suppl 1):94–106. Epub 2017/01/05. pmid:28049754; PubMed Central PMCID: PMC5207008.
- 7. Lich KH, Ginexi EM, Osgood ND, Mabry PL. A call to address complexity in prevention science research. Prev Sci. 2013;14(3):279–89. Epub 2012/09/18. pmid:22983746.
- 8. Mabry PL, Kaplan RM. Systems science: a good investment for the public’s health. Health Educ Behav. 2013;40(1 Suppl):9s–12s. Epub 2013/10/23. pmid:24084406.
- 9. Orr MG, Kaplan GA, Galea S. Neighbourhood food, physical activity, and educational environments and black/white disparities in obesity: a complex systems simulation analysis. J Epidemiol Community Health. 2016;70(9):862–7. Epub 2016/04/17. pmid:27083491.
- 10. Weed DL. Beyond black box epidemiology. Am J Public Health. 1998;88(1):12–4. Epub 1998/05/16. pmid:9584017; PubMed Central PMCID: PMC1508377.
- 11. Susser M, Susser E. Choosing a future for epidemiology: I. Eras and paradigms. Am J Public Health. 1996;86(5):668–73. Epub 1996/05/01. pmid:8629717; PubMed Central PMCID: PMC1380474.
- 12. Gibbons MC. Populomics. Stud Health Technol Inform. 2008;137:265–8. Epub 2008/06/19. pmid:18560087.
- 13. Lund E, Dumeaux V. Systems epidemiology in cancer. Cancer Epidemiol Biomarkers Prev. 2008;17(11):2954–7. Epub 2008/11/08. pmid:18990736.
- 14. Nielsen J. Systems Biology of Metabolism: A Driver for Developing Personalized and Precision Medicine. Cell Metab. 2017;25(3):572–9. Epub 2017/03/09. pmid:28273479.
- 15. Cerdá M, Keyes KM. Systems Modeling to Advance the Promise of Data Science in Epidemiology. Am J Epidemiol. 2019;188(5):862–5. Epub 2019/03/17. pmid:30877289; PubMed Central PMCID: PMC6494667.
- 16. Cornelis MC, Hu FB. Systems Epidemiology: A New Direction in Nutrition and Metabolic Disease Research. Curr Nutr Rep. 2013;2(4). Epub 2013/11/28. pmid:24278790; PubMed Central PMCID: PMC3837346.
- 17. Dammann O, Gray P, Gressens P, Wolkenhauer O, Leviton A. Systems Epidemiology: What’s in a Name? Online J Public Health Inform. 2014;6(3):e198. Epub 2015/01/20. pmid:25598870; PubMed Central PMCID: PMC4292535.
- 18. Haring R, Wallaschofski H. Diving through the "-omics": the case for deep phenotyping and systems epidemiology. Omics. 2012;16(5):231–4. Epub 2012/02/11. pmid:22320900; PubMed Central PMCID: PMC3339382.
- 19. Luke DA, Stamatakis KA. Systems science methods in public health: dynamics, networks, and agents. Annu Rev Public Health. 2012;33:357–76. Epub 2012/01/10. pmid:22224885; PubMed Central PMCID: PMC3644212.
- 20. McGuire S. Institute of Medicine. 2012. Accelerating progress in obesity prevention: solving the weight of the nation. Washington, DC: the National Academies Press. Adv Nutr. 2012;3(5):708–9. Epub 2012/09/18. pmid:22983849; PubMed Central PMCID: PMC3648752.
- 21. Patterson RE, Colditz GA, Hu FB, Schmitz KH, Ahima RS, Brownson RC, et al. The 2011–2016 Transdisciplinary Research on Energetics and Cancer (TREC) initiative: rationale and design. Cancer Causes Control. 2013;24(4):695–704. Epub 2013/02/05. pmid:23378138; PubMed Central PMCID: PMC3602225.
- 22. Pohl C, Hirsch Hadorn G. Principles for designing transdisciplinary research. Munich: Oekom Verlag; 2007.
- 23. Frankhauser DE, Jovanovic-Talisman T, Lai L, Yee LD, Wang LV, Mahabal A, et al. Spatiotemporal strategies to identify aggressive biology in precancerous breast biopsies. Wiley Interdiscip Rev Syst Biol Med. 2020:e1506. Epub 2020/10/02. pmid:33001587.
- 24. Tracy M, Cerdá M, Keyes KM. Agent-Based Modeling in Public Health: Current Applications and Future Directions. Annu Rev Public Health. 2018;39:77–94. Epub 2018/01/13. pmid:29328870; PubMed Central PMCID: PMC5937544.
- 25. Bian J, Xie M, Topaloglu U, Hudson T, Eswaran H, Hogan W. Social network analysis of biomedical research collaboration networks in a CTSA institution. J Biomed Inform. 2014;52:130–40. Epub 2014/02/25. pmid:24560679; PubMed Central PMCID: PMC4136998.
- 26. Bures RM, Mabry PL, Orleans CT, Esposito L. Systems science: a tool for understanding obesity. Am J Public Health. 2014;104(7):1156. Epub 2014/05/17. pmid:24832433; PubMed Central PMCID: PMC4056227.
- 27. Hammond RA. Complex systems modeling for obesity research. Prev Chronic Dis. 2009;6(3):A97. Epub 2009/06/17. pmid:19527598; PubMed Central PMCID: PMC2722404.
- 28. Llewellyn N, Carter DR, DiazGranados D, Pelfrey C, Rollins L, Nehl EJ. Scope, Influence, and Interdisciplinary Collaboration: The Publication Portfolio of the NIH Clinical and Translational Science Awards (CTSA) Program From 2006 Through 2017. Eval Health Prof. 2020;43(3):169–79. Epub 2019/03/29. pmid:30917690; PubMed Central PMCID: PMC7781230.
- 29. Mabry PL, Olster DH, Morgan GD, Abrams DB. Interdisciplinarity and systems science to improve population health: a view from the NIH Office of Behavioral and Social Sciences Research. Am J Prev Med. 2008;35(2 Suppl):S211–24. Epub 2008/08/23. pmid:18619402; PubMed Central PMCID: PMC2587290.
- 30. Schmitz KH, Gehlert S, Patterson RE, Colditz GA, Chavarro JE, Hu FB, et al. TREC to WHERE? Transdisciplinary Research on Energetics and Cancer. Clin Cancer Res. 2016;22(7):1565–71. Epub 2016/01/23. pmid:26792261; PubMed Central PMCID: PMC4956346.
- 31. National Cancer Institute. NCI-CRUK Sandpit Workshops 2019 [cited 2020 May 27]. Available from: https://cancercontrol.cancer.gov/brp/hbrb/sandpit.html.
- 32. Collins T, Kearney M, Maddison D. The Ideas Lab Concept, Assembling the Tree of Life, and AVAToL. PLoS Curr. 2013;5. Epub 2013/09/21. pmid:24045602; PubMed Central PMCID: PMC3770768.
- 33. Jayasekera J, Schechter CB, Sparano JA, Jagsi R, White J, Chapman JW, et al. Effects of Radiotherapy in Early-Stage, Low-Recurrence Risk, Hormone-Sensitive Breast Cancer. J Natl Cancer Inst. 2018;110(12):1370–9. Epub 2018/09/22. pmid:30239794; PubMed Central PMCID: PMC6292790.
- 34. Fallah-Fini S, Adam A, Cheskin LJ, Bartsch SM, Lee BY. The Additional Costs and Health Effects of a Patient Having Overweight or Obesity: A Computational Model. Obesity (Silver Spring). 2017;25(10):1809–15. Epub 2017/09/28. pmid:28948718; PubMed Central PMCID: PMC5679120.
- 35. Jones AP, Homer JB, Murphy DL, Essien JD, Milstein B, Seville DA. Understanding diabetes population dynamics through simulation modeling and experimentation. Am J Public Health. 2006;96(3):488–94. Epub 2006/02/02. pmid:16449587; PubMed Central PMCID: PMC1470507.
- 36. Langellier BA, Bilal U, Montes F, Meisel JD, Cardoso LO, Hammond RA. Complex Systems Approaches to Diet: A Systematic Review. Am J Prev Med. 2019;57(2):273–81. Epub 2019/07/22. pmid:31326011; PubMed Central PMCID: PMC6650152.
- 37. Lee BY, Adam A, Zenkov E, Hertenstein D, Ferguson MC, Wang PI, et al. Modeling The Economic And Health Impact Of Increasing Children’s Physical Activity In The United States. Health Aff (Millwood). 2017;36(5):902–8. Epub 2017/05/04. pmid:28461358; PubMed Central PMCID: PMC5563819.
- 38. Lee BY, Ferguson MC, Hertenstein DL, Adam A, Zenkov E, Wang PI, et al. Simulating the Impact of Sugar-Sweetened Beverage Warning Labels in Three Cities. Am J Prev Med. 2018;54(2):197–204. Epub 2017/12/19. pmid:29249555; PubMed Central PMCID: PMC5783749.
- 39. Li Y, Lawley MA, Siscovick DS, Zhang D, Pagán JA. Agent-Based Modeling of Chronic Diseases: A Narrative Review and Future Research Directions. Prev Chronic Dis. 2016;13:E69. Epub 2016/05/30. pmid:27236380; PubMed Central PMCID: PMC4885681.
- 40. Powell-Wiley TM, Wong MS, Adu-Brimpong J, Brown ST, Hertenstein DL, Zenkov E, et al. Simulating the Impact of Crime on African American Women’s Physical Activity and Obesity. Obesity (Silver Spring). 2017;25(12):2149–55. Epub 2017/11/01. pmid:29086471; PubMed Central PMCID: PMC5705259.
- 41. Price ND, Magis AT, Earls JC, Glusman G, Levy R, Lausted C, et al. A wellness study of 108 individuals using personal, dense, dynamic data clouds. Nat Biotechnol. 2017;35(8):747–56. Epub 2017/07/18. pmid:28714965; PubMed Central PMCID: PMC5568837.
- 42. Altman N, Krzywinski M. The curse(s) of dimensionality. Nat Methods. 2018;15(6):399–400. Epub 2018/06/02. pmid:29855577.
- 43. Eddy DM, Hollingworth W, Caro JJ, Tsevat J, McDonald KM, Wong JB. Model transparency and validation: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force—7. Value Health. 2012;15(6):843–50. Epub 2012/09/25. pmid:22999134.
- 44. Khurana Y. Medium [Internet]. 2019. Available from: https://medium.com/yogesh-khuranas-blogs/difference-between-model-validation-and-model-evaluation-1a931d908240.
- 45. Alagoz O, Berry DA, de Koning HJ, Feuer EJ, Lee SJ, Plevritis SK, et al. Introduction to the Cancer Intervention and Surveillance Modeling Network (CISNET) Breast Cancer Models. Med Decis Making. 2018;38(1_suppl):3s–8s. Epub 2018/03/20. pmid:29554472; PubMed Central PMCID: PMC5862043.
- 46. Mechanic LE, Chen HS, Amos CI, Chatterjee N, Cox NJ, Divi RL, et al. Next generation analytic tools for large scale genetic epidemiology studies of complex diseases. Genet Epidemiol. 2012;36(1):22–35. Epub 2011/12/08. pmid:22147673; PubMed Central PMCID: PMC3368075.
- 47. 2020 International Conference on Social Computing, Behavioral-Cultural Modeling, & Prediction and Behavior Representation in Modeling and Simulation [Internet]. 2020 [cited 2020 5/6/2020]. Available from: http://sbp-brims.org/2020/cfp/.
- 48. Martin DN, Lam TK, Brignole K, Ashing KT, Blot WJ, Burhansstipanov L, et al. Recommendations for Cancer Epidemiologic Research in Understudied Populations and Implications for Future Needs. Cancer Epidemiol Biomarkers Prev. 2016;25(4):573–80. Epub 2016/05/20. pmid:27196089; PubMed Central PMCID: PMC4874661.
- 49. Swerdlow AJ, Harvey CE, Milne RL, Pottinger CA, Vachon CM, Wilkens LR, et al. The National Cancer Institute Cohort Consortium: An International Pooling Collaboration of 58 Cohorts from 20 Countries. Cancer Epidemiol Biomarkers Prev. 2018;27(11):1307–19. Epub 2018/07/19. pmid:30018149.
- 50. Bustamante CD, Burchard EG, De la Vega FM. Genomics for the world. Nature. 2011;475(7355):163–5. Epub 2011/07/15. pmid:21753830; PubMed Central PMCID: PMC3708540.
- 51. Popejoy AB, Fullerton SM. Genomics is failing on diversity. Nature. 2016;538(7624):161–4. Epub 2016/10/14. pmid:27734877; PubMed Central PMCID: PMC5089703.
- 52. Iorio A, De Angelis F, Di Girolamo M, Luigetti M, Pradotto LG, Mazzeo A, et al. Population diversity of the genetically determined TTR expression in human tissues and its implications in TTR amyloidosis. BMC Genomics. 2017;18(1):254. Epub 2017/03/25. pmid:28335735; PubMed Central PMCID: PMC5364715.
- 53. Spratt DE, Chan T, Waldron L, Speers C, Feng FY, Ogunwobi OO, et al. Racial/Ethnic Disparities in Genomic Sequencing. JAMA Oncol. 2016;2(8):1070–4. Epub 2016/07/02. pmid:27366979; PubMed Central PMCID: PMC5123755.
- 54. Pendergrass SA, Crawford DC. Using Electronic Health Records To Generate Phenotypes For Research. Curr Protoc Hum Genet. 2019;100(1):e80. Epub 2018/12/06. pmid:30516347; PubMed Central PMCID: PMC6318047.
- 55. Denny JC, Rutter JL, Goldstein DB, Philippakis A, Smoller JW, Jenkins G, et al. The "All of Us" Research Program. N Engl J Med. 2019;381(7):668–76. Epub 2019/08/15. pmid:31412182.
- 56. Environmental Influences on Child Health Outcomes (ECHO) Program 2020 [cited 2020 3/16/2020]. Available from: https://www.nih.gov/research-training/environmental-influences-child-health-outcomes-echo-program.
- 57. Jankowska MM, Sears DD, Natarajan L, Martinez E, Anderson CAM, Sallis JF, et al. Protocol for a cross sectional study of cancer risk, environmental exposures and lifestyle behaviors in a diverse community sample: the Community of Mine study. BMC Public Health. 2019;19(1):186. Epub 2019/02/15. pmid:30760246; PubMed Central PMCID: PMC6375220.
- 58. Wang K, Gaitsch H, Poon H, Cox NJ, Rzhetsky A. Classification of common human diseases derived from shared genetic and environmental determinants. Nat Genet. 2017;49(9):1319–25. Epub 2017/08/08. pmid:28783162; PubMed Central PMCID: PMC5577363.
- 59. Franks PW, McCarthy MI. Exposing the exposures responsible for type 2 diabetes and obesity. Science. 2016;354(6308):69–73. Epub 2016/11/16. pmid:27846494.
- 60. Sanna S, van Zuydam NR, Mahajan A, Kurilshikov A, Vich Vila A, Võsa U, et al. Causal relationships among the gut microbiome, short-chain fatty acids and metabolic diseases. Nat Genet. 2019;51(4):600–5. Epub 2019/02/20. pmid:30778224; PubMed Central PMCID: PMC6441384.
- 61. Palsson B, Zengler K. The challenges of integrating multi-omic data sets. Nat Chem Biol. 2010;6(11):787–9. Epub 2010/10/27. pmid:20976870.
- 62. Boekel J, Chilton JM, Cooke IR, Horvatovich PL, Jagtap PD, Käll L, et al. Multi-omic data analysis using Galaxy. Nat Biotechnol. 2015;33(2):137–9. Epub 2015/02/07. pmid:25658277.
- 63. Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016;3:160018. Epub 2016/03/16. pmid:26978244; PubMed Central PMCID: PMC4792175.
- 64. Department of Health and Human Services. Research Supplements to Promote Data Sharing in Cancer Epidemiology Studies (Admin Supp Clinical Trial Not Allowed) 2018. Available from: https://grants.nih.gov/grants/guide/pa-files/PA-18-748.html.
- 65. Department of Health and Human Services. Notice of Special Interest (NOSI): Administrative Supplements to Support Enhancement of Software Tools for Open Science 2020 [cited 2020 5/7/2020]. Available from: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-20-073.html.
- 66. Department of Health and Human Services. Biomedical Data Repository (U24—Clinical Trials Not Allowed) 2020 [cited 2020 7/8/2020]. Available from: https://grants.nih.gov/grants/guide/pa-files/PAR-20-089.html.
- 67. Department of Health and Human Services. Biomedical Knowledgebase (U24—Clinical Trials Not Allowed) 2020 [cited 2020 7/8/2020]. Available from: https://grants.nih.gov/grants/guide/pa-files/PAR-20-097.html.
- 68. Paltoo DN, Rodriguez LL, Feolo M, Gillanders E, Ramos EM, Rutter JL, et al. Data use under the NIH GWAS data sharing policy and future directions. Nat Genet. 2014;46(9):934–8. Epub 2014/08/28. pmid:25162809; PubMed Central PMCID: PMC4182942.
- 69. Peng B, Leong MC, Chen HS, Rotunno M, Brignole KR, Clarke J, et al. Genetic Simulation Resources and the GSR Certification Program. Bioinformatics. 2019;35(4):709–10. Epub 2018/08/14. pmid:30101297; PubMed Central PMCID: PMC6378936.
- 70. Washington University in St. Louis. Systems Science for Social Impact: Summer Training Institute 2021 [cited 2021 5/31/2021]. Available from: https://systemsscienceforsocialimpact.wustl.edu/.
- 71. Golembiewski EH, Holmes AM, Jackson JR, Brown-Podgorski BL, Menachemi N. Interdisciplinary Dissertation Research Among Public Health Doctoral Trainees, 2003–2015. Public Health Rep. 2018;133(2):182–90. Epub 2018/02/14. pmid:29438623; PubMed Central PMCID: PMC5871143.
- 72. James AS, Gehlert S, Bowen DJ, Colditz GA. A Framework for Training Transdisciplinary Scholars in Cancer Prevention and Control. J Cancer Educ. 2015;30(4):664–9. Epub 2014/12/17. pmid:25510368; PubMed Central PMCID: PMC4469633.
- 73. Krause K. 2019. [cited 2020]. Available from: https://asunow.asu.edu/20191010-bold-restructuring-asu-college-health-solutions-results-growth-and-innovation.
- 74. Realmuto L, Daniel S, Jasani F, Weiss L, Bachrach C. Developing population health scientists: Findings from an evaluation of the Robert Wood Johnson Foundation Health & Society Scholars Program. SSM Popul Health. 2019;7:100373. Epub 2019/02/28. pmid:30809585; PubMed Central PMCID: PMC6374691.
- 75. National Institutes of Health. NIH Strategic Plan for Data Science. 2018.
- 76. Bensyl DM, King ME, Greiner A. Applied Epidemiology Training Needs for the Modern Epidemiologist. Am J Epidemiol. 2019;188(5):830–5. Epub 2019/03/17. pmid:30877297; PubMed Central PMCID: PMC6608580.
- 77. Ching T, Himmelstein DS, Beaulieu-Jones BK, Kalinin AA, Do BT, Way GP, et al. Opportunities and obstacles for deep learning in biology and medicine. J R Soc Interface. 2018;15(141). Epub 2018/04/06. pmid:29618526; PubMed Central PMCID: PMC5938574.
- 78. Chatterjee N. Towards Data Science [Internet]. 2020 [cited 2020]. Available from: https://towardsdatascience.com/transparency-reproducibility-and-validity-of-covid-19-projection-models-78592e029f28.
- 79. Interdisciplinary Association for Population Health Science (IAPHS). Available from: https://iaphs.org/about-iaphs/.
- 80. Shams-White MM, Barajas R, Jensen RE, Rotunno M, Dueck H, Ginexi EM, et al. Systems epidemiology and cancer: A review of the National Institutes of Health extramural grant portfolio 2013–2018. PLoS One. 2021;16(4):e0250061. Epub 2021/04/16. pmid:33857240; PubMed Central PMCID: PMC8049352.