Simulation modeling to assess performance of integrated healthcare systems: Literature review to characterize the field and visual aid to guide model selection

Background: The guiding principle of many health care reforms is to overcome fragmentation of service delivery and work towards integrated healthcare systems. Even though the value of integration is well recognized, capturing its drivers and its impact as part of health system performance assessment is challenging. The main reason is that current assessment tools do not sufficiently capture the complexity of integrated systems, resulting in poor impact estimations of the actions taken towards the ‘Triple Aim’. We describe the unique nature of simulation modeling to consider key health reform aspects: system complexity, optimization of actions, and long-term assessments.

Research question: How can the use and uptake of simulation models be characterized in the field of performance assessment of integrated healthcare systems?

Methods: A systematic search covering 2000 to 2018 was conducted in 5 academic databases (ACM D. Library, CINAHL, IEEE Xplore, PubMed, Web of Science), complemented with grey literature from Google Scholar. Studies using simulation models with system thinking to assess system performance in topics relevant to integrated healthcare were selected for review.

Results: After screening 2274 articles, 30 were selected for analysis. Five modeling techniques were characterized, across four application areas in healthcare. Complexity was defined in nine aspects, embedded distinctively in each modeling technique. ‘What if?’ & ‘How to?’ scenarios were identified as methods for system optimization. The mean time frame for performance assessments was 18 years.

Conclusions: Simulation models can evaluate system performance emphasizing the complex relations between components, understanding the system’s adaptability to change in short or long-term assessments. These advantages position them as a useful tool for complementing performance assessment of integrated healthcare systems in their pursuit of the ‘Triple Aim’. Besides literacy in modeling techniques, accurate model selection is facilitated after identification and prioritization of the complexities that rule system performance. For this purpose, a tool for selecting the most appropriate simulation modeling techniques was developed.

interventions on the performance of the system over time. Combining expert opinion with observational and experimental results, SM provides a relatively inexpensive way to estimate individual and population-level effects of changes in the system's determinants of performance.
There is extensive literature reviewing simulation models in the healthcare sector. Salleh et al. [18] published an umbrella review of 37 reviews that together cover articles from 1950 to 2016 and explore the wide range of applications, software tools, and data sources used in healthcare simulation. Günal & Pidd [19] begin by narrating the historical progression of simulation modeling and its applications in healthcare, giving some idea of the long history of the field. Most recently (2021), Roy et al. [17] analyzed the healthcare simulation literature of the past decade, addressing issues at various levels of healthcare service delivery and categorizing the literature accordingly. Altogether, the literature provides a comprehensive characterization of simulation models in healthcare, including the areas and types of application where the discipline has been used, the techniques available, data sources, simulation software [18,20,21], type of outputs and level of insight, inputs and resources required [22], relative frequency of use and level of implementation [23], and the specific aspects of a care facility's operations where techniques are most common [17,24,25]. These topics are most commonly analyzed following a structure similar to the one best represented by Mielczarek et al. [25], who create a classification system of health care topic areas assessed with simulation methods. The objective is to investigate the usefulness of modeling techniques and their correlation with a corresponding health care application. While authors add innovations to this common structure, such as identifying research gaps that limit the uptake of the discipline [20,23,24,26] or exploring the link between interventions and key performance indicators (KPI) [26], complexities in the relationships between system components have been heavily underassessed.
Roy et al. [17] recognize the complexity of the health system and the ability of simulation modeling to address this complexity, but their review focuses on capturing the specific health issues addressed, the operations management concepts applied, the simulation methods used, and the major research gaps, a framework similar to the one by Mielczarek et al. [25]. Vanbrabant et al. [26] also acknowledge simulation as the technique most suitable to capture the randomness and complexity of patient flow through the emergency department, but their analysis is limited to providing insights into which interventions influence which KPI. Along the same lines, Laker [27] recognizes the usefulness of simulation models to integrate complexity and provides an excellent summary of the properties of four simulation techniques. However, it does not provide a common framework to characterize and contrast the complexities that can be represented in each technique. Complexity is also indirectly mentioned in the identified research gaps: both Vanbrabant et al. [26] and Yousefi et al. [20] point to the underuse of simulation models in multi-objective evaluations, and Brailsford et al. and Roy et al. [17,24] suggest that healthcare is an area of application for hybrid simulation due in part to increasing system complexity.
By overlooking complexity, the advantages of simulation modeling and the challenges of IHS performance assessment remain unmatched. Furthermore, simulation time frames and optimization capabilities, standard knowledge for simulation experts but not for healthcare managers [28], are also overlooked in reviews summarizing the use of SM in healthcare. As a result of this gap, simulation models are not systematically picked up by integrated healthcare managers to assess the performance of IHS. The issue was partially addressed in 2015 by the ISPOR task force [14,29], which published a series of papers describing how three of the most common simulation modeling techniques can be used to evaluate complex health systems, and provided descriptions and tools to implement them accurately. However, at the time there was no common understanding of the drivers affecting IHS performance; hence, a clear explanation and exemplification of how these particularly complex health systems could use simulation models to assess performance was not possible.
This literature review is intended to bring together the field of performance assessment of integrated healthcare systems and the discipline of simulation modeling. We contribute to the vast literature characterizing the use of simulation modeling in health system performance assessment by focusing specifically on the discipline's ability to implement a complex system perspective in topics relevant to IHS. Our research is directed at readers who seek to expand performance assessment tools while considering the enhanced complexity embedded in the integrated care approach. We conclude our analysis with the creation of a practical tool for selecting the most appropriate simulation modeling technique depending on the characteristics of the system to be modeled.

Search strategy
A comprehensive search was performed to find articles that allowed us to understand how simulation modeling techniques implement a complex system perspective in topics relevant to IHS. The systematic search was conducted in 5 academic databases (ACM Digital Library, CINAHL, IEEE Xplore, PubMed & Web of Science). Grey literature was searched for in Google Scholar and only considered if articles complied with all the criteria in the AACODS checklist for critically appraising grey literature [30]. Finally, papers were also added through snowballing. The search covered the period 01/01/2000-31/12/2018, as an increased interest in SM, supported by technological advances, has been documented after this starting date [18]. The review was registered in PROSPERO (Registration number: CRD42020149658).
A Boolean search code was developed with three scopes of terms. The first scope, "Technique", filters for simulation modeling techniques and combines 17 systematic search strategies extracted from the umbrella review by Salleh et al. [18] added to the list of simulation modeling techniques described in Jun et al. [22]. The second scope, "Integrated healthcare systems topics of interest" is defined by 76 search terms, extracted from the indicator types and domains stated in the framework for performance assessment of IHS developed by the Expert Group of the European Commission [3,6,7], the systematic review of methods for IHS performance assessment by Strandberg-Larsen et al. [31] and the "Care Coordination Measures Atlas" by McDonald et al. [32]. Finally, the third scope refers to the healthcare sector. Terms in the first scope ("Techniques") were restricted to the title, and terms in the other scopes were restricted to title/abstract. The complete list of terms can be found in S1 Table.
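The three-scope structure described above can be sketched in a few lines. This is only an illustration: the terms below are hypothetical placeholders (the actual lists are in S1 Table), the PubMed-style field tags `[ti]` and `[tiab]` are an assumption about one of the five databases, and the helper names `or_block` and `build_query` are invented for this example.

```python
# Hypothetical placeholder terms; the actual lists are given in S1 Table.
technique_terms = ["system dynamics", "discrete event simulation", "agent-based model"]
topic_terms = ["care coordination", "continuity of care"]
sector_terms = ["healthcare", "health system"]


def or_block(terms, field):
    """Join the terms of one scope with OR, restricting each to a field tag."""
    return "(" + " OR ".join(f'"{t}"[{field}]' for t in terms) + ")"


def build_query(techniques, topics, sector):
    """Combine the three scopes with AND; only the first scope is title-only."""
    return " AND ".join([
        or_block(techniques, "ti"),    # scope 1: restricted to the title
        or_block(topics, "tiab"),      # scope 2: title/abstract
        or_block(sector, "tiab"),      # scope 3: title/abstract
    ])


query = build_query(technique_terms, topic_terms, sector_terms)
print(query)
```

Other databases use a different field syntax, so in practice the same three lists would be rendered once per database.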

Selection criteria
Inclusion criteria. Only health system evaluations taking a complex system perspective were considered. Furthermore, we only included articles that used a simulation model listed as an SM technique in Salleh et al. [18] or in Jun et al. [22], or that self-identified as such. Finally, articles had to address the performance assessment of an IHS topic-of-interest. The lists of SM techniques and IHS topics-of-interest can be found in S2 Table.

Exclusion criteria. We excluded studies that described non-computer-based simulation models. We also excluded studies that were not calibrated and validated against data from a real situation. Finally, we excluded from the data extraction and analysis studies whose reporting standards were insufficient to replicate the assessment or did not allow reviewers to fully understand the implementation of systems thinking. To comply with the latter criterion, only studies graded 'A' in quality assessment were selected.

Quality assessment
Two independent reviewers (SW & NL) assessed the quality of papers during the screening process. Using the quality assessment tool developed by Fone et al. [33] to appraise simulation modeling studies, reviewers gave a score of 0, 1, or 2 on each of ten criteria and created four quality groups (A to D). The quality assessment was followed by an assessment of the credibility and relevance of the articles for the purpose of this review, which helped reviewers select articles for review. Given the focus of the review, an assessment of the risk of bias in the studies' results was not considered.

Data extraction and analysis
Data extraction was performed by the main author (NL), based on the template used by Brailsford et al. in their analysis of simulation and modeling techniques for healthcare [23]. The final extraction sheet was modified to focus on two main topics: first, to characterize the different modeling techniques, their area of application, and key features for implementation, together with data requirements and outcomes; second, to characterize the complex aspects of the health system that each technique can represent. The detailed data extraction sheet can be found in S3 Table. The analysis was conducted in two phases. First, using a 'Deductive a priori template approach' [34], articles were classified and characterized according to previous assessments of SM made by Jun et al. [22], Salleh et al. [18], and Rueckel et al. [21]. Subsequently, in a 'Data-driven inductive approach' [35], simulation modeling techniques were re-characterized in five items following the objectives of this review. Item (1.) presents the IHS topics-of-interest where SM has been successfully applied. The item aims to inform and exemplify in which situations of interest to IHS the discipline can be useful, following a structure similar to that of previous literature. Even though the selection of articles is primarily intended to understand the ability of simulation models to address the shortcomings of IHS performance assessment, and not to identify the link between simulation technique and healthcare area, an analysis similar to that of previous literature allows us to validate our findings against the conclusions of other authors. In item (2.) we supported the analysis of the reviewed papers with further literature and present an introductory description of the identified simulation techniques, explaining how they are applied in the topics of interest to IHS.
The last three items were selected to explore how simulation models deal with challenges that are particularly harmful to integrated care and are not yet mirrored in traditional assessment tools [8]. Item (3.) presents the modeled complexities in relationships between system components, summarized per modeling technique. The item allows us to understand the capacity of each technique to correctly model causality paths and co-existing effects, essential concerns for several integrated care interest areas [8,10]. Item (4.) presents the identified optimization capabilities, an essential function of any tool guiding healthcare towards the Triple Aim [3,10,11,36]. Preventive medicine and overall population health improvements are known to show effects only several years after intervention [7], and as they comprise an essential part of IHS, Item (5.) presents the time frame of the selected papers to understand the capacity for long-term assessments.

Results
The search resulted in 2271 unique articles. Screenings at title/abstract and full text were made by two separate reviewers (SW & NL) and resulted in seventy-six articles selected for quality assessment. Out of these, thirty studies were included for data extraction and analysis because of their reporting quality and detailed description of system thinking. Fig 1 presents the PRISMA diagram of the selection process. Selected papers are described in Table 1.

IHS topics-of-interest
Eleven IHS topics-of-interest were identified and classified in four areas of assessment. The first area of assessment covers simulation models of Policy and Strategy. This comprises studies that use simulation modeling for evaluating health policies and interventions directed to change or improve the structure, assess incentives, goals, or values in the overall system; such as (1.) pay for performance incentive scheme or (2.) national health reform evaluation. The second area of assessment covers Chronic Disease Management. Studies in this area evaluated the effectiveness of interventions or the evolution of chronic conditions, such as (3.) evaluating care management and interventions of chronic conditions and (4.) diabetes population dynamics. The third area of assessment addresses Lifestyle Interventions, including evaluation of interventions directed at lifestyle behavior, health risks, and social determinants of health, such as (5.) tobacco harm policies, market control, and interventions or (6.) evaluation of public health interventions. The last area of assessment addresses Health Resource Management and comprises studies that use SM for resource management or system design to optimize healthcare service flow or forecast demands. In this area topics were (7.) performance evaluation of community health, (8.) performance measures evaluation in outpatient center, (9.) health facility operations simulation, (10.) planning health force, and (11.) evaluation of information systems.

Event Simulations (DES), and two Agent-Based Models (ABM). Finally, one paper combined three techniques, adding a sixth, Hybrid models (HM).
Supported by complementary literature, we describe each technique's features and use the selected papers to exemplify their implementation in integrated care. Table 2 summarizes the simulation techniques in terms of strengths, limitations, and estimation considerations and provides references for complementary literature.

Markov models. Markov models are state transition models. They have clearly defined, mutually exclusive states, and transitions between states are defined as quantities per cycle. States cannot happen simultaneously for the same agent, and transitions from one state to another depend only on the current state (Markovian property). Time can be continuous or discrete, but both Markov papers in this review use discrete time. Markov models can define transition probabilities differently for each time step, allowing the inclusion of trend factors. Together with 'tunnel states' (states in which an entity cannot remain beyond the current cycle), this enables time-dependent dynamics and a partial influence of historical events.

Laurence et al. [53] explore the complexity of state transitions by constructing a model comprising four separate parts (demand, supply, productivity, and training) of the system determining the health force gap, a common topic in integrated care initiatives. The demand and training parts of the model define partial outcomes dependent on several variables. These outcomes are then used in a second stage for the supply and productivity parts of the model, resulting in further partial outcomes. The third stage studies the main outcome (workforce gap) influenced by the outcomes of the previous stages. The structure enables the inclusion of mediated relationships between the initial variables, their interaction with partial outcomes, and the main outcome.
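The mechanics of a discrete-time Markov cohort model with time-varying transition probabilities and a tunnel state can be sketched as follows. This is a minimal illustration, not the model of any reviewed paper: the four states, the 2% per-cycle trend, and all probabilities are hypothetical values chosen for the example.

```python
STATES = ["well", "acute", "chronic", "dead"]  # "acute" acts as a tunnel state


def transition_matrix(cycle):
    """Row-stochastic matrix for one cycle (all values hypothetical).

    The onset probability grows 2% per cycle, which illustrates how a
    trend factor makes the transition probabilities time-dependent.
    """
    p_onset = min(0.05 * 1.02 ** cycle, 0.99)
    return [
        # well               acute    chronic  dead
        [1 - p_onset - 0.01, p_onset, 0.0,     0.01],  # from well
        [0.0,                0.0,     0.90,    0.10],  # from acute: cannot stay
        [0.0,                0.0,     0.95,    0.05],  # from chronic
        [0.0,                0.0,     0.0,     1.0],   # dead is absorbing
    ]


def run_cohort(initial, n_cycles):
    """Propagate the cohort shares through n_cycles discrete cycles."""
    dist = list(initial)
    trace = [dist]
    for cycle in range(n_cycles):
        p = transition_matrix(cycle)
        # The next distribution depends only on the current one (Markov property).
        dist = [sum(dist[i] * p[i][j] for i in range(len(dist)))
                for j in range(len(dist))]
        trace.append(dist)
    return trace


trace = run_cohort([1.0, 0.0, 0.0, 0.0], n_cycles=20)
for name, share in zip(STATES, trace[-1]):
    print(f"{name}: {share:.3f}")
```

Because the row for the tunnel state has a zero on its diagonal, everyone in "acute" at one cycle must have entered it in the immediately preceding cycle, which is how a memoryless model can still encode "time since event".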
The SimSmoke simulation, presented in Levy et al. [56], was developed in the early 2000s to estimate the smoking population and the effects of possible lifestyle interventions. The model distinguishes a population by age and gender evolving through birth and death rates. The population is further divided into never, current, and former smokers. By differentiating models for different strata of the population and including tunnel states, the authors can represent the influence of historical events, having portions of the population 'jumping' to the next model when an event happens.

System dynamics. The objective of system dynamics (SD) is to capture all determinant variables, causal pathways, and feedback loops of the system to be analyzed [48]. In SD, structure determines performance, and the primary goal is to evaluate the effect of an intervention on the qualitative nature of system performance (e.g. growth function, overshoot and collapse, oscillations, chaotic response, etc.) [27,68]. To conceptualize the structure, the relevant elements and the direction and nature of their inter-relations must be known. This information is extracted from the stakeholders' underlying knowledge of the way the system operates [37,59]. In this way, Homer et al. [48] and Loyo et al. [57] integrate the most important risk factors of several chronic diseases in a single model. The model calculates the expected prevalence and indirect cost effect of these diseases in the population. Milstein et al. [59] include all relevant causal pathways related to health reform policies in the US. Kang et al. [51] and Sugiyama et al. [65] use the same approach to model the care of chronic kidney disease and the effect of interventions on diabetes and dialysis.
The inclusion of all known determinants and causal pathways is complemented by the possibility of including "soft" variables, enabling the exploration of aspects of system behavior particularly relevant to integrated care, such as "Gaming", "Extrinsic motivation" [37], "Insurance complexity", "Care coordination" [59], "Staff resistance to new policies" or "Workload pressure" [63]. This flexibility is essential to capture the influence of important variables but limits the statistical validity of the results [69]. Loyo et al. [57] downplay this limitation, stating that 'community decisions need to be made even though the data are disparate and incomplete'.

Description of simulation modeling techniques in IHS
The model structure is represented in a causal loop diagram. There is a special focus on capturing the correct feedback loops affecting the system behavior. Feedback loops are what makes the system dynamic, by influencing the nature of the relationship between variables as the system progresses.
In the area of chronic disease management, Jones et al. [49] use causal loops to model the states of the disease itself, understanding that a key determinant in diabetes care is the reinforcement loop generated by the relation between the disease diagnosis behavior and detrimental consequences. When assessing the effect of a new nicotine product, Hill et al. [47] integrate the feedback effect of 'normality of smoking' to predict smoking initiation and quitting rates, while Alonge et al. [37] introduce the negative feedback loop of gaming to understand the failure of a pay for performance incentive scheme in Afghanistan.
The structure of the system is transformed into a stock-and-flow diagram, defining the nature of the elements presented in the causal loop diagram. Stocks (elements that accumulate value) and variables that influence flows (functions that determine the growth or decline of the value in a stock) are differentiated. Functions are established for flows and initial quantities are assigned to stocks, so that differential equations can be used to determine the values in the stocks over time. Ansah et al. [38] use this structure to set up the labor market for long-term care, and use a deterministic approach to study the effect of policies to reduce unwanted market disturbances. de Andrade et al. [41] use system dynamics to represent the different stages of the maturing process related to the management of a myocardial infarction case in a hospital environment. This type of structure is known as "Aging Chains" and is useful to gather information about how long the modeled entity stays in each stage and to test policies that reduce delays.
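The stock-and-flow logic described above can be illustrated with a minimal aging chain integrated by Euler's method. This sketch assumes first-order outflows (each stock drains at its level divided by an average dwell time); the function name, the three stages, and all parameter values are hypothetical and are not taken from the reviewed models.

```python
def simulate_aging_chain(inflow, dwell_times, t_end, dt=0.1):
    """Euler integration of a first-order aging chain.

    inflow      -- entities entering the first stage per time unit
    dwell_times -- average time an entity spends in each stage
    Returns the stock levels at t_end.
    """
    stocks = [0.0] * len(dwell_times)  # initial quantities assigned to stocks
    for _ in range(int(round(t_end / dt))):
        # Flow out of each stock: level divided by the average dwell time.
        outflows = [s / tau for s, tau in zip(stocks, dwell_times)]
        # Each stage is fed by the outflow of the previous one.
        inflows = [inflow] + outflows[:-1]
        stocks = [s + dt * (i - o) for s, i, o in zip(stocks, inflows, outflows)]
    return stocks


# Hypothetical stages: emergency room, intensive care, ward (times in days).
stocks = simulate_aging_chain(inflow=10.0, dwell_times=[0.5, 2.0, 4.0], t_end=100.0)
print([round(s, 2) for s in stocks])
```

In steady state each stock settles at the inflow multiplied by its dwell time, so shortening a dwell time in this sketch shows directly how a delay-reducing policy lowers the number of entities held in that stage.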
Microsimulations. Like Markov models, microsimulations are state transition models, but they describe population dynamics at the individual level and can be used to describe interactions between policies and individual decision-making units [70]. As state transition models, they are structured by clearly defined states. Transitions between states are generated by stochastic processes based on the parametrization of transition evidence, which distinguishes them from the objective-driven rational responses of Agent-Based Models and the time-to-event logic of Discrete Event Simulations [70,71]. Even though the structure is similar to that of Markov models, microsimulations do not share some of their limitations. Besides the interaction of relevant variables, the individual approach adds the possibility of including 'tracking variables' to account for historical occurrences. Modeling the complexity of factors contributing to health care cost is the key objective of the "Future elderly model" created by Goldman et al. [45]. In this model, individualization and the influence of historical occurrences allow for a multidimensional characterization of health status accounting for risk factors such as smoking, weight, age and education, along with lagged health and financial states. In their dynamic form, microsimulation models allow individuals to change their characteristics due to endogenous factors within the model [72]. In this sense, they are more suitable for modeling processes and large population dynamics, like the model Lay-Yee et al. [54] use for estimating child health utilization. The authors modeled a child with a set of attributes as a starting point. Using equations derived from statistical analysis of real longitudinal data, they set the rules for the individual in the system and stochastically simulate changes in status over time. In other words, the model generates a set of diverse synthetic health histories for a starting sample of children. It then uses the simulated sample as a counterfactual for estimating the effect of interventions.
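The combination of individual-level stochastic transitions, a tracking variable, and a counterfactual comparison can be sketched as below. This is an illustrative toy, not the Future elderly model or the model of Lay-Yee et al.: the risk equation, the smoking prevalence, and the quit probabilities are hypothetical assumptions.

```python
import random


def simulate_person(rng, smoker, years, quit_prob):
    """Simulate one synthetic history; return True if the person becomes ill."""
    years_smoked = 0  # tracking variable: accumulates the individual's history
    for _ in range(years):
        if smoker:
            years_smoked += 1
            if rng.random() < quit_prob:
                smoker = False
        # Hypothetical risk equation: baseline risk plus a term for past exposure.
        p_ill = 0.01 + 0.002 * years_smoked
        if rng.random() < p_ill:
            return True
    return False


def run_population(n, years, quit_prob, seed=0):
    """Share of a synthetic population that becomes ill within the horizon."""
    rng = random.Random(seed)
    ill = 0
    for _ in range(n):
        smoker = rng.random() < 0.3  # hypothetical smoking prevalence
        ill += simulate_person(rng, smoker, years, quit_prob)
    return ill / n


baseline = run_population(20000, years=20, quit_prob=0.02)      # counterfactual
intervention = run_population(20000, years=20, quit_prob=0.20)  # cessation policy
print(f"baseline={baseline:.3f} intervention={intervention:.3f}")
```

The annual risk depends on `years_smoked`, not only on current smoking status, which is the kind of history dependence a plain Markov cohort model cannot express without extra states.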
Discrete event simulation. Discrete event simulation is a process-centric simulation methodology that describes a chronological sequence of events affecting an entity. The entity (e.g., a patient) carries its own information, individualizing the type of relationship with each event. Vataire et al. [66] and Cooper et al. [40] use this characteristic to individualize treatments for major depressive disorder and to realistically assess the response to the prescription of prevention drugs for cardiovascular disease, respectively. All occurrences are registered in the entity's information, enabling the influence of historical events on future outcomes [74]. Getsios et al. [44] use this feature to model the effect of smoking cessation attempts on tobacco-related outcomes.
Events are listed in order after random sampling over the parametrization of time-to-event evidence, and the list is rewritten after each occurrence. Events have their own associated time that passes when the event occurs; hence, DES is best suited to model discrete processes. As events have different durations, the cycle lengths are not necessarily equal. Several authors [42,46,55,62] find this structure convenient for modeling the care pathway of a health facility. The timing structure of a DES model allows the assessment of multiple and competing risks, as they will be organized in the future events list by time-to-event [79], with no immediate restriction on two events happening simultaneously [73]. Kotiadis [52] and Norouzzadeh et al. [61] take advantage of this characteristic to model different times for referrals depending on medical factors while tracing key indicators in the system. DES also allows the status of variables in the system to affect the nature of the relationships of an individual with the rest of the system. Günal et al. [46], Oh et al. [62] and Comans [39] use this interference feature to evaluate the queues and backlogs at different stages of the patient pathway, understanding waiting time as a change in the manner a patient interacts with a provider, given the provider's status (e.g., 'Occupied'). By fixing the maximum waiting time allowed in accordance with national guidelines, the authors can assess the requirements in the rest of the system to reach this goal.
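The future events list and the interference between a patient and an occupied provider can be sketched with a single-provider clinic. This is a simplified illustration under stated assumptions: exponentially distributed inter-arrival and service times, one provider, first-come-first-served queueing, and hypothetical parameter values; none of it reproduces a reviewed model.

```python
import heapq
import random


def simulate_clinic(n_patients, mean_interarrival, mean_service, seed=0):
    """Single-provider queue driven by a future events list (a min-heap on time)."""
    rng = random.Random(seed)
    events = []  # entries: (time, sequence number, kind, patient id)
    t = 0.0
    for pid in range(n_patients):
        t += rng.expovariate(1.0 / mean_interarrival)  # sampled time-to-event
        heapq.heappush(events, (t, pid, "arrival", pid))
    seq = n_patients       # unique tie-breaker for later events
    provider_busy = False  # system status that patients react to
    queue = []             # waiting patients: (patient id, arrival time)
    waits = {}             # information carried per patient: waiting time

    def start_service(now, pid, arrived):
        nonlocal seq, provider_busy
        provider_busy = True
        waits[pid] = now - arrived
        done = now + rng.expovariate(1.0 / mean_service)
        heapq.heappush(events, (done, seq, "departure", pid))
        seq += 1

    while events:
        now, _, kind, pid = heapq.heappop(events)  # next event in time order
        if kind == "arrival":
            if provider_busy:
                queue.append((pid, now))  # interference: provider is occupied
            else:
                start_service(now, pid, now)
        else:  # a departure frees the provider for the next waiting patient
            provider_busy = False
            if queue:
                next_pid, arrived = queue.pop(0)
                start_service(now, next_pid, arrived)
    return waits


waits = simulate_clinic(200, mean_interarrival=10.0, mean_service=8.0)
print(f"mean wait: {sum(waits.values()) / len(waits):.2f}")
print(f"max wait:  {max(waits.values()):.2f}")
```

Comparing the maximum wait against a target value while varying `mean_service` or the number of providers is the kind of 'How to?' question the queueing studies above address.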
Like microsimulations, Discrete Event Simulations aim to produce statistically valid estimates from the documented behavior of a system. This rigidity poses an important tradeoff compared to other techniques, as DES needs detailed, well-defined processes, accurate historical data, and high intellectual, computer, and data management capabilities. Standfield et al. [73] conclude that if individualization or interference is not an important driver of the performance of the system, including these characteristics would be an unnecessary over-specification and unlikely to be informative to decision-makers.
Agent-based models. Agent-based models focus on the activities of the agents composing the system. Each agent is individually defined with a set of rules and an objective, which may range from heuristics to the optimization of a utility function. Kalton et al. [50] use this technique to model how mental health patients engage with medical and social ecosystems while studying the effect of coordination capabilities. The individualization allows the agents to be influenced by their history and external variables. At the same time, the focus on agency allows the technique to capture emergent population phenomena [76].
The system is modeled in a simulated space, adding the possibility of including spatial variables. Nianogo et al. [60] exploit these characteristics to understand the dynamics of the diabetes population in Los Angeles, USA. The 'Virtual Los Angeles Obesity' model simulates a cohort of patients with different characteristics that interact differently with different environments. By assigning rules for the relations with the environment, the model seeks to describe the trends in obesity and diabetes from the behavior of the agents, and at the same time to test interventions by changing the environmental conditions or the characteristics of the agents.
Agent-based models also allow for the inclusion of random factors to consider the bounded rationality that is present in agents' behavior. Finally, as agents can be affected by spatial or other types of determinants, and because the rules governing agents' behavior can be set as thresholds, endogenous and time-dependent feedback loops are also possible. In advanced models, agents can evolve and learn with methods like neural networks and other forms of machine learning [29,77,78].
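A toy example may help to see how threshold rules, spatial attributes, random noise, and an endogenous feedback loop fit together in an agent-based model. Everything here is a hypothetical sketch: the `Agent` class, the equal weighting of environment and peers, the noise level, and the environment scores are assumptions made for illustration and are unrelated to the Virtual Los Angeles Obesity model.

```python
import random


class Agent:
    def __init__(self, rng, location):
        self.location = location                # spatial attribute (cell index)
        self.threshold = rng.uniform(0.2, 0.8)  # heterogeneous decision rule
        self.active = False                     # e.g. exercises regularly

    def step(self, rng, env_quality, peer_share):
        # Perceived incentive mixes the local environment and peer behavior;
        # the noise term stands in for bounded rationality.
        perceived = 0.5 * env_quality[self.location] + 0.5 * peer_share
        perceived += rng.gauss(0.0, 0.05)
        self.active = perceived > self.threshold  # threshold rule


def run_model(env_quality, n_agents=500, n_steps=30, seed=0):
    """Return the share of active agents after each step."""
    rng = random.Random(seed)
    agents = [Agent(rng, rng.randrange(len(env_quality))) for _ in range(n_agents)]
    shares = []
    for _ in range(n_steps):
        # Endogenous feedback: last step's population share drives this step.
        peer_share = sum(a.active for a in agents) / n_agents
        for agent in agents:
            agent.step(rng, env_quality, peer_share)
        shares.append(sum(a.active for a in agents) / n_agents)
    return shares


poor = run_model(env_quality=[0.2, 0.3, 0.2])
improved = run_model(env_quality=[0.8, 0.9, 0.8])  # environmental intervention
print(f"poor={poor[-1]:.2f} improved={improved[-1]:.2f}")
```

The population-level share is not programmed anywhere; it emerges from individual rules, and changing `env_quality` corresponds to testing an intervention on the environment rather than on the agents themselves.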
As with System Dynamics, authors use proxies and expert opinions when hard evidence is not available [46]. This flexibility makes them appropriate to test behavioral theories and understand complex population phenomena. On the other hand, statistical validity is not usually the first concern in either technique, where the usefulness of the assessment is more important.
Hybrid simulations. Hybrid simulations can combine the strengths of two or more models. Gao et al. [43] developed a tripartite model combining System Dynamics, Agent-Based Models, and Discrete Event Simulations. They use a previously developed System Dynamics model to understand the progression of diabetes up until the early stage of renal disease. As described by Jones et al. [49], the model properly describes diabetes progression by including key feedback loops. Building on this model, Gao et al. [43] include two different types of hybrid relationships. First, there is an upstream-downstream relation between the original model and an Agent-Based Model, in which the population that flows into a particular state (diabetes) becomes individualized agents. The ABM can study the incidence of a complication (early-stage renal disease) by simulating key behaviors in the development of the disease. In parallel, the second hybrid relation integrates DES to monitor the different statuses of the patients and to track the evolution of healthcare processes and resource availability and usage.

Complexity
To understand and compare the representation of complexity in simulation models we first compiled 13 distinct features of complex systems identified by Randall [80] and Wilensky & Rand [81]: undetermined or fuzzy boundaries, the possibility of being open, the possibility of having nested subsystems, dynamism in the network of relationships with different scales of interconnectivity, emergent phenomena, nonlinear relationships, feedback loops, leverage points, memory/path dependence, sensitivity to initial conditions, robustness, diversity and heterogeneity, and interconnectedness and interactions. Building on the previous section, we identified the characteristics of the described modeling techniques that can represent features of complex systems specifically related to relationships between system components. The modeled complexities were classified into one framework with definitions applicable across methodologies. The exercise resulted in nine aspects of complex relations that can be represented with simulation models. We present the nine aspects of complex relations together with the characteristics each technique uses to represent them. In parentheses, we show the number of papers modeling each complexity. Among the complexities identified, four are non-linearities (1 to 4), and they were the most commonly modeled. Table 3 summarizes the aspects of complexity enabled in each modeling technique.

Table 3. Complexity aspects enabled per simulation modeling technique.

Table 3 columns: Markov Model; System Dynamics; Micro-Simulations; Discrete Event Simulation; Agent-Based Models.
Table 3 notes: 1 The technique can incorporate dynamic changes over time, but not endogenous feedback loops. 2 Even though the 'Markovian Property' defines that transition probabilities depend only on the current state and not on previous states, thus eliminating the possibility of having 'Memory', researchers can overcome this by incorporating tunnel states and parallel models.
https://doi.org/10.1371/journal.pone.0254334.t003

PLOS ONE
time affects healthcare provision. By interacting, each component affects the final outcome according to its particular characteristics and those of the previous component.
2. Dynamism (23/30): Dynamism represents the circular causality of a system. If component (A) changes the nature of its relations in the system as the system progresses, then we say the system presents dynamism. Besides the dynamics produced by the passing of time, relations can be influenced by the changing conditions of any other component, producing endogenous feedback loops. In methods where estimations correspond to ordinary differential equations, the value of component (A) will be determined by a function of the state of other components (B, C) [74]. For MM the other components (B, C) can only be time, hence no endogenous feedback loops are possible [73]. For ABM, conditions ruling the behavior of agents can change depending on other components of the system or time as programmed by the modeler [74]. In Alonge's [37] model for a pay for performance incentive scheme, dynamism is clear when understanding the effect of 'volume of service' on the reduction in 'quality' and the increase of 'revenue', which in turn affect the 'volume of service' downwards and upwards respectively. The particular state will affect the relationship with other components, and at the same time mutations between states are triggered by these relations. Similarly, ABM can define different behaviors of its agents depending on current or past relations with the rest of the system [70,75]. The best example of interference is the change of rooms from available to occupied modeled by Günal [46]. Because a patient is occupying a room, other patients have to change their behavior towards that room and wait.
4. (Intelligent) Adaptation (2/30): Adaptation is the ability of a component to change the nature of its behavior to contingency happening in the system. This ability presumes the intelligence of components to make decisions. ABM can integrate this complexity when specifying agents' behavior not only as a function of other system components but also as conditions and operations in said function such as 'ifs' and optimization [14,75]. For example, in Kalton's model [50] patients can make up to 40 decisions based on logic and preferences developed during their life process, care experience and health status. Decisions include taking their medicine, looking for employment, starting to abuse substances, etc.
5. Soft variables (9/30): Refers to the possibility of incorporating simplified proxies for difficult-to-measure variables. Allows the inclusion of behavioral and qualitative relations. The possibility of using soft variables in ABM [82] and SD [76] responds to each methodology obtaining outputs focusing on agents' behavior and system structure respectively, instead of mathematical correctness to represent phenomena. A good representation of a soft variable is "Workload pressure" modeled by Rashawn et al. [63] as the ratio between the actual nurse-to-patient ratio and the standard nurse-to-patient ratio.
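As a minimal illustration of how such a soft variable could be computed inside a model, the ratio described above can be sketched in Python. The function name, argument names, and the example figures are hypothetical and are not taken from the cited model; the interpretation of values above or below 1 depends on how the two ratios are defined there.

```python
def workload_pressure(nurses: float, patients: float, standard_ratio: float) -> float:
    """Ratio between the actual and the standard nurse-to-patient ratio.

    All argument names are illustrative assumptions, not the cited model's API.
    """
    if patients <= 0 or standard_ratio <= 0:
        raise ValueError("patients and standard_ratio must be positive")
    actual_ratio = nurses / patients      # actual nurse-to-patient ratio
    return actual_ratio / standard_ratio  # dimensionless proxy variable


# Hypothetical ward: 4 nurses for 20 patients, against a standard of 1 nurse per 4 patients.
pressure = workload_pressure(nurses=4, patients=20, standard_ratio=0.25)
```

A value different from 1 signals that staffing deviates from the standard, which is what allows the proxy to feed into other relations of the model.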
6. Individualization (17/30): Integrates the possibility of including individual-level characteristics. It encompasses the complex system features of heterogeneity and diversity. DES and MS use a sample of individual units, each with a unique set of attributes [73,75]. ABM can program each agent with different characteristics [82]. Individualization is notable in the model by Lay-yee et al. [54], where data is granular at patient level, with variables such as gender, ethnicity and housing status. Each of these variables affects the subject's number of doctor visits, reading ability and conduct problems.  [55], where one or a combination of the different treatment schemes are possible for distinct patients. When a combination is chosen, the treatment sections of the model happen in parallel.
8. Historical occurrences / Memory (18/30): Also known as hysteresis, the concept includes path dependence. It refers to the influence of past states on the nature of the relationships of the current state. In methods that allow individualization, events can be stored in the individual's characteristics. For SD, the influence of events is stored in the stocks. A good example is the model by Vataire et al. [66], where the number of previous depression events updates the model attributes.
9. Emergence (2/30): The capacity of a system to develop new behaviors, different from those of the sum of its parts. ABM enables this characteristic by allowing agents to interact freely, only following the programmed behavior [82]. For example, in Nianogo's model for policies to treat population obesity [60], the researchers realized that their agents would change non-objective behaviors because of the interventions, making the interventions ineffective. Also, agents would quickly go back to the undesirable behavior after the intervention was finished (despite the intervention objective), diminishing the long-term effect.

Optimization capabilities
All simulation modeling techniques used 'what if?' scenarios, defined as scenarios that provide information about the performance of the system (or parts of the system) by simulating the change of a variable from its original value, using the baseline model as the counterfactual. Fourteen (out of 30) articles complemented the assessment with 'how to?' scenarios, defined as fixing a variable's value as a goal and focusing on how the other variables must change from their baseline values to meet this condition.
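The difference between the two scenario types can be sketched with a toy example. The Python code below is an illustration only: the `simulate` function is a hypothetical stand-in for any simulation model (here a simple single-queue waiting-time formula), and all parameter names and values are assumptions, not data from the reviewed papers.

```python
def simulate(params: dict) -> float:
    """Toy stand-in for a simulation model: mean waiting time in minutes.

    Uses a simple single-queue approximation; purely illustrative.
    """
    demand = params["arrivals_per_hour"]
    capacity = params["staff"] * params["patients_per_staff_hour"]
    if capacity <= demand:
        return float("inf")  # the queue grows without bound
    utilization = demand / capacity
    return 60 * utilization / (capacity - demand)


def what_if(baseline: dict, variable: str, new_value) -> float:
    """'What if?': change one variable and compare against the baseline run."""
    scenario = {**baseline, variable: new_value}
    return simulate(scenario) - simulate(baseline)


def how_to(baseline: dict, variable: str, candidates, target: float):
    """'How to?': fix the outcome as a goal and search for the first
    candidate value of `variable` that meets it."""
    for value in candidates:
        if simulate({**baseline, variable: value}) <= target:
            return value
    return None  # goal not reachable within the candidates


# Hypothetical baseline clinic.
baseline = {"arrivals_per_hour": 10, "staff": 3, "patients_per_staff_hour": 4}
change = what_if(baseline, "staff", 4)                    # effect of one extra staff member
needed = how_to(baseline, "staff", range(1, 11), 5.0)     # staff needed for a 5-minute wait
```

In both cases the baseline run acts as the counterfactual; only the direction of the question changes, from "what happens to the outcome?" to "what input achieves the outcome?".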

Long term assessment
The studies had different time lengths in their assessment. While some papers had a closer look at the activities on a working day (3/30), the majority had assessments of at least 5 years (21/30). The mean number of years in the assessments was 18 years (standard deviation 20). Lifelong simulations (2/30) were considered as 60 years and working hours of a working day as 10 hours.

Discussion
We have characterized the use of simulation models for IHS performance assessment in two ways: first, by exposing topics of interest to IHS that can be modeled, and the techniques to model them; second, by exposing how these techniques can implement system thinking in said topics of interest, while enabling features befitting integrated care performance assessment.
To characterize the ability of the reviewed simulation models to implement system thinking, we have created a common framework with 9 complexity features enabled differently across modeling techniques. These complexity features allow for the correct understanding of causality paths in a system's performance. For integrated care, this means enabling accurate accountability for system components, which in turn creates a better position to guide system improvement. Accurate accountability is necessary for value-based care, and especially value-based payment schemes, two key elements of integrated care initiatives [4,11]. Furthermore, disentangling the complex relations between system components is the key to dealing with comorbidities, identifying consumed resources, and implementing ad-hoc interventions [11]. While accurately representing the complex relations of the system is essential for the model structure, simulation models can also optimize interventions by testing 'what if?' & 'how to?' scenarios. These scenarios simulate changes (or fix values, respectively) anywhere in the system and compare the result to a baseline value of system performance. By doing so, SM provides an easy way to compare the value of multiple interventions, understand the value of each component and identify bottlenecks and other deficiencies in the system. At the same time, the time frame of the assessment can be adjusted to the objective of the study. Short and long-term interventions aimed at improving efficiency, changing health behavior, and preventive care are an important part of the toolbox of IHS, and the possibility of assessing them and optimizing their implementation in the correct time frame is expected when in pursuit of the Triple Aim [3].
The application areas identified in the review were in line with the findings of previous work focused on characterizing application areas of simulation modeling in healthcare [18]. Likewise, the simulation techniques covered in this work are the most used and studied in the literature. Markov Models are the simplest among simulation models, because of their relatively low computer, human, and data needs. They are the preferred methodology when assessing situations with low complexity. System Dynamics models add the possibility of including feedback effects and soft variables with a population perspective, characteristics that make them more prevalent in the "Policy and Strategy" area, a finding in line with the results of extensive reviews aimed at linking simulation methods and healthcare areas of application [22,25]. Microsimulations and Discrete Event Simulations extend the complexity into individual-level assessments, which in turn enables the influence of past events. The main difference between the two is that Discrete Event Simulations add the possibility of including interference. This characteristic makes them more suitable to understand health processes that require queuing, a common feature in the topic of "Health Resource Management". Furthermore, several authors agree that Discrete Event Simulation is the most common technique for evaluating the operation management of care facilities [19,22,25]. Agent-Based Models understand the behavior of the system out of the behavior of its agents. This simple definition allows the study of complex phenomena with a relatively simple technical construction. The technique can include all the described complexities, but the fact that it works in an entirely simulated environment diminishes the validity of its results.
A common characteristic of all the simulation modeling techniques is the inclusion of data from multiple sources and the possibility of a probabilistic estimation. Twenty out of the 27 papers performed a probabilistic sensitivity analysis, either with Monte Carlo simulations or other methods. A probabilistic estimation is not included as an aspect of complexity because we do not consider uncertainty to be unique to complex systems, and for the same reason, Monte Carlo simulations are not included as an SM technique to assess complex systems. However, the possibility of including probabilistic estimations allows the inclusion of uncertain evidence, which is essential for the comprehensiveness of the models. Validation is key for the usefulness of the simulation results. As described in detail elsewhere [29], a five-step approach is typically used in SM, comprising face validity, internal validity, cross validity, external validity, and predictive validity.

Model selection
A system is most appropriately modeled by the technique that allows the inclusion of the most important characteristics of said system. The selection of the most appropriate simulation modeling technique to assess performance must consider the characteristics of the system and the capabilities of each technique. It is important that only essential characteristics are considered so there is not an over-specification that hinders the analysis. In this line, identifying and prioritizing the complexities that rule the system to be modeled will help evaluators in selecting the most appropriate simulation model. Using our framework for complexity for this purpose, we created a conceptual map (Fig 2) that aids evaluators in selecting a simulation model to produce an accurate assessment of situations where complex relations are important. The tool is a summary of the results and characterization presented in this paper. The first step is to identify the most important complexity of the system to be modeled. Following a few key questions, the tool points to the technique with fewer inputs and technical difficulties that is appropriate to model said system.
To help readers navigate the tool, we use the evaluation of a pay for performance incentive scheme by Alonge et al. [37] as an example. We start by assuming that the most important characteristics of the issue are (1.) the feedback loops that performance bonuses generate over the revenue and quality of services and (2.) the effect of "Gaming" (a soft variable) of the staff over this new payment scheme. Starting from the center and navigating through the figure we could go to either "Important feedback loops" or "Soft variables" and, if individual effects are not considered essential, the tool takes us to System Dynamics, which is the approach used by the authors. Another example is the evaluation of interventions for reducing waiting time in a health facility. Queues and backlogs are assumed to be the most important characteristic. If we consider the intelligent behavior of the agents non-essential, then the tool points to Discrete Event Simulation. Otherwise, an Agent-Based Model would be the most appropriate.
Sometimes the complexities of a system cannot be ranked according to their importance. If this is the case, evaluators should repeat the exercise starting from all the identified complexities as if each were the most important one. If the different runs result in different modeling techniques, a hybrid model is to be considered. This is the case for the paper by Gao et al. [43]. In this case the authors seek to model three elements of diabetes care. First, diabetes progression at the population level, with feedback loops being the most important complexity. Selecting important feedback loops in the figure takes you directly to System Dynamics (when individualization is not important). Second, disease complication, where individualization of risk factors and healthy behavior is crucial. After individual effects, the figure passes through agent behavior towards Agent-Based Models. Finally, the authors study the status of every patient to track the use of resources. In this case, individualization is the priority complexity, but as agent behavior is not important for this element, the user will lean in favor of simultaneity of events, arriving at Discrete Event Simulation. As selected by the authors, the tool guides each situation following the characteristics and prioritization of complexities to the appropriate modeling technique.
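The navigation examples above can be sketched as a small decision function. The Python code below is a simplified reading of the paths described in the text only; it does not reproduce Fig 2. The function and argument names are hypothetical, and the fallbacks to Microsimulation and Markov Model are assumptions based on the technique descriptions in the Discussion rather than on paths spelled out in the examples.

```python
def select_technique(primary: str, individual_effects: bool = False,
                     agent_behavior: bool = False, simultaneity: bool = False) -> str:
    """Return a candidate technique for one prioritized complexity.

    `primary` is one of: 'feedback_loops', 'soft_variables', 'queues',
    'individualization', 'none' (labels are illustrative).
    """
    if primary in {"feedback_loops", "soft_variables"} and not individual_effects:
        return "System Dynamics"
    if primary == "queues":
        return "Agent-Based Model" if agent_behavior else "Discrete Event Simulation"
    if primary == "individualization" or individual_effects:
        if agent_behavior:
            return "Agent-Based Model"
        if simultaneity:
            return "Discrete Event Simulation"
        return "Microsimulation"  # assumption: individual level without interference
    if primary == "none":
        return "Markov Model"     # assumption: low-complexity situations
    raise ValueError(f"unknown complexity: {primary}")


def recommend(elements: list) -> str:
    """Repeat the exercise for every unranked complexity; differing answers
    suggest considering a hybrid model."""
    techniques = sorted({select_technique(**e) for e in elements})
    if len(techniques) == 1:
        return techniques[0]
    return "Hybrid: " + " + ".join(techniques)


# The three elements of the diabetes care example, encoded with the labels above.
diabetes_care = [
    {"primary": "feedback_loops"},
    {"primary": "individualization", "agent_behavior": True},
    {"primary": "individualization", "simultaneity": True},
]
result = recommend(diabetes_care)
```

Encoding each run as a separate call mirrors the advice to restart the exercise from every complexity that cannot be ranked, and makes the hybrid recommendation a simple consequence of the runs disagreeing.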

Limitations
By focusing only on simulation modeling, the review overlooks many analytical methods to assess complex systems. Several authors have described other analytical methods for studying different aspects of complexity in health systems, including network analysis, marginal structural models, queuing theory, Petri nets [22], and artificial intelligence [83]. Previous work by Jun et al. [22] characterizes and compares a wider set of modeling methods. However, it does not consider the distinctive characteristics of the system to be modeled or describe how they apply system thinking. Our review focuses solely on simulation models because of the advantages they present in the assessment of integrated care systems. Network Analysis provides an assessment of the structure of the (complex) relations in a system but does not consider causal pathways. Marginal structural models and queuing theory are useful to represent time-dependent covariates and interference (as defined in this paper) respectively, but they are limited to these capacities. SM and Artificial Intelligence methods, such as Machine Learning, differ in that the latter construct a model from patterns in the data, while SM constructs the model from the structure of the system and then populates it with data. Besides making the estimations more comprehensible, this characteristic of SM allows policymakers to test structure-changing interventions, such as the ones in integrated care. In any case, the mentioned analytical approaches are complementary to SM, as they can provide the necessary inputs to build and populate the simulation model. Comparisons between different analytical methods that examine their capacities to represent complex system characteristics are scarce and should be further assessed in future research.
It is important to highlight that the approach to find IHS topics-of-interest is not exempt from subjectivity, as there are multiple definitions for integrated care [3]. In this sense, it is probable that multiple other IHS topics-of-interest can be successfully modeled with SM techniques. In the same line, our selection criteria focused on finding papers that allowed us to understand the implementation of a complex system perspective, criteria that resulted in fewer reviewed papers than previous literature linking simulation modeling and healthcare performance assessment. Nevertheless, we are confident that the selection of papers in the review, together with the complementary literature used, allowed us to accurately characterize the field of simulation modeling in its ability to use system thinking in integrated healthcare.
Finally, we clarify that our work does not provide an in-depth description of the different simulation modeling techniques. We acknowledge that such a task would be impossible to undertake with our study design. Instead, we provide readers with an introduction to the identified simulation modeling techniques and highlight the characteristics that allow them to implement system thinking. We encourage readers who find in this work a solution to the challenges they encounter when assessing the performance of a complex health system to learn in detail the technique that our paper has pointed towards. For this purpose, we recommend starting with the complementary literature that we include for each technique in Table 2.

Conclusion
Simulation modeling techniques can use system thinking and evaluate performance emphasizing the complex relations between system components, in topics of relevance for integrated healthcare systems. By using simulation models to complement the performance assessment of integrated health systems, managers can correctly attribute causality to system components, optimize interventions, and create long-term assessments. All these are important advantages over traditional assessment methods. Adding simulation models to the performance assessment tools at the disposal of health authorities may be the key to understanding the full value of integrated care. Selecting a simulation technique is easier when the characteristics of the modeling techniques are understood and the complexities ruling the system performance are identified and prioritized. To facilitate the use of the discipline, we consolidated complexity features of different modeling techniques into one framework and provide future performance evaluators with a visual aid to guide the selection of the most appropriate model for the assessment of complexity-enhanced systems, such as integrated healthcare.