THROMBOEMBOLISM: DEALING WITH POTENTIALLY CAUSAL AND CONFOUNDING RISK FACTORS USING DIRECTED ACYCLIC GRAPH (DAG) ANALYSIS

Introduction: Hospital-acquired venous thromboembolism (HA-VTE) in children comprises multiple risk factors that should not be evaluated separately due to collinearity and multiple cause and effect relationships. This is one of the first case-control study of pediatric HA-VTE risk factors using Directed Acyclic Graph (DAG) analysis. Material and Methods: Retrospective, case-control study with 22 cases of radiologically proved HA-VTE and 76 controls matched by age, sex, unit of admission, and period of hospitalization. Descriptive statistics was used to define distributions of continuous variables, frequencies, and proportions of categorical variables, with a comparison between cases and controls. Due to many potential risk factors of HA-VTE, a directed acyclic graph (DAG) model was created to identify confounding, reduce bias, and increase precision on the analysis. The final model consisted of a DAG-based conditional logistic regression. The study was approved by the Institutional Review Board (


Introduction
In the past two decades, venous thromboembolism (VTE) has been recognized as an essential complication in pediatric patients.In the last ten years, VTE incidence has reached 42-58 cases of venous thromboembolism (VTE) for every 100,000 children admitted, according to information from the database of pediatric patients admitted to hospitals in the United States of America (USA).This same database showed that the incidence of VTE is much higher in hospitalized children (40.2-10,000 versus 7.8/10,000 outpatients).Deep venous thrombosis (DVT) is currently considered the second leading cause of preventable harm in hospitalized patients, according to a study conducted in 80 pediatric hospitals in the USA (1)(2)(3).The clinical consequences of DVT in children are significant since it is estimated that around 25% of children with DVT in the extremities develop signs and/or symptoms of chronic venous insufficiency (known as post-thrombotic syndrome)(4) and 16-20% evolve with confirmed pulmonary embolism (PE) (11) with a mortality rate of 2-9% in these patients (5,6).
Hospital-acquired venous thromboembolism (HA-VTE) comprises multiple risk factors that should not be evaluated separately due to collinearity and multiple cause and effect relationships (7).For this reason, novel methods that allow a visual representation of theoretical assumptions of a process understudy that is easily interpreted by the reader, encouraging an iterative process of updating and revising theories about causal relationships, helping researchers in avoiding unintentional confounds and colliders (a common effect of two variables (i.e., exposure and an outcome) could be useful.Thus, the present study aims to improve the identification of risk factors for the development of HA-VTE in children using a novel approach, the directed acyclic graph (DAG), that seems to gather these necessary features (8) to deal with confounding factors of reallife.
. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Study design:
We conducted an observational, case-control study to evaluate risk factors for the development of hospital-acquired venous thromboembolism (HA-VTE) in children.
Subjects: Children (newborn to 16  international normalized ratio (INR), Partial Thromboplastin Time (TTP), and anti-factor Xa.Outcome variables were death (cases and controls), pulmonary embolism (for patients with previous DVT), the extension of the VTE (radiology proved), and the transformation from non-occlusive to occlusive VTE. .

Statistical analysis
Descriptive statistics were used to define distributions of continuous variables, frequencies, and proportions of categorical variables, which were then compared between cases and controls using Mann-Whitney U and χ2 tests, respectively.Fisher's exact test was used in place of χ2 in instances of a two-by-two table with a frequency value of 5 or under in at least one cell.Results were expressed as odds ratios (ORs), with accompanying 95% confidence intervals (95% CIs) calculated by the Wald method.
To evaluate the trend rate of HA-VTE incidence in the last years, we implemented a temporal series analysis with data from 2012 to 2017.The Autoregressive Integrated Moving Average (ARIMA) was adopted.
The cases were divided into three different age categories: all ages, less than six years missing data not included in the statistical analysis).We selected eight main exposure variables: central venous catheter, intensive care unit stay, mechanical ventilation, and length of stay were based on the systematic review and meta-analysis of risk factors, and risk-assessment models published by Mahajerin A et al. (7).Other risk factors comprised in DAG such as liver failure, heart failure, immobilization, local trauma, nephrotic syndrome, surgery, infection, hematologic malignancies, L-asparaginase, autoimmune/inflammatory disease, and corticosteroids were also included based on our assumption that diseases work as leading players on the development of VTE and are usually connected with other risk factors.One example is that we typically do not see children with infection alone having VTE.Obesity was also included because it was one of our hypotheses as a risk factor for pediatric VTE, and it is not yet fully explored in other studies.After selecting the exposure variables, we created directed paths using a browser-based software DAGitty (http://www.dagitty.net)(Figure 1).Variables were represented according to the software pattern (Figure 2).A family history of thrombosis and thrombophilia are described as unobserved variables because we intended to collect it, but essential information was lacking during data analysis.

Results
Given the 6,295 hospitalizations during six years of the study, this yielded an average HA-VTE of 6.25 per 10,000 hospitalized children per year.Twenty-six cases of hospital-associated venous thromboembolism (HA-VTE) were diagnosed during the six years of observation but only 22 cases were enrolled after application of the validation criteria for venous thromboembolism (VTE) cases selection and loss due to missing data (Figure 3).
Most cases (86.7%; n=19/22) were found using data bank of the pharmacy sector stored in MV 2000® software (Recife, Pernambuco, Brazil).The twenty-two cases were matched to 76 randomly selected controls.
There was no difference in the frequency of VTE between cases and controls regarding sex, age, and unit of admission as expected due to matching (Table 1).We did not observe peak VTE in infancy due to the absence of the Neonatal Intensive Care Unit and Adolescent ward in the hospital.
. For estimating the direct effect of the exposure variables (heart failure, ICU admission, length of stay, liver failure, mechanical ventilation, catheter, and obesity) on the outcome (venous thromboembolism), the software recommended doing an adjustment on corticosteroids, hematologic malignancies, immobilization, infection, Lasparaginase, local trauma, nephrotic syndrome, and surgery (Supplement Material S1 and S2).After that, we created two different multivariate regression models: (1) Full Model, utilizing all risk factors for VTE that were statistically significant, and a (2) Final Model after adjustment (Table 3).Length of stay (LOS), liver failure, Lasparaginase use, and nephrotic syndrome were independent risk factors for hospital-associated venous thromboembolism in children.

Discussion
We found a significant increase in the incidence of HA-VTE across all age groups from 2012 to 2017, with a similar trend rate reported in a multicenter study in the United States (1).
To our knowledge, this is the first study of hospital-associated venous thromboembolism using a directed acyclic graph (DAG) to deal with multiple confounding risk factors.
The DAG is a visual representation of theoretical assumptions of a process understudy that is easily interpreted by the reader, encouraging an iterative process of updating and revising theories about causal relationships.A DAG-informed approach helps researchers in avoiding unintentional confounds and colliders (a common effect of two variables (i.e., exposure and an outcome), that is represented in DAG by the two arrows pointing toward it (13), arousing from these variables).The DAG approach has limitations because it does not eliminate all sources of bias; it is time-consuming, requires a broad review of the literature and evidence-based understanding of causal relationships.It is also a pre-requisite to choose between minimally enough adjustment sets considering all variables that are well measured and well specified with little missing data.Likewise, as with any analytic approach, DAGs cannot account for unmeasured confounding or potential covariates not determined from the literature (17).There is a robust literature supporting DAG theory and technique (8,13,18), but this approach has not yet been broadly adopted, although it has been highly recommended by experts in the field of biostatistics (15).
We found the length of stay (LOS), L-asparaginase use, and nephrotic syndrome to be independent risk factors for hospital-associated venous thromboembolism in children.In a recent systematic review and meta-analysis of risk factors and risk-assessment models by Mahajerin A et al. (7), a central venous catheter (CVC), intensive care unit stay, mechanical ventilation, and length of stay were identified as leading risk factors in the case-control studies.Most of these studies did not systematically evaluate the more common co-morbidities.Those factors, besides mechanical ventilation, were also considered risk factors (p<0.005) in univariate analysis in our study, but not all of them were maintained in the final multivariate model.
This finding emphasizes the importance of considering the co-morbidity as essential covariable to be included in VTE case-controls studies as interventions (i.e., catheter) occur in sick children.
Length of stay (LOS) has been a consistently identified risk factor but the mechanism by which it increases the risk of VTE is not fully understood.It works as a proxy that also covers unknown risk factors not yet investigated.We postulate some theoretical relationships that were represented by DAG analysis (figure 1).
Length of stay (LOS) is a risk factor that remains in many case-control and non-case-control studies, which suggests that it acts as a proxy for known and unknown risk factors.Among the known risk factors, some are very insightful, such as length of stay in the ICU, trauma, immobilization, and surgery.In a scenario of multiple risk factors with numerous medical conditions and interventions, it is expected that a meticulous structured statistical model is needed, a good reason for using the DAG concept.Following DAG analysis, LOS persisted as a critical risk factor, which suggests that there are relevant factors not yet considered in our theoretical model and, therefore, not controlled in multivariate analysis.One could argue that the persistence of LOS as a .
risk factor would be a small sample size issue.However, even larger studies (19,20), including a recent comparative validation study of risk assessment models for pediatric hospital-acquired venous using the Children's Hospital-Acquired Thrombosis (CHAT) registry (21), still confirm the importance of LOS.Another reason would be the fact that unknown elements, such as biological or biochemical factors involved in the genesis of VTE, are not yet understood, identified, or available in clinical studies.For this reason, we still do not have enough knowledge to explain all the relationships related to LOS and venous thromboembolism, an issue that requires further investigation.
Acute lymphoblastic leukemia (ALL) is the most common type of pediatric cancer and, therefore, the most frequent malignancy associated with VTE in childhood.The incidence of VTE ranges from 5 to 70% when imaging studies identify asymptomatic cases.Asparaginase is a crucial chemotherapy agent in the treatment of ALL and can induce an acquired deficiency of protein C, S, and antithrombin (22).Most thrombotic events occur in the upper venous system and the central nervous system (CVST) (23) with a potential of severe consequences.Adult studies showed that prophylactic anticoagulation could be safely administered without increasing the number or severity of bleeding events, resulting in a reduction in the cumulative incidence of VTE (24) in ALL patients.Hence, VTE prophylaxis (25) or protein replacement with fresh frozen plasma(26) has recently been proposed in patients with ALL during the induction period with pediatric clinical trials ongoing.In our study, prophylactic anticoagulation in ALL was not employed.
Nephrotic syndrome is another prototype for pediatric VTE with multifactorial and complex pathophysiology.The nonselective proteinuria results in hemostatic abnormalities mainly due to loss of anticoagulant and fibrinolytic proteins (such as antithrombin III, plasminogen, protein S, and plasmin) and increase of prothrombotic substances (factor V, VIII, alpha-2 macroglobulin, thromboxane A2, and fibrinogen) produced by the liver to balance the loss of anticoagulant elements contributing to a procoaguable state (29).
Other reported risk factors for venous thromboembolism (VTE) in this population include infection, CVCs, diuretics, and intravascular volume depletion(30).In our study, 5 of 6 (83.33%) patients with nephrotic syndrome developed VTE.Infection (60%), CVC (40%), and ICU admission (20%) were also seen in most patients.Our study showed a higher percentage (83.33%) of nephrotic patients with VTE, considering a recent cohort study of 370 children with nephrotic syndrome admitted to 17 pediatric hospitals across North America from 2010 to 2012, where only 11 (3%) of patients developed VTE(30).
Mechanical ventilation was not statistically significant in the final model but remained to better adjust confounding factors.
The limitations of our study include the small sample size (as a result of single-center research), the retrospective nature and, the missing data of some patients, i.e., lack of data on hypercoagulability, absence of radiological documentation of all VTE events -a real-life situation in retrospective studies in underdeveloped countries.One of the reasons our sample was not larger was the fact our hospital does not admit patients with solid malignancies, does not perform cardiothoracic and orthopedic surgery and, the emergency department is referred only for patients that are attended in the outpatient clinic (i.e., not trauma patients).There are some .explanations about why the findings in our study were different from other publications.Potential covariates, unmeasured confounding, and lower statistical power could be essential reasons that justify these differences.
On the other hand, we evaluated the co-morbidities in our DAG model, and this could have lessened the significance of medical interventions as the leading risk factor of HA-VTE.We hope that the more extensive ongoing studies that will probably consider the role of co-morbidities can establish the real weight of the medical intervention in HA-VTE in children.
Our study has some important strengths.To our knowledge, this is the first case-control study of pediatric HA-VTE using DAG analysis worldwide, and the first pediatric case-control HA-VTE study in Brazil.
The DAG-based approach in our study highlighted the importance of medical conditions, appeared as the most critical variables, and made us question if therapeutic interventions would be only triggers or proxies ("tip of the iceberg") for the observed thrombosis.Researchers should consider medical conditions (i.e., nephrotic syndrome, ALL) in risk-assessment models besides directing spotlights to interventions (i.e., catheters).We reported the importance of medical conditions on the genesis of HA-VTE using a DAG-based approach, which makes it possible to clarify the influence of confounders and multiple causalities, such as catheter, a significant risk factor highlighted in several studies.This approach is rational since children that use intravenous devices are sick, so it is not reasonable to put all the weight on a catheter since other risk factors play an essential role in VTE.Besides that, we reported an increase of HA-VTE events along the years of the study, which suggests that educational actions should be performed to prepare the clinical team not only to deal with these patients but to teach medical students and residents since it is a university hospital.Finally, an international effort should be made to increase the robustness of data and to balance the different characteristics of pediatric hospitals in the globe to create offset risk-assessment models and, therefore, adequate protocols for HA-VTE prophylaxis in children.
, and from 6 to 15 years old.The linear correlation between the HA-VTE rate and the years use evaluated by R-square (coefficient of determination).Due to a large number of potential variables that could confound or mediate the causal effect of the exposure (risk factors of HA-VTE) on the outcome (VTE), a directed acyclic graph (DAG) model was created to reduce bias, improve transparency, and increase precision on the analysis.Variables included in DAG were clustered in four different groups: (1) exposure variables (mechanical ventilation, liver failure, ICU admission, catheter, heart failure, obesity, and length of stay), (2) forced-variables (heart failure, obesity, and liver failure -not linked to other risk factors in the model), (3) non-forced variables (immobilization, local trauma, nephrotic syndrome, surgery, infection, hematologic malignancies, Lasparaginase, autoimmune/inflammatory disease, corticosteroids, and venous thromboembolism) and (4) nonobserved variables (dehydration, first-degree family history of thrombosis, and thrombophilia -variables with years old) hospitalized from May 24 th , 2012 to August 31 st , 2018, who (9)lusion and selection criteria: Cases were defined as patients who developed hospital-acquired venous thromboembolism (HA-VTE) according to the International Society of Thrombosis and Hemostasis (ISTH)(9)criteria set as i) signs, symptoms or radiological diagnosis of venous thromboembolism (VTE): deep venous thrombosis and/or pulmonary embolism no sooner than two days of hospital admission; ii) Absence of documentation of signs or symptoms consistent with VTE in the admission history and physical exam; iii) Length of hospitalization of at least two days; iv) Diagnosis of VTE less than two days of hospital admission but a previous history of hospitalization in the last 30 days; v) Cases were classified as HA-VTE radiologically proved.Radiological diagnosis of VTE was confirmed using compression ultrasonography with Doppler imaging for objective confirmation of extremity DVT, with computed tomography (CT) or magnetic resonance imaging (MRI) for suspected extension into deep pelvic or abdominal veins, and spiral CT for pulmonary embolism (PE) confirmation.The Doppler ultrasound exam was registered in a separate file and not always was attached to the patient chart, so we revised it separately.The reason we created a classification of likely cases of VTE was that diagnostic modalities are not broadly available in our hospital.Cases were selected using four different strategies: i) Pharmacy software (disposal of anticoagulants); ii) International Classification of Diseases (ICD-10) codes for VTE; iii) Radiology report; iv) Active screening by medical students (only for the prospective phase of the study for one year, from August 1st, 2017 to August 31st, 2018.Controls were bleeding and its classification (according to ISTH(12)), and laboratory test used for anticoagulation monitoring: (3)control potential confounders in the context of rare outcomes, data sparsity, and multicollinearity.It is especially useful when the number of covariates is significant to the study size.Therefore, we implemented the following steps: (1) selection of the variables that are appropriate to include using a causal directed acyclic graph (DAG) to exhibit theorized causal relations among variables; (2) division of the variables into three classes: (i) the main exposure X; (ii) forced-in variables (e.g., age, sex) (iii) the non-forced variables which will be candidates for deletion;(3)building a Full Model, through conditional logistic regression model, including all main exposure terms, forcedin variables, and non-forced variables.The next step was to select which variables of the Full Model would be included in the final parsimonious adjusted model (Final Model).We applied three methods to obtain this Final Model to validate the results of each modeling process: (1) assess, and thereby minimize, mean squared error (MSE) using forward selection strategy, as we indicated sparse-data bias; (2) usual forward method, using the likelihood ratio stepwise selection from the variables presented in the Full Model; (3) usual backward deletion method, using likelihood ratio stepwise selection with a previous traditional multicollinearity analysis from the variables presented in the Full Model.The first method was implemented in Microsoft Excel® (2016) software, and the two last traditional methods, using SPSS Statistics® (version 25.0.Armonk, NY: IBM Corp.)

Table 1 .
Matching characteristics of cases versus controls.The risk factors for HA-VTE among cases versus controls are presented in Table2.The median length *IQR, interquartile rangeWe found a significantly increased trend rate of HA-VTE throughout the years in our institution (Figure4), which was more pronounced in the <6 years age group .Four patients (4/22; 18.2%) presented with signs and symptoms of VTE less than 48 hours of admission and were previously hospitalized in the last 30 days.Regarding the VTE site, two patients (2/22; presented with the following symptoms: one with cough and hemoptysis and the other with dyspnea, tachypnea, cough, and tachycardia.Only one patient had cerebral venous sinus thrombosis (CVST) and presented with seizures.He had a diagnosis of acute lymphoblastic leukemia (ALL) and was treated with Lasparaginase.DVT cases were diagnosed using Doppler ultrasound and PE with a CT scan.The 22 patients had 38 sites of VTE which included lower limbs (26/38; 68.4%); intra-abdominal (4/38; 10.5%) -inferior vena of stay was higher in cases (39 days -IQR: 39) than controls (6.5 days -IQR: 11).Central venous catheter (CVC) use was higher in cases (81.81%) than controls (46.05%).Long-term CVCs were identified in 16.66% of catheterized patients, with the remainder (83.33%) having short-term CVCs (Peripherally Inserted Central Catheter [PICC] lines, temporary jugular, subclavian or femoral lines).ICU admission in the last 30 days was seen more frequently in cases (13/22; 59.09%) than controls (25/76; 32.89%).Infection was also more frequent in cases (17/22; 77.27%) than controls (33/76; 43.42%) with no difference between local or systemic ones.L-asparaginase use was higher in cases (6/22; 27.27%) than controls (2/76; 2.63%).Nephrotic syndrome was seen in 22.72% (5/22) of cases compared to 1.32% (1/76) of controls.Liver failure was more common in cases (3/22; 13.63%) than controls (2/76; 2.63%).We did not observe any difference between cases and controls regarding obesity, surgery, immobility, mechanical ventilation, length of mechanical ventilation, corticosteroid use, inflammation/autoimmune disease, and neoplasia..

Table 2 .
Univariate analysis to identify potential risk factors for HA-VTE among cases versus controls.International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.(

which was not certified by peer review)
International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.(

Table 3 .
Multivariate regression models.Statistical significant differences are in bold.