Comparing the estimates of effect obtained from statistical causal inference methods: An example using bovine respiratory disease in feedlot cattle

The causal effect of an exposure on an outcome of interest in an observational study cannot be estimated directly if the confounding variables are not controlled. Many approaches are available for estimating the causal effect of an exposure. In this manuscript, we demonstrate the advantages associated with using inverse probability weighting (IPW) and doubly robust estimation of the odds ratio in terms of reduced bias. IPW approach can be used to adjust for confounding variables and provide unbiased estimates of the exposure’s causal effect. For cluster-structured data, as is common in animal populations, inverse conditional probability weighting (ICPW) approach can provide a robust estimation of the causal effect. Doubly robust estimation can provide a robust method even when the specification of the model form is uncertain. In this paper, the usage of IPW, ICPW, and doubly robust approaches are illustrated with a subset of data with complete covariates from the Australian-based National Bovine Respiratory Disease Initiative as well as simulated data. We evaluate the causal effect of prior bovine viral diarrhea exposure on bovine respiratory disease in feedlot cattle. The results show that the IPW, ICPW and doubly robust approaches would provide a more accurate estimation of the exposure effect than the traditional outcome regression model, and doubly robust approaches are the most preferable overall.


Introduction
In veterinary science, the goal of many observational studies is to estimate the causal effect of exposures on disease outcomes. There are many approaches to estimation. In general, methods a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 used to adjust for confounding in observational studies can be classified into two categories: G-methods and stratification-based methods. G-methods include IPW, standardization and G-estimation, where the conditional exchangeability has been used in subsets defined by covariates to estimate the causal effect of exposures on outcomes in the entire population (marginal). Stratification-based methods include stratification, restriction and matching, but the conditional exchangeability is used in subsets defined by covariates to estimate the association between exposures and outcomes in those subsets only (conditional) [1]. The commonly used outcome regression models belong to the stratification-based category. Therefore, in this manuscript, we illustrate the advantages of methods from G-methods category to illustrate the advantages to the estimation of the average causal effect and contrast them with more commonly used outcome regression models that are likely more recognizable to veterinary researchers. Specifically, we only focus on three methods to estimate the average causal effect: IPW, inverse conditional probability weighting (ICPW) and the doubly robust approach, and we refer them as the causal inference estimation approaches throughout the paper. The rationale for this paper is to introduce causal inference estimation approaches to veterinary researchers using realistic example data and to illustrate the advantages (reduced bias in the estimation of the average causal effect) when compared to traditional outcome regression model-based approaches to estimation.
In this paper, the advantages of causal inference methods will be illustrated by showing their ability to provide an unbiased estimation of the average causal effect. Because of our goal to illustrate the advantages of causal inference approaches, the target reader for the paper is a quantitative epidemiologist comfortable with regression modeling approaches, estimation approaches, matrix algebra, and reading mathematical formulas. We have provided more statistical detail than most veterinary methods manuscripts and less than most statistical methodology papers. This paper is not intended as either a step-by-step tutorial for causal inference methods nor a treatise on causal inference methods. We provide appropriate references throughout for readers who want more in-depth knowledge.
There is a need for an illustrative example with causal inference methods because, although causal inference approaches have been available for a long time [2], adoption of the methods in veterinary epidemiology appears to lag behind other epidemiology disciplines. For example, in 2018 the American Journal of Epidemiology published 344 articles, of which 14 referred to logistic regression (an outcome regression model approach) and 18 referred to inverse probability weighting or propensity scoring in the title or abstract. By contrast in 2018, Preventive Veterinary Medicine published 268 articles, of which 31 referred to logistic regression and one referred to inverse probability weighting or propensity scoring in the title or abstract (see Appendix for exact search string). These statistics are an imperfect measure of uptake of the methods, but they are likely reflective of the differences in uptake.
This paper is organized as follows: Section 2 briefly recaps the main concepts of marginal models and estimation of average causal effects using an example based on infectious causes of Bovine Respiratory Disease (BRD). The BRD example is carried throughout the paper and was chosen because it is a common topic of research in veterinary science. As respiratory disease is common in all species, the topic provides a relatable example for many veterinary epidemiologists, even those not working in bovine production. We purposefully selected a subset of data with complete covariates for our example. This subset of data is not necessarily representative of the original dataset. For a thorough analysis of the original data and the associations found in the dataset, we direct the readers to [3][4][5]. Section 3 outlines in detail each of the approaches, models used in the analysis, and the estimation equations. Aspects of Section 3 require an understanding of matrix algebra notation. This section is intended to set the stage for the later sections by introducing the models and estimation equations to be used. Readers who are unfamiliar with matrix algebra or who already know the basics for the estimation methods can skip Section 3. Section 4 reports the results of the approaches applied to realistic BRD data and simulation studies which document the advantages of causal inference methods in reducing the biases in estimation. Comparisons among the approaches are discussed in Section 5. Section 6 concludes with a summary and further discussion.

Introduction to data and marginal causal effects
BRD is an important economic disease that causes morbidity and mortality in feedlot cattle; it is also a major contributor to antibiotic use in cattle production. In feedlot cattle, BRD usually occurs soon after cattle have arrived at the feedlot due to increased stress associated with transportation, mixing of the cattle, a new feed ration, and exposure to multiple pathogens. The disease is multifactorial in origin with many factors contributing to increased risk. Previous studies have shown that serologically negative animals exposed to Bovine Viral Diarrhea Virus (BVDV) at the feedlot are at increased risk of developing BRD [3,6]. Exposure to BVDV before arrival at the feedlot, either by vaccination or natural exposure, could offer cattle protection against BRD [4,7]. However, there are some individual-level risk factors and unmeasured feedlot-level risk factors that have direct effects on both prior BVDV exposure and BRD incidence, which makes the evaluation of the direct effect of prior BVDV exposure on BRD difficult. Commonly, we refer to this situation as confounding. Hernan and Robins defined the confounding in chapter 7 of Causal Inference as the bias that results from the presence of causes shared by treatment (exposure) and outcome [15]. However, the direct effect of BVDV exposure prior to arrival at the feedlot on BRD incidence is difficult to evaluate because the relationship is confounded by individual-level risk factors and unmeasured feedlot-level risk factors. Knowing the effect of prior BVDV exposure is of particular interest because BVDV exposure can be manipulated, especially with vaccination, and therefore this represents a potential intervention point for preventing and reducing BRD. Other BRD risk factors, such as age, weight, and breed, are less amenable or impossible to manipulate.
Given that both outcome (BRD) and exposure (prior BVDV exposure) are binary, the average causal effect of the exposure can be evaluated through the odds ratio. In a cohort study, because the disease event is incident, the odds ratio is more appropriately defined as the disease odds ratio. There are two types of odds ratios estimates, conditional or marginal (unconditional), depending on whether we are conditioning on, or marginalizing over the confounding covariates. A conditional odds ratio can help to decide whether an exposure is beneficial for an animal with particular characteristics, while a marginal odds ratio can be used to assess the effect of the exposure in the population as a whole. Veterinary epidemiologists often fit an logistic regression and obtain the estimates of the coefficient (β) of the exposure and use the estimates of exp(β) as the odds ratio, and for a model with covariates, this represents a conditional rather than a marginal estimator. Collapsing over the other covariates, the marginal odds ratio can be different from the conditional odds ratio. This is called the non-collapsibility of the odds ratio. For more explanation and examples of the non-collapsibility, please see Greenland and Robins 2009 [8] and Hernan 2011 [9]. The causal inference approaches enable estimation of the marginal causal effects, which correspond with the traditional parameters of interest in randomized trials [10][11][12]. The marginal risk can then be used to estimate relative measures such as the risk ratio or the odds ratio.
Therefore, our goal is to illustrate approaches to estimating the marginal odds ratio with and without the exposure to BVDV prior to arrival at the feedlot to BVDV on the outcome, BRD. Exposure to BVDV prior to arrival at the feedlot was measured by the presence of antibodies to BVDV in blood samples collected when the animals arrived at the feedlot. We used an independently developed, a priori causal diagram based on biologically plausible pathways to inform the covariates to include in the estimation of the exposure effect [5]. We determined the minimal sufficient set of fixed effects required to assess the total effect of prior exposure to BVDV (as indicated by the presence of antibodies to BVDV at arrival) on BRD outcome [13,14]. In addition, given the importance of cluster variables in veterinary settings, we explicitly included feedlot in the model to enable assessment of the difference in the three methods when clustering is present [15].

Notations
Throughout this paper, we use bold letters such as A and β to denote vectors. We use non-bold letters such as A and β to denote scalar quantities. Let Y ij be the dichotomous observable outcome for the j th individual (j = 1, 2, . . ., n i ) in the i th cluster (i = 1, 2, . . ., I). n i is the number of individuals in each cluster. I is the total number of clusters. A ij is a binary variable of the individual-level exposure (A ij = 1 if exposed and A ij = 0 if unexposed). Let X ij = (X ij,1 , X ij,2 , . . ., X ij, k ) be a k-dimensional vector of individual-level covariates, which may confound the relationship between exposure and outcome. Each individual has two potential outcomes: Y 0 ij and Y 1 ij . If the individual was actually exposed, we can only observe Y 1 ij but not Y 0 ij . Otherwise if the individual was not exposed, we can only observe Y 0 ij but not Y 1 ij . The connection between the observable outcome and the potential outcomes is The concept of potential outcomes was introduced in the context of both randomized and non-randomized studies by Rubin [16].
Given the individuals are nested in clusters, we can use a vector to represent the outcome in the i th cluster, Y i ¼ ðY i1 ; Y i2 ; :::Y in i Þ 0 , in which each element is the observable outcome for each individual within cluster i. Similarly, the exposure and covariates in the i th cluster are denoted as vectors A i ¼ ðA i1 ; A i2 ; :::A in i Þ 0 and X i ¼ ðX i1 ; X i2 ; :::; X in i Þ 0 . The cluster-level potential out- In later sections, we will fit an exposure model with A i as the response and an outcome model with Y i as the response. U iA and U iY represent the random cluster effects in the exposure model and the outcome model respectively.

Approaches overview
In total, we use six approaches for estimation of the odds ratio. The approaches are summarized here and described in more detail below.  3. An inverse probability weighting (IPW) approach, with an exposure model that uses logistic regression with feedlot as the random effect to estimate the probability of exposure (propensity score) for each individual. The inverse of the probability of exposure is then used to weight each individual in the estimation of the marginal odds ratio.

4.
A modified version of the IPW approach that addresses concerns about unmeasured cluster-level confounders called inverse conditional probability weighting (ICPW), which is used to deal with cluster effect by stratification. For this approach, the exposure model uses a conditional logistic regression to estimate the probability of exposure for each individual; however, the odds ratio estimated from the outcome model is still the marginal odds ratio.
5. IPW with a doubly robust estimation approach. In section 2.7 we provide more detail about the rationale incorporating the doubly robust estimation.
6. ICPW with a doubly robust estimation approach.

Outcome model
The outcome model has a binary response, and the exposure alone or together with the covariates are the explanatory variables with either binary or continuous form. The set of covariates are technically all the potential individual-level confounders required to provide an unbiased estimate of the exposure-outcome relationship. We make this assumption about the covariates but we will not repeat it later in the text. Further, for the BRD dataset used in our application has a clustered structure due to multiple feedlots, the most common way of analysis is to fit a generalized linear mixed model with logit link and a random effect for the cluster variable as follows: where β = (β a , β 1 , β 2 , . . ., β k ) 0 are the effect of the explanatory variables on the outcome on logit scale, and U iY � Nð0; s 2 Y Þ is the random cluster effect. The estimator of the effect of exposure on the outcome is based on the risk in each exposure group calculated as expitðb a þ P k p¼1 b p X ij;p Þ for A ij = 1 and expitð P k p¼1 b p X ij;p Þ for A ij = 0, where the "expit()" is defined as expitðzÞ ¼ expðzÞ 1þexpðzÞ for any variable z. The "expit()" is also called the inverse logit function. Note that this estimate is different from the routine exp(β) estimator employed very frequently, which is an estimate of the conditional effect of exposure on the outcome.

Exposure model
The exposure model is fitted to estimate the probability of exposure given the covariates, which is also called the propensity score [17]. It is for this reason we refer to this as the exposure model. The estimated probability of exposure will be used to compute the inverse probability weights for each individual. Similar to the outcome model, we also fit a generalized linear mixed model with logit link and a random cluster effect: where α = (α 1 , α 2 , . . ., α k ) 0 are the effects of the covariates on the exposure of interest on logit scale, and U iA � Nð0; s 2 A Þ is the random cluster effect. From this model we obtain the estimated probability of exposure as the second line of Eq 2, which will be used to weight each individual before estimating the marginal odds ratio.

Assumptions and estimates for IPW approach
Here we very briefly present the formula and assumptions for the IPW estimator. For a more through description of the assumptions and formulas of IPW, readers are referred to Chapters 2 and 3 in Causal Inference by Hernan and Robins [1].
There are three assumptions required in order to obtain an unbiased IPW estimator [1]: IðA ij ¼ aÞ for all i and j, which means that an individual with observed exposure A equal to a, has observed outcome Y equal to its potential outcome Y a .

Conditional exchangeability: fY
U iA for all i and j, which means that the risk of the outcome under the potential exposure level a among the exposed is the same as the risk under the potential exposure level a among the unexposed.

Positivity: For all i,
P n i j¼1 a ij 6 ¼ 0 or n i , and 0 < P(A ij = a ij |X i , U iA )<1 for all i and j, where a ij = 0 or 1, which means that the conditional probability of being under every exposure level is greater than 0 and less than 1.
Given the validity of these assumptions in the dataset, the IPW estimator gives the average causal effect of exposure on the outcome by creating a pseudo-population by weighting each individual according to the inverse of the probability of exposure. In the pseudo-population the exposure and the measured confounders are independent. In lay terms, the IPW estimator provides an estimate of the risk of the outcome in each exposure group. This is achieved because the population is balanced with respect to confounders due to weighting by the inverse probability of exposure [18]. More formally, this can be summarized as follows. The risk of the outcome in the exposure group A = a is estimated by the inverse probability of the exposure weighted mean of the outcome Y when exposure A = a is given. The risk of disease under exposure is estimated byP 1;IPW and the risk of disease under no exposure is given byP 0;IPW , which can be expressed as the Eq 3.
where the weights for the IP estimator on the denominator, PðA ij ¼ 1jX ij ;Û iA ;αÞ, are obtained with the random cluster effect exposure model in Eq 2. In the case where the weights have extreme value, IPW may lead to biased estimate of the causal effect. Instead, stabilized IPW have been proposed to tackle these extreme values and provide unbiased estimation [19]. In the BRD dataset used in our application, the range of the estimated conditional probabilities of exposure to BVDV is from 0.19 to 0.98, thus using IPW is sufficient. As can be seen in the equation above, the IP weights are in the denominator of the estimator formula, hence the term inverse probability weighting. It is this process that creates the pseudo-population that balances the confounder distribution across the level of the exposure variable A, makes the confounders and the exposure independent and enables estimation of the average causal effect [17]. The average causal effect of the binary exposure A ij on the binary outcome of interest Y can be estimated using the odds ratio asP 1;IPW =ð1ÀP 1;IPW Þ P 0;IPW =ð1ÀP 0;IPW Þ . If the estimated odds ratio covers 1 (i.e. P 1;IPW ¼P 0;IPW ), it suggests that exposure A does not have an average causal effect on outcome Y in the population [1]. One potential limitation of IPW approach is that there should be no unmeasured confounders, including the cluster-level confounding variables. However, the inverse conditional probability weighing (ICPW) can provide a robust estimation for a case with unmeasured cluster-level confounders.

Inverse conditional probability weighting (ICPW)
In veterinary populations, clustering is common and the ability to adjust for cluster-level variables is limited; therefore this condition may not be guaranteed. For example, in the cattle population such as we use for the BRD example, an unmeasured cluster-specific factor such as surrounding environment of the farm, which likely impacts the occurrence of BRD (the outcome Y), is also likely associated with the covariates such as animals' health status. Inclusion of the cluster-level variable as a random effect may not be sufficient to control for the confounding effect. In this circumstance, the inverse conditional probability weighting (ICPW) approach proposed by He [20] can solve this issue by using the sufficient statistic for the unmeasured cluster-level confounder (denoted as V i ). For extensive discussions about either the concept of sufficient statistics or the ICPW, we refer the reader to other literature [20,21]. However, for completeness, we present the assumption and formula and briefly discuss the difference from the IPW estimator in the supporting information section.
In the ICPW approach, we fit a conditional logistic regression as the exposure model (conditional on V i , which is the cluster-specific variable). This approach treats the cluster as a stratifying variable rather than a random effect, and we obtain an estimate of the exposure probability conditional on the cluster. The exchangeability and positivity assumptions mentioned in Section 2.4 need to be modified due to the usage of the sufficient statistic of V i , and an assumption regarding existence of the sufficient statistic is also required. More details are provided in the supporting information.
The major difference between the IPW and ICPW is that we construct the probability of exposure (the propensity score) A ij conditional on individual-level covariates X ij and the sufficient statistic of the cluster-level covariates V i (denoted as S i = S(A i )), which is a function of A i ¼ ðA i1 ; A i2 ; :::; A in i Þ. This is done using a conditional logistic regression for the exposure model. Formally, this is as follows in Eq 4, let a i be the observed value of A i , the exposure model for ICPW estimator can be fitted with a conditional logistic regression for all individual j in cluster i as follows: For the ICPW estimator, we useP a;ICPW in Eq 5 to estimate the risk of disease under exposure level A = a. For a binary exposure, A ij the estimators are as follows: Letα be the conditional maximum likelihood estimator that maximizes the joint conditional likelihood.

Doubly robust
The doubly robust approach has two modeling components and produces a consistent effect estimator if at least one of the two component models is correctly specified and assuming that there are no unmeasured confounders. In other words, it gives us a second chance to correctly specify at least one of the models. The first component is the exposure model, which could either be a random effect logistic regression model as Eq 2 or a conditional logistic regression model as Eq 4. The second component is the outcome model in Eq 1. The formula for the doubly robust estimator is provided by Cao et al. [22]. We useP a;DR to represent the doubly robust estimators for the risk of disease under exposure A = a. For the binary exposure variable A ij , the estimators are as follows: where "PS" is the probability of exposure (propensity score) obtained from the exposure model. m 0 , m 1 are the risks of the outcome in each exposure group obtained from the outcome model. For the IPW with a doubly robust estimation approach, the "PS" part is PðA ij ¼ 1jX ij ;Û iA ;αÞ for a random cluster effect exposure model shown as Eq 2. For the ICPW with a doubly robust estimation approach, the "PS" part is the conditional logistic regression model PðA ij ¼ 1jX ij ; S i ;αÞ shown as Eq 4. For the outcome model part, where m 0 ðX ij ;Û iY ;βÞ is the estimated risk of the disease outcome under no exposure, and m 1 ðX ij ;Û iY ;βÞ is the estimated risk of the disease outcome under exposure, which was shown as Eq 1.

Infectious causes of bovine respiratory disease data application
In the Australian-based National Bovine Respiratory Disease Initiative (NBRDI) data, BRD is the outcome and BVDV serology upon arrival at the feedlot is the exposure of interest (BVDV induction). Age, weight, mix history, persistently infected (PI) group, and BVDV vaccination are the five individual-level covariates, which may confound the relationship between BVDV induction and BRD. Feedlot is the cluster identifier. Fig 1 presents the relationship among all the variables through a directed acyclic graph (DAG). A detailed description for this dataset and the proposed DAG can be found in Chapter 4 and Chapter 11 in Hay 2015 [5]. The original NBRDI data has 35,131 animals with the BRD incidence rate as 0.176. After matching data from vendor questionnaire with serology results using animal identifier, the sample size was reduced to 2,272 with BRD incidence rate increased to 0.528. Observations with missing values in variables age, mix history, and BVDV vaccination were deleted. We also deleted observations from one feedlot with only two animals, which routinely backgrounded small groups of animals for extended periods. The final subset of data that we use has 1,552 animals nested within 10 feedlots. The BRD incidence in our final dataset is 0.537 which is higher than the prevalence from the original NBRDI. As mentioned, the association observed in the complete NBRDI data has been described extensively previously [3][4][5]; therefore, our focus is not an extensive reanalysis of the data or making any clinical inference but rather as a demonstration. To distinguish the subset we use from the NBRDI data, we refer to our dataset as the Subset-BRD dataset. Table 1 includes the detailed information of the variables we used in the application. Table 2 shows the descriptive statistics for the Subset-BRD dataset within each category of prior BVDV exposed and non-exposed groups, where means and standard deviations are reported for continuous variables and proportions are reported for categorical variables. Table 3 shows the estimated parameters in the outcome model and the random effect exposure model for our subset of 1,552 study subjects.
The risk of BRD with and without BVDV exposure from the Subset-BRD dataset, P 0 and P 1 , were estimated by six different approaches (Approaches overview) as shown in Table 4. For each model, the estimated odds ratios were also obtained based onP 1 =ð1ÀP 1 Þ P 0 =ð1ÀP 0 Þ . The standard error corresponding to each point estimate was obtained based on 500 bootstrap replicates from stratified sampling within each feedlot. Note that the most recognizable estimator of the conditional odds ratio (exp(β)) is not reported. That approach to estimation of the conditional odds ratio for the MOM would result in an odds ratio estimate of 0.5922 or exp(−0.5239). Using the P k p¼1bp X ij;p þÛ iY Þ. The outcome model approach includes "univariable outcome model (UOM)" and "multivariable outcome model (MOM)". The MOM is a commonly used approach to modelling risk factor data such as these and is often employed in veterinary epidemiology. The weighting approach includes "IPW" and "ICPW", where the "IPW" fits a random feedlot effect exposure model and weights each individual using the inverse probability of exposure. The "ICPW" fits a conditional logistic regression with feedlot as the stratum, and the weight for each individual is calculated using the inverse probability of exposure conditional on the covariates and sufficient statistic for the feedlot effect. The doubly robust approaches include both "DR-IPW" and days before day 0 b Feedlot is an intensive commercial unit in a single location involved in the finishing stages of beef production c Group is an identifier used for animals within a cohort that were together on a given day prior to induction d Cohort comprises all study animals grouped together in a pen following their animal-level induction, and cohorts were nested within feedlots https://doi.org/10.1371/journal.pone.0233960.t001 "DR-ICPW". The outcome models for both doubly robust approaches are the same as the multivariable outcome model, and the exposure model components are different (IPW and ICPW). What is apparent from evaluation of the results in Table 4 is that the different estimation approaches provided quite different estimates of the marginal odds ratio. The UOM and MOM are methods that use conditional exchangeability in subsets defined by the confounding variables to estimate the effect of prior BVDV exposure on BRD in those subsets only, while instead we are interested in the marginal causal effect which can be quantified as the marginal odds ratio. If we limit our discussion to the estimate from the MOM model compared to the estimates from the weighting for the doubly robust methods, we see different average causal effects estimated from the models. The difference in estimates is apparent in P 0 and P 1 . The analysis of the Subset-BRD data provided evidence of differences. However, because the true odds ratio is unknown, it is impossible to know which approach is the least biased. Therefore, we used simulation where the true values of the parameters are known, to illustrate that both the weighting and doubly robust approaches have advantage over the outcome models with reduced bias.

Simulation study
Consistent with the objective of the manuscript, to document advantage of the weighting and doubly robust methods, here we present the simulation study and the performance of the six approaches used to analyze the Subset-BRD data. There are four different simulation scenarios. Under each scenario, we simulated the potential outcome Y a ij , exposure A ij , covariates X ij and feedlot ID. The observed outcome, Y ij , can be calculated from the potential outcome and exposure as ij . Let P a be the averaged potential outcome probabilities when A ij = a for all samples. P a ¼ 1 N P I i¼1 P n i j¼1 PðY a ij ¼ 1Þ, for a = 0 or 1. For each simulated dataset, we treated P a as the true outcome probability under exposure A ij = a. The true odds ratio (OR) was calculated as OR ¼ P 1 =ð1À P 1 Þ P 0 =ð1À P 0 Þ . For each approach, we calculated the average bias and root mean square error (RMSE) for each point estimate over 100 simulated datasets. The average true P 0 , P 1 and odds ratio for 100 simulated datasets were also reported for comparison. In the following sections, we compare the performance among the outcome model approach, the weighting and the doubly robust approaches, and as well as the approaches within each approach-category. Scenario 1. The goal for scenario 1 was to create a simulation setting that mimics the realistic data structure so the pattern of the realistic data estimates observed in Table 4 could be explained and compared with the true outcome probability P a . This dataset includes cluster (feedlot) sizes that are the same as the Subset-BRD data, which is not balanced and range from 16 to 418 animals. In this simulation, the value of the covariates (age, weight, mix history, and PI group) X ij were directly taken from the Subset-BRD dataset.Û iY andÛ iA were the random effects estimated from the outcome model and exposure model shown in Eqs 1 and 2 respectively. We then assigned exposure to the study units, and the exposure assignment mechanism was PðA ij ¼ 1Þ ¼ expitð P k p¼1α p X ij;p þÛ iA Þ, whereα ¼ ðâ 1 ;â 2 ; :::â 5 Þ were the parameters estimated in the random effect exposure model shown in Table 3. For each animal, there were two potential outcomes generated, :::;b 5 Þ were the parameters estimated from the outcome model shown in Table 3. The total sample size is 1552, the same as the Subset-BRD dataset. The result of scenario 1 is shown in Table 5. What we see in Table 5 is that the mean of the true risk of disease in the exposed animals over 100 simulated datasets is 0.5007 and 0.5931 for the unexposed animals. The mean of the true odds ratio for the 100 simulated datasets is 0.6890. We can also see that both the weighting and doubly robust approaches have smaller bias than the outcome model approach, and overall doubly robust approaches perform the best. This illustrates the advantage of the causal inference approaches in estimation of a marginal causal effect. Scenario 2. In Scenario 1, the cluster size was unbalanced. However, it might be of interest to know if the outcome model approach performs better when clusters are balanced. The purpose of the scenario 2 is then to evaluate all approaches under a balanced cluster (feedlot) size setting, where each feedlot has the same number of animals. We ensure the simulated datasets in this scenario to be close to the Subset-BRD dataset in size by sampling 155 animal ID's from each of the 10 feedlots with replacement. Then we matched the corresponding covariates X ij and the estimatedÛ iA andÛ iY from the Subset-BRD data. The exposure variable, A ij , and potential outcomes, Y 0 ij and Y 1 ij , were also simulated following the same mechanism as scenario 1. The total sample size is 1550. The result of scenario 2 is shown in Table 6. Again, the advantage of weighting and doubly robust approaches is evident even when the cluster size is balanced. The outcome model approach has an estimated OR of 0.4098, while the true OR is 0.711, and the closest other method is DR-ICPW with 0.7342. Scenario 3. In scenario 3, the aim was to evaluate the performance of these approaches in a more general setting to show that the patterns of estimation among the approaches were not caused by the choice of covariates or random effects. Instead of taking X ij or estimating U iA and U iY directly from the Subset-BRD data, we simulated these values according to the distributions in the Subset-BRD data. Table 7 shows the detailed distribution information about simulated X ij , where the parameter values were chosen based on the covariate distributions in the Subset-BRD data. The feedlot-level random effect in the exposure model is U iA � Nð0; s 2 A Þ. The feedlot-level random effect in the outcome model is U iY � Nð0; s 2 Y Þ. Both σ A and σ Y can be estimated by the standard deviation of the random effects in the exposure model and outcome model fitted with the Subset-BRD data. Again, A ij , Y 1 ij and Y 0 ij followed the same simulation assignment mechanism as in scenario 1. The total sample size is 1550. The result of scenario 3 is shown in Table 8. We do not devote much text to the result of the simulation, as it is consistent with the prior scenarios, the least bias estimates are associated with doubly robust estimation methods, although the bias in the outcome regression models is less than in the two prior simulation scenarios. Scenario 4. Scenario 4 is very similar to scenario 3 changing only the distribution of the PI group (X4) variable in order to have a stronger confounding variable. From the parameter estimates and the corresponding p-values in Table 9, we can see that PI group variable is contributing to both the outcome and random feedlot effect exposure model significantly. PI group follows a multinomial distribution with 4 levels, and the estimates of the first two levels are quite different from the remaining levels. In scenario 3, we used the original distribution of the PI group in the Subset-BRD data, in which the proportions of having PI group level 1 and level 2 are quite small. Now in scenario 4 we simulated PI group from a multinomial with p 1 = 0.4, p 2 = 0.4, p 3 = 0.1, p 4 = 0.1 by increasing the proportions of level 1 and level 2 to make PI group to be a stronger confounder. The result of scenario 4 is shown in Table 10. Again, the advantage of the weighting approach, and in particular with the doubly robust approach is evident, with the least bias associated with DR-IPW. Doubly robust sensitivity analysis. One of the anticipated difficulties associated with doubly robust approach is the need to assess the different impact from the misspecified model of the exposure used to estimate the inverse probability weight or the misspecified outcome model used to estimate the m 0 , m 1 . By "misspecified model" we mean changing the link function we used in the logistic regressions (e.g. logit). Therefore, to assess the performance of doubly robust approaches and compare them with the other approaches under the one-modelmisspecification situation, we performed a sensitivity analysis following the setting in scenario 3 by deliberately misspecifying the model.
To create the outcome model misspecification setting, we used a logit link for the outcome model, but Y 0 ij and Y 1 ij were actually simulated from binary distributions with cloglog link function. Similarly, to create the exposure model misspecification setting, we used a logit link for the exposure model, but A ij was actually simulated from a binary distribution with cloglog link function. Tables 11 and 12 show the results of all approaches under the outcome model misspecification situation and the exposure model misspecification situation respectively.

Summary discussion of simulations results
All the scenarios indicate that the doubly robust approaches have the best performance for consistency and stability overall, where the difference between DR-IPW and DR-ICPW are very small. The sensitivity analysis result suggests that the doubly robust approaches also have the best performance when either the outcome model or the exposure model are incorrectly specified. When the outcome model is incorrectly specified, the performance of the weighting approach should not be influenced. However, in Table 11 the RMSE's for the doubly robust approaches are still smaller than the weighting approach. Similarly in Table 12, the doubly robust approaches outperform the weighting approaches when the exposure model is incorrectly specified.

Discussion and conclusions
In this paper, we aimed to document the advantages of using weighting and doubly robust approaches to estimate the average causal effect of an exposure on the outcome. Our rationale was that although the methods have been available for decades, these approaches, which are less biased, appear to be infrequently used in veterinary epidemiology. If the goal of research is to obtain the least biased estimation of causal effect, then it seems reasonable that epidemiologists will employ methods as suggested. The approaches we recommended here have been documented previously [18,23,24], although perhaps not so explicitly with realistic veterinary data.
In an observational study, IPW, ICPW and doubly robust estimation are useful in estimating the causal effect of the exposure when there are confounders involved. IPW adjusts for confounders by creating a pseudo-population where the measured confounders and exposure are independent. ICPW is robust for clustered data when the cluster-level confounders are not measured. Doubly robust estimation is a combination of traditional outcome model and exposure model (IPW or ICPW) idea, which provides stable and consistent estimates if at least one of the outcome model or exposure model are correctly specified. As the true relationship among exposure, outcome, and confounders are rarely known, the doubly robust estimation has the advantage in both stability and consistency in estimation.
When compared to the traditional outcome model approach in the application to the NBRDI data, the results from IPW, ICPW and doubly robust estimation showed considerable amount of difference in the estimated effect of the exposure on outcome. Simulation studies mimicking the Subset-BRD dataset revealed that the IPW, ICPW and doubly robust estimation methods are superior to the traditional outcome model approach in both bias and precision of estimation. Further simulation studies showed that the doubly robust methods are robust to model misspecification and is thus the recommended approach.