A method of back-calculating the log odds ratio and standard error of the log odds ratio from the reported group-level risk of disease

In clinical trials and observational studies, the effect of an intervention or exposure can be reported as an absolute or relative comparative measure such as risk difference, odds ratio or risk ratio, or at the group level with the estimated risk of disease in each group. For meta-analysis of results with covariate adjustment, the log of the odds ratio (log odds ratio), with its standard error, is a commonly used measure of effect. However, extracting the adjusted log odds ratio from the reported estimates of disease risk in each group is not straightforward. Here, we propose a method to transform the adjusted probability of the event in each group to the log of the odds ratio and obtain the appropriate (approximate) standard error, which can then be used in a meta-analysis. We also use example data to compare our method with two other methods and show that our method is superior in calculating the standard error of the log odds ratio.


Background
Many randomized controlled trials (RCT) are conducted in livestock production facilities. In these studies, although the animals are individually randomized to treatment group, the treatments groups are housed together in units. These housing units can be referred to variably as pens, houses, barns, rooms, pastures, villages, etc. and differ in name based on the species and country. Regardless of the name, housing is associated with a commonality of ration and environment that very often impacts important outcomes such as mortality, morbidity and growth. In these livestock populations, the housing unit is therefore considered a source of dependency in the outcome. To account for this dependency, it is often recommended that a random effect for the housing unit be incorporated into the model estimating the intervention effect size. Similarly, in observational studies, dependency in the outcome of animals housed together is commonly adjusted for by inclusion of a random effect for housing (pen, barn, house, etc.) [1][2][3][4]. This dependency in the outcome is sometimes also called clustering and readers interested in a more detail explanation of dependency of outcomes in livestock populations are directed to other publications [2]. The need to adjust for housing effects has implications for the reporting of study effects. For example, the REFLECT statement, which is a guideline for reporting clinical trials in livestock production, suggests that authors report "For each primary and secondary outcome, a summary of results for each group, accounting for each relevant level of the organizational structure, and the estimated effect size and its precision (e.g. 95% confidence interval)". The rationale for requesting group-level risk information, in additional to relative estimates, is that risk is easier to interpret for end users, compared to adjusted relative estimates, which are more useful to meta-analysis. Group-level estimates of disease risk can be obtained using standard statistical analysis software. For example, in the GLIMMIX procedure from SAS (version 14.1), the group-level estimates of disease risk are obtained by applying the inverse link function to the least-squares means estimates reported in the outcome [5]. For studies that use random effects models, the effect sizes should all be adjusted estimates.
Interestingly, in a recent review of veterinary clinical trials we noted that several authors of clinical trials in livestock production chose to report only the adjusted group-level estimates of disease risk and corresponding 95% confidence intervals [6,7], not a relative effect size (odds ratio or risk ratio). To illustrate, Schunicht et al. (2002) in a clinical trial of antibiotics to control undifferentiated fever in feedlot cattle reported using the following approach to analysis "The animal health variables were compared between the experimental groups by using linear logistic regression modeling techniques controlling for within-pen clustering of disease by using the method previously described (24) and reviewed (25,26)" [6]. The authors then reported the initial undifferentiated fever risk as 23.17% for the oxytetrcycline group and 18.32% for the tilmicosin group with a common standard error of 1.59%. The authors also reported the p-value of 0.046 for the treatment effect, however the odds ratio was not explicitly reported [6].
For meta-analysis of adjusted effect sizes the log odds ratio is often used. When authors only report the adjusted disease risk per group it is necessary to convert the group-level risk back to a log of the odds ratio (also called the log odds ratio). However, we were unable to find guidance for converting the risk estimates to the log odds ratios in standard meta-analysis texts [8][9][10]. In personal communications with researchers who work on similar topics and have encountered the same issue, it appears that at least two approaches have been used. We call these naive approaches because they do not appear to be based on any proposed published methods. Here, we propose an approach to obtaining estimates of effect size (log odds ratio) and its standard error (SE) using only the point estimate of the risk of disease in each treatment group and the significance test results (p-value) of the treatment effect size. We compare our proposed method to two naive approaches described to us by other review teams.

Methods
In the following section, we first describe the random effects model on which we based our further analysis. We then propose a novel method to transform the adjusted group-level risk back to the log odds ratio averaged across random effects and present a method to approximate its standard error. The adjusted group-level risk is on a probability scale between 0 and 1; if a percentage is reported it can be converted to the probability scale first before apply our proposed method for transformation.

The adjusted group-level risk obtained from a generalized linear mixed model(GLMM) for binomial outcome data
We consider a random effects model with two treatment groups, both shared with J blocks. Let n ij be the number of experimental units that received treatment i(i = 1, 2) in block j(j = 1, . . ., J) and r ij be the number of events. To compare the performance of these treatments and account for dependence, one can model the probability of treatment effect, π ij , by using the generalized linear mixed model with a logit link function. The model is: where γ j denotes the random block effects of block j and s 2 g is the variance of the random block effects. I is the indicator function.
Different estimation methods have been proposed to maximize the likelihood function of the GLMM. In the lme4, Zelig and glmmML packages from R (version 3.5.2), the estimation methods that can be selected are the Laplace approximation developed by Daniels (1954) [11], the Barndorff-Nielsen and Cox approach (1979) [12], and the Gauss-Hermite quadrature approach. Monte Carlo likelihood approximation is used in the glmm package to find the Monte Carlo maximum likelihood estimates for the fixed effects and the variance components [13]. For testing the significance of any fixed treatment effects, the test statistic under the null hypothesis in these R packages approximates a normal distribution. In SAS 14.1, the default method of parameter estimation for models with random effects in GLIMMIX is the restricted pseudo-likelihood method as demonstrated by Wolfinger and O'Connell (1993) [14]. However, comparing these estimation methods is not the aim of this paper. Our goal is to document an approach to converting the adjusted group-level risk of the outcome to the adjusted log odds ratio averaged across the random effects estimate regardless of the estimation method used. For authors who use SAS for analyses, the adjusted group-level risk is often called the least squares means on the probability scale (LSMEANS statement) in the default SAS outputs format. It is:Ê

Converting the adjusted disease risk back to a log odds ratio
For conducting meta-analysis, we need to transform the estimated group-level risk, written here on the probability scale (0-1)Êðp ij jg j ¼ 0Þ back to a log odds ratio estimateb 1 and its standard error. SupposeÊðp 1j jg j ¼ 0Þ andÊðp 2j jg j ¼ 0Þ were reported, then the point estimate of β 1 is given by the standard formula: To obtain the standard error (SE) ofb 1 , we use the p-value (p) for testing the significance of the fixed effect,b 1 and assume the normality of the test statistic of β 1 . Then we have: where z 1−p/2 is the 100 × (1 − p/2)% quantile of the standard normal distribution. Note that for testing the significance of the fixed treatment effect in some software packages (GLIMMIX in SAS, hglm in R), the test statistic under the null hypothesis approximates a t-distribution. Although there are several methods available of determining the degree of freedom for the approximate t-distribution, containment is the default method in GLIMMIX if the model contains random effects. If degrees of freedom are reported in the original studies, then we can use t-distribution as the reference distribution.
For the case where there are J blocks within each treatment group i, the model is: where γ ij is the random block effects of block j within treatment i. The method proposed above also applies here.

Two naive methods
As discussed, we were unable to find guidance as to how to transform the group-level adjusted risk, and therefore we evaluated the three possible approaches we have identified. The two naive ways are the relatively simple approaches suggested based on personal communications with other review authors. We present these for comparison's sake; however, readers should be aware that we could find no citations recommending these approaches, hence we label them as naive. Suppose we have an RCT with two treatments t 1 and t 2 . Let n 1 and n 2 denote the total number of enrolled animals in the two treatment groups. p 1 and p 2 denote the adjusted group-level disease risk calculated using equation 1, while l 1 and l 2 are the lower 95% confidence limits, and u 1 and u 2 are the upper 95% confidence limits. One suggestion was to transform the adjusted disease risk and the confidence intervals directly. Letb 1 be the point estimate of the log odds ratio, which is obtained as follows: Then, the estimates of lower limits and upper limits ofb 1 are obtained as followŝ Then we can calculate the standard error ofb 1 by dividing the range of the confidence interval by the 97.5% quantile of the standard normal distribution as follows: SEðb 1 Þ ¼û Àl 2z 0:975 : Another proposed method was to convert probabilities to raw frequency data based on the number of study units and then calculate log odds ratio and standard error using those frequency data. This method ignores the random block effects and the formula for the standard error is obtained by using delta method. The estimates are: b 1 ¼ log ðp 2 n 2 Þðn 1 À p 1 n 1 Þ ðp 1 n 1 Þðn 2 À p 2 n 2 Þ � � ; SEðb 1 Þ ¼ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffiffi ffi ffi ffi ffi ffi ffi 1 p 1 n 1 þ 1 n 1 À p 1 n 1 þ 1 p 2 n 2 þ 1 n 2 À p 2 n 2 r :

Simulations
In this section, we conducted simulations to evaluate the performance of the proposed backcalculation method and the two naive methods. The simulation settings varied in the sample size, number of blocks, magnitude of the variance of the random effects and the value of true log odds ratios. With the combination of the above factors, there were 16 different settings.
For each setting, we generated 10,000 data sets to compute the standard error (SE) of the estimated log odds ratio. The mean absolute error (MAE) of the SE is the average, across 10,000 simulations, of the absolute difference between the SE back calculated by our proposed method and the true estimate of SE.
In this simulation, we fixed β 0 = −1.5 and the mean standard errors were obtained by taking the average of the standard errors of the estimated log odds ratio across 10,000 simulated data sets. The simulation results, shown in Table 1, indicate that our proposed methods performed better than the other two approaches in all the scenarios above. The magnitude of the MAE of proposed method is smaller than other two methods in most of the cases. The magnitudes of the MAEs of the two naive methods are not negligible compared to the mean standard error.

Examples
Here, we provide examples of three possible conversion approaches using two data sets. The first data came from a 2-arm RCT conducted in swine population at the Department of Veterinary Diagnostic and Production Animal Medicine in Iowa State University. This data can be found on https://github.com/dapengh/lor-est-dataset. The objective of this study was to compare the efficacy of two interventions administered at the individual level in terms of the risk of a pig being treated or not. In this data, the random effects of different pens nested within different rooms create dependency that should be accounted for in this model. Table 2 shows the adjusted probability of being treated and the 95% confidence limits. The p-value of testing significant differences between the two groups is 0.0000299. Table 3 shows the comparison of two naive methods with the method proposed and true estimation results. All three methods provide an accurate point estimate of log odds ratio, while our proposed method outperforms the naive methods in terms of estimation of the standard error. The difference in the estimate of the standard error for each method and the true standard error are also reported. The true estimate is the value obtained from fitting the generalized linear mixed model in statistical software packages (e.g. lme4 package in R).
Another example is an observational study and the data is a subset of the dataset "cbpp" from the R package lme4. The data description link is available at https://cran.r-project.org/ web/packages/lme4/lme4.pdf. The aim of these data are to study the within-herd spread of Contagious Bovine Pleuropneumonia (CBPP) at different time periods (period 1 and 2) in infected herds. In this example, cattle from multiple districts are studied, and therefore districts create dependency that should be considered as a random effect in this model, much the same way that housing units would create dependency in the swine RCT. Table 4 shows the adjusted disease risk of the period effect and the 95% confidence limits. The p-value of the period effect is 0.00127.
As with the RCT example, this observational study (Table 5) shows that the point estimates of the log odds ratio are accurate for all approaches but the standard error of our proposed method has the smallest difference from the estimate given by the software results. The extent Table 2. Adjusted group-level risk of disease and 95% confidence intervals reported on the probability scale (0-1) for RCT data using results from a generalized linear model (lme4 package). The model contains a fixed effect for treatment and a random effect for pens (n = 24) nested within rooms (n = 2).  Table 3. Estimates of Log odds ratio and its standard error corresponding to the RCT data in Table 2  of the difference is smallest using our proposed approach and therefore we propose this method.

Discussion
The aim of this paper is to help researchers extracting results from studies that only report the adjusted disease risk of each arm and the significance test results of the effect size and backtransform these to the log odds ratio averaged across a random effects scale. Our rationale for sharing this information is that this approach to reporting was more common than we expected in livestock reviews. For example, in one recent review, at least 10 of 75 relevant studies used this approach to reporting. Without a method to accurate estimate the log odds ratio and its standard error, the results of such studies might be excluded from systematic reviews and contribute to research wastage. We should note, that for parameter estimation in a generalized linear mixed model, different software has different estimation methods. These estimations methods although of interest, are not relevant here as our method seeks to convert the adjusted group-level disease risk reported on a probability scale (i.e., between 0 and 1) to the log odds ratio no matter what estimation method is used. One potential limitations of our approach is requirement that the p-value of the treatment is reported. If a study contains more than two treatments, this method is also valid if corresponding p-values are reported.