The authors have declared that no competing interests exist.
Rapid growth in scientific output requires methods for quantitative synthesis of prior research, yet current meta-analysis methods limit aggregation to studies with similar designs. Here we describe and validate Generalized Model Aggregation (GMA), which allows researchers to combine prior estimated models of a phenomenon into a quantitative meta-model, while imposing few restrictions on the structure of prior models or on the meta-model. In an empirical validation, building on 27 published equations from 16 studies, GMA provides a predictive equation for Basal Metabolic Rate that outperforms existing models, identifies novel nonlinearities, and estimates biases in various measurement methods. Additional numerical examples demonstrate the ability of GMA to obtain unbiased estimates from potentially mis-specified prior studies. Thus, in various domains, GMA can leverage previous findings to compare alternative theories, advance new models, and assess the reliability of prior studies, extending the meta-analysis toolbox to many new problems.
Normal science progresses when scientists build on prior research to extend, test, and apply theories of biological, physical, and social phenomena [
The current approach to quantitative aggregation of prior research uses various meta-analysis techniques [
Despite these limitations, the rapid growth of scientific literature has promoted increasing applications of meta-analysis. Publications listed in nine major databases (Web of Science Core Collection, MEDLINE, Biological Abstracts, Zoological Records, BIOSIS Citation Index, Data Citation Index, SciELO Citation Index, Current Contents Connect, and Derwent Innovations Index) with the term “meta-analysis” in the title show over 25-fold growth (from 1,247 to 31,314) over the last decade, now reaching tens of thousands annually. Thus, the value of a broader method for quantitative aggregation of prior research can be immense across various disciplines. Consider a few examples.
Over 125 studies in environmental science have analyzed the impact of the pesticide Atrazine on freshwater vertebrates, yet no quantitative conclusion can be drawn in the absence of a method to combine them [
A meta-regression study combines 60 prior estimates of the impact of climate change on human violence [
In energy research, multiple methods exist to estimate diffuse solar energy in a location using data from distant sensors [
In occupational health, at least 10 studies have estimated the effectiveness of workplace-based return-to-work interventions after injury or illness [
In urban planning, a review found 45 published models of municipal solid waste generation [
In obesity research, over 47 separate studies have estimated human basal metabolic rate (BMR) as a function of different body measures, such as fat mass (F), lean mass (L), body weight (BW), age, and height, among others [
In such settings, besides quantitative aggregation of prior findings, a general approach to aggregation could allow researchers to leverage the data previously collected by others to build, test, and compare alternative new theories, and assess the reliability of individual studies.
In this paper, we introduce the Generalized Model Aggregation (GMA) method. GMA uses available summary statistics from prior studies to estimate a meta-model that, when simulated, can replicate those original statistics. GMA provides consistent and reliable estimates, requires few restrictions on the structure of the meta-model or previous studies, can correct for biases in prior studies due to missing variables and model mis-specification, allows for both fixed effect and random effects models, and accommodates statistics for hypothesis testing and model selection. Moreover, the GMA approach relies on few assumptions about model structure and underlying effects, offering transparent and easy-to-understand results even in aggregating heterogeneous prior studies. Therefore, GMA enables quantitative model aggregation, theory building, and theory testing in a wide range of applications that previously relied only on qualitative literature reviews to synthesize existing findings. We demonstrate the GMA method in multiple simulated scenarios and an empirical validation.
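The matching idea described above can be illustrated with a minimal Python sketch. It is not the MATLAB code released with this paper, and every name, parameter value, and study design in it is a hypothetical chosen for illustration. Two stand-in "prior studies" each omit one explanatory variable and report only an intercept and a slope. The sketch then searches for the meta-model parameters whose simulated signatures best match the reported ones in squared error, assuming the correlation between the explanatory variables in each study is known (for example, from a reported covariance matrix). A plain Gauss-Newton search is used here only as a convenient optimizer.

```python
import random

def simple_ols(x, y):
    """Intercept and slope of a one-predictor least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return [my - slope * mx, slope]

def draw_inputs(n, rho, seed):
    """Hypothetical sample of two explanatory variables with correlation rho."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rho * a + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for a in x1]
    return x1, x2

def run_prior_study(beta, rho, used, n, seed):
    """Stand-in for a published study: noisy data, one regressor omitted."""
    b0, b1, b2 = beta
    x1, x2 = draw_inputs(n, rho, seed)
    rng = random.Random(seed + 100)
    y = [b0 + b1 * u + b2 * v + rng.gauss(0, 0.5) for u, v in zip(x1, x2)]
    return simple_ols((x1, x2)[used], y)

# Simulated explanatory variables for each study (raw study data are NOT used);
# the second entry says which variable that study regressed on.
SIM_INPUTS = [(draw_inputs(5000, 0.6, seed=1), 0),
              (draw_inputs(5000, 0.2, seed=2), 1)]

def simulate_signatures(beta):
    """Simulate the meta-model and repeat each study's mis-specified analysis."""
    b0, b1, b2 = beta
    sig = []
    for (x1, x2), used in SIM_INPUTS:
        # Noise is omitted: OLS coefficients are linear in y, so zero-mean
        # noise does not change their expected value.
        y = [b0 + b1 * u + b2 * v for u, v in zip(x1, x2)]
        sig += simple_ols((x1, x2)[used], y)
    return sig

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def gma_fit(reported, beta, steps=3, h=1e-4):
    """Gauss-Newton search minimizing squared signature mismatch."""
    for _ in range(steps):
        base = simulate_signatures(beta)
        resid = [r - s for r, s in zip(reported, base)]
        cols = []  # finite-difference Jacobian, one column per parameter
        for j in range(len(beta)):
            bumped = beta[:]
            bumped[j] += h
            sim = simulate_signatures(bumped)
            cols.append([(s - b) / h for s, b in zip(sim, base)])
        jtj = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
        jtr = [sum(u * r for u, r in zip(ci, resid)) for ci in cols]
        beta = [b + d for b, d in zip(beta, solve(jtj, jtr))]
    return beta

TRUE_BETA = [2.0, 1.5, -0.8]  # known only because the example is synthetic
reported = (run_prior_study(TRUE_BETA, 0.6, 0, 5000, seed=11)
            + run_prior_study(TRUE_BETA, 0.2, 1, 5000, seed=12))
estimate = gma_fit(reported, beta=[0.0, 0.0, 0.0])
print("reported signatures:", [round(s, 2) for s in reported])
print("meta-model estimate:", [round(b, 2) for b in estimate])
```

In this synthetic setting, each reported slope is biased by the omitted variable, yet the four signatures taken together suffice to identify all three meta-model parameters. This works because the two studies differ in how strongly their explanatory variables are correlated.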
The intuition behind GMA is simple and summarized in
Prior studies provide the vector of empirical signatures,
The idea of matching simulated summary statistics against the empirical ones is well-established. For example, it is the basic idea of the method of simulated moments [
Consider a data generating process
In GMA, we estimate the data generating model by utilizing the results of
At the heart of GMA is the idea that if we simulate the true function
GMA estimates share their theoretical underpinnings with the method of simulated moments [
Implementing GMA requires samples of (simulated) explanatory variables,
An efficient estimate of
To validate GMA, we first test it on simulated problems where the underlying data generating model is known. In each case, a true data generating process,
Seven scenarios across five model structures are explored. In Scenario 1 (
a) Estimated parameters of a linear generating process (meta-model)
GMA is similarly effective in three other linear models. In Scenario 2 (
In all four scenarios GMA extracts unbiased estimates of the true data generating process with 95% confidence intervals comparable to or tighter than those in the original studies. Even when prior studies are missing important variables and do not include the true effects in their confidence intervals, their signatures contain information, and GMA can extract that information to yield better estimates (
In Scenario 5, we explore GMA’s ability to infer a continuous nonlinear data-generating process from prior analyses of variance (ANOVA) on categorical data. We use a true data generating process of the form
Scenario 6 explores GMA’s applicability to nonlinear models. Specifically, following a prior study [
The predicted outcome is the fluid leakage rate. Color maps show its expected value under the true data generating process (left) and under each model. Black dots in the two middle charts identify the original data points used to estimate the two linear models. These “raw” data points are not used in GMA estimation, however: only the coefficients of the two linear models (3+2 coefficients) and the two R² terms (7 signatures in total) are used to estimate the nonlinear meta-model (graphed on the right).
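To make the seven-signature vector concrete, the following Python sketch simulates a nonlinear process and re-runs two linear regressions on the simulated data, one with two predictors (three coefficients) and one with a single predictor (two coefficients), and then collects the coefficients and the two R² values. The functional form, parameter values, and variable ranges are hypothetical and are not those of the scenario in the paper. Noise is included in the simulation because R² depends on the residual variance.

```python
import random

def fit_one(x, y):
    """OLS of y on an intercept and x. Returns [b0, b1, R2]."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    sse = sum((c - b0 - b1 * a) ** 2 for a, c in zip(x, y))
    sst = sum((c - my) ** 2 for c in y)
    return [b0, b1, 1 - sse / sst]

def fit_two(x1, x2, y):
    """OLS of y on an intercept, x1 and x2. Returns [b0, b1, b2, R2]."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2          # 2x2 normal equations, solved directly
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    sse = sum((c - b0 - b1 * a - b2 * b) ** 2 for a, b, c in zip(x1, x2, y))
    sst = sum((c - my) ** 2 for c in y)
    return [b0, b1, b2, 1 - sse / sst]

def meta_model(theta, x1, x2, rng):
    """Hypothetical nonlinear process: y = t0 * x1**t1 / x2**t2 + noise."""
    t0, t1, t2, sigma = theta
    return [t0 * a ** t1 / b ** t2 + rng.gauss(0, sigma) for a, b in zip(x1, x2)]

def simulated_signatures(theta, n=2000, seed=0):
    """Seven signatures implied by theta: 3 + 2 coefficients and two R2 values."""
    rng = random.Random(seed)  # fixed seed gives common random numbers across theta
    # Study 1 (assumed design): linear regression on both variables.
    x1 = [rng.uniform(1, 5) for _ in range(n)]
    x2 = [rng.uniform(1, 3) for _ in range(n)]
    study_1 = fit_two(x1, x2, meta_model(theta, x1, x2, rng))
    # Study 2 (assumed design): linear regression on x1 only, different range.
    x1 = [rng.uniform(2, 8) for _ in range(n)]
    x2 = [rng.uniform(1, 3) for _ in range(n)]
    study_2 = fit_one(x1, meta_model(theta, x1, x2, rng))
    return study_1[:3] + study_2[:2] + [study_1[3], study_2[2]]

THETA = (1.0, 1.5, 0.5, 0.3)  # illustrative parameter values only
signatures = simulated_signatures(THETA)
print([round(s, 3) for s in signatures])
```

An estimation routine would call `simulated_signatures` repeatedly for candidate parameter values and compare the result with the seven reported numbers. Fixing the seed keeps the simulated signatures a smooth function of the parameters.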
Finally, in the last scenario we assess GMA’s ability to estimate random effects models and compare it with the standard random effects method that is the workhorse of classical meta-analysis [
In the previous scenarios, we assumed no measurement errors for explanatory variables. However, measurement errors are common and can lead to biased parameter estimates in the prior studies, and thus in GMA results. Therefore, it is important to account for those errors when they are expected to be large. There are multiple methods in the literature for correcting for measurement error in traditional meta-analysis [
Basal metabolic rate (BMR) is the largest component of human energy expenditure and accurate estimates are critical for understanding human metabolism, developing obesity and malnutrition interventions, and identifying patients with metabolic abnormalities, among others. As an empirical validation of GMA, we aggregate prior studies of the determinants of BMR and compare the predictive power of the resulting meta-model with existing models in the literature. We focus on a single population group, white males over 18 years of age. A recent review of the literature [
Alternative Meta-model Estimates | MSC (a) |
---|---|
BMR = 558 + 2.8H + 7.5F + 12L - 3.1A + N(0,170) | 2,676 |
BMR = 851 + 1.1H + 8.7F + 13L - 3A - 3.3BMI (b) | 2,722 |
BMR = 231 + 4.4H + 3.1F + 16.2L - 2.4A + 0.06F² - 0.03L² + N(0,128) | 2,429 |
BMR = -3526 + 3.6H + 11F - 5.8L - 2.6A - 130.4 ln(F) + 1299.3 ln(L) + N(0,136) | |

(a) Model Selection Criterion, MSC = χ0 + 2 dim(β).
(b) Body Mass Index (BMI), a common measure of obesity, is weight divided by height squared.
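As a worked example of how the equations in the table are applied, the Python sketch below evaluates three of them for one hypothetical subject. The units are assumptions not stated in the table (height H in cm, fat mass F and lean mass L in kg, age A in years, BMR in kcal/day), the terms written F2 and L2 are read as squares, and the random N(0, ·) term is dropped so that only the expected value is computed.

```python
import math

def bmr_linear(h, f, l, a):
    """Linear meta-model from the first table row (expected value only)."""
    return 558 + 2.8 * h + 7.5 * f + 12 * l - 3.1 * a

def bmr_quadratic(h, f, l, a):
    """Meta-model with quadratic fat-mass and lean-mass terms (assumed squares)."""
    return (231 + 4.4 * h + 3.1 * f + 16.2 * l - 2.4 * a
            + 0.06 * f ** 2 - 0.03 * l ** 2)

def bmr_log(h, f, l, a):
    """Meta-model with natural-log fat-mass and lean-mass terms."""
    return (-3526 + 3.6 * h + 11 * f - 5.8 * l - 2.6 * a
            - 130.4 * math.log(f) + 1299.3 * math.log(l))

# Hypothetical subject; values chosen only for illustration.
subject = dict(h=180.0, f=15.0, l=60.0, a=40.0)
predictions = {fn.__name__: fn(**subject)
               for fn in (bmr_linear, bmr_quadratic, bmr_log)}
for name, value in predictions.items():
    print(f"{name}: {value:.1f}")
```

Under these assumed units, the three equations give roughly 1,770, 1,851, and 1,802 for this subject. The spread shows how much the choice of functional form can matter for an individual prediction.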
The accuracy of this model is compared against alternatives in predicting BMR in an empirical validation sample of 159 male subjects [
While the validation dataset is not large by the standards of many typical statistical analyses, it is noteworthy in this context. First, the sample is relatively large for this literature, given the costs and complexities involved in good measurements of BMR and Fat Mass. Indeed, we found only 3 prior studies that included larger samples, while the vast majority of prior research used much smaller samples (typically under 50 subjects). Second, we needed a validation sample that was not used in prior published BMR estimation, so that we would not unintentionally bias our own model selection procedure, or contaminate the input data to GMA when using prior studies that had utilized the validation sample. The validation sample we used offered a satisfying resolution to both concerns.
Besides providing more accurate equations without using any individual level data, GMA provides three additional insights. First, it identifies a statistically significant nonlinearity in the change in BMR as a function of
GMA provides a flexible method to quantitatively combine diverse statistical findings. GMA extends meta-analysis to more heterogeneous sets of underlying studies that vary in design and variable operationalization. At the heart of this method is the idea that any statistic reported in prior studies has information about, i.e., a signature of, the data generating process. The development of GMA, a method to piece together many such signatures, further highlights the importance of detailed reporting and replicability of scientific studies. For example, there is much useful information in the covariance matrix of explanatory variables, and reporting it, even if in an online appendix, would be very valuable for other researchers. GMA can facilitate the iterations of theory building and theory testing central to the scientific method. By allowing researchers to formally estimate and compare new models against prior results, GMA provides faster feedback about the usefulness of new models and theories. Moreover, GMA can be used to assess the consistency of new detailed models of a phenomenon against more aggregate empirical findings. For example, detailed models of BMR defined at the body-organ level could be constructed and partially estimated by GMA against prior BMR equations. While domain-specific research informs many disaggregated parameters (e.g., the metabolic rates of brain and liver), GMA could ensure the aggregation of those components into a model that is consistent with overall findings (e.g., BMR change over age and due to weight changes). Such applications promise a more fruitful dialogue between mechanism-based and statistical models. GMA can also help resolve apparent inconsistencies among prior studies. For example, the effect of different measurement methods for similar concepts could be estimated, and inconsistencies due to measurement separated from those due to missing variables or heterogeneity in the underlying samples.
This paper introduces the idea of GMA and motivates future research to apply, elaborate, and expand the method. First, to keep the method general, we used a simulation-based approach that can use any signature with a squared error matching function. More efficient likelihood-based linking functions could be devised for well-behaved subsets of empirical signatures. Second, we offered various methods for generating samples of explanatory variables. Given the importance of these samples as inputs to GMA, future research should seek additional methods, assess the pros and cons of those methods, and investigate the sensitivity of results to various degrees of inconsistency between these samples and those from the data generating process. Third, we showed that GMA does a fine job in estimating traditional random effects meta-analysis models. Systematic comparisons with other meta-analysis methods are another promising area of research. Fourth, a more detailed treatment of measurement error, and of the conditions under which GMA can estimate it, is a promising area for future studies. Moreover, theoretical research can focus on the properties of effective signatures and the minimum set of signatures required for identifying a given model. Finally, the extent of usefulness of GMA will only be known when more empirical applications are conducted. We hope this paper motivates many such applications.
GMA may overcome some of the common challenges faced by meta-analysis by accommodating broad model specifications, including study-specific effects, and potential omitted variable biases [
This document includes three main sections: (1) Supplementary Notes A-K, (2) Supplementary Tables A-L, and (3) Supplementary Figs A-B.
(PDF)
This document presents: (1) instructions on how to read and execute the codes, (2) descriptions about the notations and functions in the codes, and (3) details of the codes.
(PDF)
This zipped file includes GMA codes in ‘.m’ files. The codes are developed using MATLAB.
(ZIP)
We are grateful to David Keith, Yakir Reshef, John Speakman, John Sterman, and Klaas Westerterp and seminar participants at MIT for their feedback on earlier drafts of this paper. Dr. Westerterp also provided the validation dataset that we used for the BMR analysis.