The association between socioeconomic status and pandemic influenza: Systematic review and meta-analysis

Background The objective of this study is to document whether and to what extent there is an association between socioeconomic status (SES) and disease outcomes in the last five influenza pandemics.
Methods/principal findings The review included studies published in English, Danish, Norwegian and Swedish. Records were identified through systematic literature searches in six databases. We summarized results narratively and through meta-analytic strategies. Only studies for the 1918 and 2009 pandemics were identified. Of 14 studies on the 2009 pandemic including data on both medical and social risk factors, 8 demonstrated an independent impact of SES after controlling for medical risk factors. In the random effect analysis of 46 estimates from 35 studies we found a pooled mean odds ratio of 1.4 (95% CI: 1.2–1.7, p < 0.001), comparing the lowest to the highest SES, but with substantial effect heterogeneity across studies, reflecting differences in outcome measures and definitions of case and control samples. Analyses by pandemic period (1918 or 2009) and by level of SES measure (individual or ecological) indicated no differences along these dimensions. Studies using healthy controls tended to document that low SES was associated with worse influenza outcome, and studies using infected controls found low SES associated with more severe outcomes. A few studies compared severe outcomes (ICU or death) to hospital admissions, but these did not find significant SES associations in any direction. Studies with more unusual comparisons (e.g., pandemic vs seasonal influenza, seasonal influenza vs other patient groups) reported no or negative non-significant associations.
Conclusions/significance We found that SES was significantly associated with pandemic influenza outcomes, with people of lower SES having the highest disease burden in both 1918 and 2009. To prepare for future pandemics, we must consider social vulnerability.
The protocol for this study has been registered in PROSPERO (ref. no 87922) and has been published (Mamelund et al., 2019).


Introduction
This note documents the judgments and calculations made during the quantitative analyses performed as part of the meta-analysis of studies assessing the association between indicators of socio-economic status and different pandemic related outcomes (e.g., infections, hospitalizations, deaths).
As a result of the Covid-19 outbreak and the increased interest in our study, we decided to focus on the established meta-analytic techniques and finish the paper based on these.

Analysis plan
Initial
The original plan was to analyze the data in three ways:
1. A random effects meta-analysis
2. A PET-PEESE meta-regression, using study level variables as explanatory covariates
3. Bayesian model estimating "dose-response" gradients
This is based on the published pre-analysis plan, which included the following section on how we planned to perform the quantitative analysis of the gathered data:
The quantitative part of the study will pool results across studies. Such pooling can be done using various methods that impose different constraints on the type of studies that can be pooled. We will pursue three strategies. The first two are within the frequentist statistical tradition. We will here note whether coefficients are statistically significant at the 1%, 5%, or 10% level, i.e., whether the evidence base indicates pooled effects that would be unlikely if the true effect was zero. We will also discuss the strength of this test by assessing the magnitude of the pooled coefficient and its standard error (precision) in relation to plausible effect sizes. In the third, Bayesian methods will be used and we will assess how the evidence updates weakly informative priors for the coefficients.

Pooled effect meta-analysis
Where several studies are available with similar outcome and exposure measures, we will show forest plots and estimate pooled effects using fixed and random effects analyses with the metafor meta-analytic package in R [20], transforming the outcome variable when this is required to make the sampling distribution approximate the normal distribution, e.g., taking the log of odds ratios or using the Freeman-Tukey double arcsine transformation for proportions. In these analyses, the pooled estimates will reflect comparisons of the highest to the lowest reported socioeconomic group. We expect random effects to be more appropriate, since the socioeconomic gradient in outcomes may differ across time and region (e.g., we would expect a lower gradient in countries and periods with lower inequality). Cochran's Q test will be used to assess whether data indicate statistically significant heterogeneity in effects at the 5% level. Effect estimates are also expected to differ systematically across studies according to the socioeconomic "distance" between compared groups. For instance, we would expect a larger outcome difference between the top and bottom 10% of a distribution than we would between those above and below the mean. Depending on the total number of studies that can be pooled in a given analysis, it may also be appropriate to conduct subsample analyses that assess whether pooled effects differ within subgroups of studies characterized by region, pandemic, age-group, gender, and estimation technique or quality assessment score.
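As a minimal sketch of the pooling step described above, the following base R code computes a DerSimonian-Laird random-effects estimate and Cochran's Q from log odds ratios. The three odds ratios and their intervals are made-up illustration values, not study data, and the function name `pool_random_effects` is hypothetical. The plan names the metafor package for the actual analysis, whose estimator may differ from the one assumed here.

```r
# Hypothetical illustration data: odds ratios with 95% confidence bounds.
or    <- c(1.8, 1.2, 2.5)
lower <- c(1.2, 0.9, 1.1)
upper <- c(2.7, 1.6, 5.7)

# Work on the log scale, where the sampling distribution is closer to normal.
yi  <- log(or)
sei <- (log(upper) - log(lower)) / (2 * qnorm(0.975))

pool_random_effects <- function(yi, sei) {
  wi <- 1 / sei^2                           # inverse-variance (fixed effect) weights
  fixed <- sum(wi * yi) / sum(wi)           # fixed effect pooled estimate
  q <- sum(wi * (yi - fixed)^2)             # Cochran's Q
  df <- length(yi) - 1
  c_const <- sum(wi) - sum(wi^2) / sum(wi)
  tau2 <- max(0, (q - df) / c_const)        # DerSimonian-Laird between-study variance
  wi_re <- 1 / (sei^2 + tau2)               # random effects weights
  mu <- sum(wi_re * yi) / sum(wi_re)
  list(mu = mu, se = sqrt(1 / sum(wi_re)), tau2 = tau2, q = q,
       q_p = pchisq(q, df, lower.tail = FALSE))
}

res <- pool_random_effects(yi, sei)

# Back-transform to the odds ratio scale: estimate, lower bound, upper bound.
pooled_or <- exp(res$mu + c(0, -1, 1) * qnorm(0.975) * res$se)
cat("Pooled OR (95% CI):", round(pooled_or, 2), "\n")
cat("Cochran's Q p-value:", round(res$q_p, 3), "\n")
```

A Q-test p-value below 0.05 would, under the plan's stated 5% level, be read as evidence of heterogeneity in effects.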
Meta-regression
A recent innovation in meta-analyses is a meta-regression technique with precision effect test and precision effect estimate with standard error ("PET-PEESE") [21,22]. This technique will be used to pool estimates with similar outcome measures and will allow us to include study-level information as covariates and explore how these correlate with the coefficient estimates. This allows us to assess whether coefficients from comparisons of educational groups tend to differ from those comparing income groups, whether coefficients vary systematically by study-level variables such as pandemic, country-level inequality measures, statistical methodology used, or quality assessment score. The method additionally allows for the examination of how estimates differ systematically with e.g., age-groups. Finally, the technique tests whether coefficients vary systematically with reported standard errors, which may indicate the presence of small sample or publication bias.
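The core of the PET-PEESE idea can be sketched with two weighted regressions in base R. The effect sizes and standard errors below are invented for illustration. The conditional rule used here (switch to the PEESE intercept when the PET intercept differs from zero at the 10% level) is one common convention and an assumption, not something the plan specifies.

```r
# Hypothetical log odds ratios and their standard errors (not study data).
yi  <- c(0.59, 0.18, 0.92, 0.35, 0.10, 0.70)
sei <- c(0.21, 0.15, 0.42, 0.18, 0.08, 0.35)

# PET: regress effects on standard errors, weighting by inverse variance.
# The slope picks up any tendency for imprecise studies to report larger effects.
pet <- lm(yi ~ sei, weights = 1 / sei^2)

# PEESE: same regression, but with the sampling variance as the covariate.
peese <- lm(yi ~ I(sei^2), weights = 1 / sei^2)

pet_coef <- summary(pet)$coefficients
pet_intercept_p <- pet_coef["(Intercept)", "Pr(>|t|)"]

# Assumed conditional rule: use the PEESE intercept only if PET rejects a zero effect.
use_peese <- pet_intercept_p < 0.10
corrected <- if (use_peese) coef(peese)[["(Intercept)"]] else coef(pet)[["(Intercept)"]]

cat("PET slope on SE:", round(pet_coef["sei", "Estimate"], 3), "\n")
cat("Bias-corrected log OR:", round(corrected, 3), "\n")
```

Study-level covariates such as pandemic or SES indicator type could be added as further terms on the right-hand side of either regression.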
Bayesian meta-analysis
The above strategies require a similar outcome measure and will pool coefficients for the highest relative to the lowest socioeconomic group from each study. This ignores the "dose-response" information available from studies that report coefficients comparing multiple socioeconomic levels to a reference level (e.g., coefficients for different income quantiles). Under the assumption that an underlying socioeconomic gradient will be linear on the logit scale, all such reported estimates can contribute to estimating the underlying gradient [23,24]. The resulting statistical model will be coded and estimated using the Stan language for probabilistic modeling [25] with a multilevel/hierarchical specification to account for heterogeneity across exposure measures (e.g., income, education), pandemic, and study-level covariates. We will also explore whether such an approach makes it feasible to pool studies across outcome measures to assess the hypothesis that gradients vary systematically by the severity of outcome.

Amended
Due to the Covid-19 outbreak and the strong interest in this study, as well as the large heterogeneity in outcomes and indicators used across studies, it was decided to simplify the quantitative analysis and prioritize standard meta-analytic analyses and a set of comparisons across studies of different types (e.g., those dealing with the 1918 vs those dealing with the 2009 pandemic).
Two approaches were used:
1. Using the R metafor meta-analytic package, we estimated random effect models on the total sample of studies and on splits across different subsample dimensions.
2. Using the Stan programming language for probabilistic modelling, we estimated a hierarchical model that included parameters for different subsample characteristics in a joint analysis.
Study level characteristics defining meta-analytic subgroups
1. Deprivation measure: ecological, individual level. Does the study use information on the level of individuals (e.g., self-reported income) or does it proxy individual characteristics (e.g., neighborhood poverty levels)?
2. Case-criterion: infected, admitted hospital, severe hospital, mortality
3. Control-criterion: general population, infected, admitted hospital, severe hospital, other. (This refers to the control sample in a case-control study or the at-risk population in other studies.) General population should not be taken to mean "population representative": this category is also used to cover other studies where the controls are an appropriate non-infected sample from the population the cases are selected from (e.g., military personnel).
4. Period: 1918, 2009
5. Country/region: Which country or multi-national region was studied
6. Type of estimate reported: odds ratio, relative risk
The case and control defining criteria are included to help answer two questions central to this project:
1. Are flu related outcomes (infections, hospitalizations, death) more or less common in groups and individuals with lower socioeconomic status?
2. Is there a progressively increasing gradient with severity, such that the over-representation of low-SES individuals is stronger for hospitalizations than for infections, stronger for severe hospital treatments than for hospital admissions, and stronger for death than for severe outcomes?

Study level data - documentation
We begin by constructing a data-set, labelled meta_df, appropriate to a random-effects analysis. We include studies that allow us to express the relative risk or relative rate of some flu-related outcome (infection, hospitalization, death) in a group with a low socio-economic indicator relative to a group with a high one.
For each study, we include either a) the estimate and its upper and lower bound, or b) the estimate and its standard error. The data frame has the following columns:
1. Study number
2. odds ratio
3. lower bound
4. upper bound
5. standard error (log)
In a separate data frame, we also add information needed for the subsample analyses.
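When a study reports only a confidence interval, the standard error on the log scale can be recovered from the interval width. The sketch below shows one way to do this and to append a row to a data frame with the five columns listed above. The helper `se_from_ci`, the column names, and the example row (study 99) are all hypothetical and do not come from the extracted data.

```r
# Hypothetical helper: standard error of a log odds ratio (or log relative risk)
# recovered from a reported confidence interval, assuming symmetry on the log scale.
se_from_ci <- function(lower, upper, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  (log(upper) - log(lower)) / (2 * z)
}

# Empty data frame with the five columns described in the text (names assumed).
meta_df <- data.frame(study = integer(), or = numeric(), lower = numeric(),
                      upper = numeric(), log_se = numeric())

# Made-up example row: OR 1.5 with 95% CI 1.1 to 2.0.
meta_df <- rbind(meta_df,
                 data.frame(study = 99L, or = 1.5, lower = 1.1, upper = 2.0,
                            log_se = se_from_ci(1.1, 2.0)))
print(meta_df)
```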
The data were extracted from the articles in question by Ole Rogeberg, and compared to estimates extracted and compiled into an Excel data sheet by Svenn Erik Mamelund and Clare Shelley-Egan. The resulting documentation (this note) was subsequently read and checked again by SEM, and discrepancies were resolved.
In several cases, multiple estimates were available from the same study using different methods or different indicators of socio-economic status. To avoid giving undue weight to single studies, we did not include estimates from multiple univariate analyses using different SES-indicators from the same sample. Individual level indicators were preferred over ecological if both were available, and all else equal income > education > other indicators.
If multiple estimates from distinct sub-groups were estimated and a combined estimate could be used instead, the full sample estimate was used. In other cases, if a study included estimates for different periods, both were included. Likewise, if e.g., two age groups could not be combined, they would be entered separately.
When multivariate estimates are available, these were preferred and the most direct estimate of economic deprivation was used. Note that the use of adjusted estimates could potentially be an issue if some studies "overcontrol" in the sense of adding controls on the "causal path" from SES or deprivation to outcomes. If a researcher were to estimate the "effect" of poverty after controlling for diet, dwellings, health related behaviors, etc., at some point the baby would be thrown out with the bath-water. Note that the comparison of odds ratios across samples and models is inherently problematic in some ways.
For each study we include the study's identifier (from the Excel sheet with extracted data), the odds-ratio with lower and upper bounds, the standard error, and the other information.

Study 1
Aligne 2016 The estimates are found on page e38 - we use the estimates that pool all age groups and all weeks. They are based on the 81% of cases with valid postcode.
Contains estimates for all ages, comparing areas grouped by levels of deprivation. Also contains sub-group analyses for different age groups and outcomes at different times. We use the "all ages" estimate, using the most affluent group as the reference.

Study 3
Balter 2010
Deprivation measure: Ecological ("We defined neighborhoods using the United Hospital Fund (UHF) designation, which aggregates adjoining ZIP codes to create 42 NYC neighborhoods (12). We then created a neighborhood poverty variable by categorizing UHF neighborhoods into tertiles (low-, medium-, and high-poverty neighborhoods) based on the percentage of residents living <200% of the federal poverty level, according to the US Census 2000")
Case-criterion: Admitted hospital ("We analyzed surveillance data to describe NYC residents who were hospitalized with pandemic (H1N1) 2009 in NYC from the start of the first ICS activation to the end of the second activation (April 24-July 7).") Note that this also includes people hospitalized for other reasons found to have H1N1.

Period: 2009
Country/region: New York, USA

Type of estimate reported: odds ratio
This study has two estimates listed -one adjusting for age and one not. Neither has standard errors or confidence intervals. We consequently use the study information to calculate an odds-ratio.
The study lists 498 high-poverty events and 172 low poverty events. These are used as case-counts.
To get control counts: The study notes that 0.327 and 0.264 were the respective shares of high and low poverty individuals in NYC based on the 2000 NYC census. An online search found the total NYC population in 2000 to be at 8 008 278 individuals. This is used to get counts for the high and low poverty groups in the full NYC population, which we use as control counts.
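The calculation described above can be sketched as follows, using only the counts and shares quoted in the text. Following the description, the group populations are used directly as control counts; subtracting the cases first would change the result only negligibly at this population size. The interval uses the usual large-sample standard error for a log odds ratio, which is an assumption about how the uncertainty was computed.

```r
# Case counts reported in the study.
cases <- c(high_poverty = 498, low_poverty = 172)

# Population shares and the year-2000 NYC population quoted in the text.
nyc_population <- 8008278
shares <- c(high_poverty = 0.327, low_poverty = 0.264)

# Control counts: size of each poverty group in the full NYC population.
controls <- shares * nyc_population

# Odds ratio of hospitalization, high-poverty relative to low-poverty areas.
or <- (cases[["high_poverty"]] / controls[["high_poverty"]]) /
      (cases[["low_poverty"]] / controls[["low_poverty"]])

# Large-sample standard error of the log odds ratio and a 95% interval.
log_or_se <- sqrt(sum(1 / c(cases, controls)))
ci <- exp(log(or) + c(-1, 1) * 1.96 * log_or_se)

cat("Odds ratio:", or, "\n")
cat("95% CI:", ci, "\n")
```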

Study 5
Chandrasekhar 2017 This study includes data from 2009-10 to 2013-2014. Exclusion reason: It combines seasonal and pandemic influenza across non-pandemic and pandemic years.

Study 7
Chowell 2008 This study is part of the narrative analysis, but data cannot be used in the meta-analysis, as it only reports correlations with p-values.

Study 8
Chowell 2014 Four estimates for different waves, with population density (persons/km^2) as the contrast. None of the socioeconomic measures are credible indicators of poverty.

Study 9
Dawood 2012 Region not very specific as SES measure.

Study 10
Duggal 2016 Comparisons of high, upper middle and lower middle income economy (country level).
This is a meta-analysis, which makes it problematic to include, as it will reflect evidence already included from individual studies.

Study 12
Gilca 2011 This one provided two separate estimates: One comparing hospitalized to non-hospitalized cases, and one comparing severe (ICU or death) vs non-severe hospital cases.

Estimate 1
Compares people with verified H1N1 tracked in the confirmed case registry to those in hospital, using phone interviews to get information on the community cases.

Estimate 2
Compares people with ICU or death (severe cases) to non-severe cases in hospital.

Study 14
Grantz 2016 Deprivation measure: Ecological (census tract-level data on demographic characteristics in 1920)

Control-criterion: General population
Period: 1918
Country/region: Chicago, US
Type of estimate reported: relative risk
Table 1 of this paper includes the results of a Poisson model estimating pandemic influenza mortality and its associations with various neighborhood characteristics. Using percentage illiterate as the indicator of poverty and the multivariate regression adjusted RR, we have a relative risk of 1.028 (1.020, 1.036). Figure 2 indicates that the illiteracy rate across districts varies from 0-7 (lowest) to 21-28 (highest). This gives an approximate extreme-to-extreme range of 20 percentage points. To avoid extremes where the linear relationship may break down we use the numbers in the text: "for every 10% increase in illiteracy rate within a given census tract, mortality increased by 32.2% (95% CI: 22.2, 43)." We interpret the 10% increase as a 10 percentage point increase, as this fits with the numbers: if 1 percentage point multiplies the risk by 1.028, then 10 percentage points would multiply it by (1.028)^10 = 1.318, which seems close enough that their number is calculated the same way but with more decimals.
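The rescaling from a 1 to a 10 percentage point contrast can be checked in a few lines of R. The inputs are the rounded relative risk and interval quoted above. Deriving a log-scale standard error from the rescaled interval is an assumption about how the estimate would enter the meta-analysis.

```r
# Reported relative risk per 1 percentage point increase in illiteracy (95% CI).
rr_per_point <- c(estimate = 1.028, lower = 1.020, upper = 1.036)

# Risks multiply across percentage points, so a 10 point contrast is the 10th power.
rr_per_10_points <- rr_per_point^10
print(round(rr_per_10_points, 3))

# Standard error on the log scale, recovered from the rescaled interval.
log_rr_se <- (log(rr_per_10_points[["upper"]]) -
              log(rr_per_10_points[["lower"]])) / (2 * 1.96)
cat("Log RR standard error:", log_rr_se, "\n")
```

The rescaled bounds land close to the 22.2% and 43% increases quoted from the paper, which supports the percentage point reading.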

Case-criterion: Mortality
Control-criterion: Infected who were not hospitalized for 30 days after specimen collection

Period: 2009
Country/region: USA (selected states)
Type of estimate reported: odds ratio
The estimates are found in table 2. This study has several indicators of poverty (education, insurance, crowded dwellings and poverty directly). I use the direct poverty result only: including several coefficients would give this study excessive influence on the pooled estimate, as the variables are correlated and results are given for univariate analyses.
In addition, the study has a second set of results for a subsample (Alaskan Native/American Indian), but these data are also included in the full analysis used here.
NOTE: The sheet states that none of the variables were significant in a multivariate analysis. If the multivariate analysis simply included these proxies at the same time, this is likely due to multicollinearity, i.e., a situation where the variables are collectively significant but correlated to such an extent that it is hard to quantify their individual contribution.

Study 16
Hoen 2010 This study compares schools with media-reports of H1N1 influenza to nearby schools not mentioned in the media, i.e. estimating the probability that a school has "confirmed cases of novel H1N1 influenza that are picked up by the media and detected by HealthMap." This means that the associations are a mix of differential prevalence and varying media interest concerning infections for rich and poor schools.

Study 17
Hu 2012 This is a study of how incidence rates varied with lagged rainfall and weather, as well as the socioeconomic status of different areas.
The socioeconomic area index (SEIFA) ranges from 800 to 1200 with a mean (sd) of 1064 (70), and the coefficient on SEIFA (on the logarithmic scale) is estimated at -0.06 (-0.19 to 0.06). We did not see how the results from this paper could be translated to a metric comparable to those used in other studies.

Study 18
Huang 2016 This is an estimated SIR infection model. We did not see how the estimates using socioeconomic index could be converted to something that could be included in the meta-analysis.

Study 19
Inglis 2014 This study reports the number of cases from regions grouped into different quintiles on the basis of deprivation scores. Out of 2978 total confirmed cases, 1837 cases came from the lowest and 170 from the highest, with 971 from the remaining quintiles.
This study was not included in the quantitative analysis. We would need to know the extent to which cases were found in the "high deprivation" areas relative to what we would expect in the "no effect" counterfactual where cases would be proportional to the population in each quintile. However, we do not know the size of the different populations residing in each "area quintile".
As far as we could tell from the paper, however, the quintiles are quintiles of areas, not the population. To put it simply: consider a case where there are two neighborhoods -one crowded urban neighborhood where everyone poor is crammed into small apartments, and one sparsely populated rural area with manors and castles. We would expect more people from the first region, which would be "high deprivation", even if there were no association with SES.

Study 20
Inglis 2013 Marked as duplicate of 19

Study 21
Janjua 2012 This study compares influenza incidence across parents with children in community schools of a rural BC community.
Deprivation measure: Individual - Household density collected by phone
Case-criterion: Infected (antibodies tested on people self-reporting symptoms in phone survey)

Country/region: Canada
Type of estimate reported: odds ratio
The estimates are found in table 1, multivariable model. We assume that 1st to 3rd quantiles (reference) of household density have the lowest density, i.e., highest SES, and that the fourth quantile is the one with the highest density (lowest SES).

Study 22
Kumar 2012 Marked as not relevant

Study 23
Launes 2012
Deprivation measure: Individual (Parental education level)
Case-criterion: Hospitalized ("patients aged 6 months to 18 years hospitalized for influenza syndrome")
Control-criterion: Infected ("patients aged 6 months to 18 years with confirmed influenza A (H1N1) 2009 infection using real-time RT-PCR and managed on an outpatient basis.")

Study 25
Lenzi 2011 No quantitative estimates

Type of estimate reported: odds ratio
The estimates are found in Model 3 in table 3, which adjusts for underlying conditions, insurance and access to care and for Bronx (from where cases were likely oversampled). We use the educational level and risk of hospitalization results.
Has estimates for education, % below the poverty line and household income, each of these for adults and children respectively.

Study 27
Lowcock 2011 Just abstract - published as study 28

Study 29
Maliszewski 2011 Not shown in underlying study.

Study 30
Mamelund 2003 Multivariate coefficients given with no standard error or confidence interval. Left out.

Study 31
Mamelund 2006
Deprivation measure: Individual (Census data - using the working class vs bourgeois distinction)

Country/region: Norway
Type of estimate reported: relative risk
The point-estimate is found in table 4 model 3. The confidence interval is taken from SEM's spreadsheet.

Study 32
Manabe Not a relevant study

Study 33
Mansieaux 2015 Not a relevant study: Wrong study period - covers a post-pandemic period (20 Dec 2010 to 20 February 2011).

Country/region: Global
Type of estimate reported: relative risk
The estimates are found in table 2, which compares mortality at the country level by per-head income controlling for latitude. The model estimates a gradient of -0.967 with a standard error of 0.229 where the outcome is log(pandemic mortality) and income is logged. This gives a percent to percent interpretation. As they write: "This means that a 10% increase in per-head income was associated with a 9-10% decrease in mortality." Statistically, per head income explained almost half the variation in excess mortality seen for the 1918 pandemic (R^2 = 0.482). Unfortunately, the paper does not say what the range or standard deviation of logged per-head incomes was in the data.
In the absence of this information, I consider a contrast between two countries, one of which has double the income of the other. We call the excess mortality of these two countries Y_H and Y_L, with income X_H = 2*X_L.
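One way this contrast could be carried through is sketched below, using only the gradient and standard error quoted above. It assumes the log-log relationship holds exactly over a doubling of income and that a normal approximation on the log scale is adequate for the interval. The variable names are illustrative.

```r
beta    <- -0.967   # reported gradient of log(mortality) on log(per-head income)
beta_se <- 0.229    # reported standard error of the gradient

income_ratio <- 2   # X_H = 2 * X_L

# log(Y_L) - log(Y_H) = beta * (log(X_L) - log(X_H)) = -beta * log(income_ratio)
log_rr    <- -beta * log(income_ratio)
log_rr_se <- beta_se * log(income_ratio)

# Mortality in the poorer country relative to the richer one, with a 95% interval.
rr    <- exp(log_rr)
rr_ci <- exp(log_rr + c(-1, 1) * 1.96 * log_rr_se)

cat("Y_L / Y_H:", rr, "\n")
cat("95% CI:", rr_ci, "\n")
```

Under these assumptions, halving income is associated with a bit less than a doubling of excess mortality, since the gradient is slightly below one in absolute value.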

Study 36
Navaranjan 2015 A test negative case-control study. Would have preferred to use individual education, but do not understand what their reference category is and how to get a "low education to high education" comparison. They include coefficients for "high school or less education" and for "post-secondary school completion": is the reference group people with non-completed post-secondary school?
Using total deprivation. One score for children, one for adults.

Study 37
Nikolopoulos 2011 GDP per capita gradient for EU countries. GDP per capita had a mean of 102 and sd of 45. The coefficient on GDP per capita was 0.017 (0.00, 0.039), and the model was a Poisson regression with log link on the mean. Looking at the reported mean, a reasonable contrast that avoids going into the extremes of the data might be between a country with 120 vs 60 on the GDP per capita scale. The mortality of the poor country relative to the rich country will then be given by exp(-60 * beta), which gives us an expected relative mortality of 0.36 (0.1, 1).
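The contrast above can be reproduced directly from the quoted coefficient and interval. The sketch treats the interval bounds of the coefficient as fixed numbers and simply transforms them, which is an assumption about how the bounds of 0.1 and 1 were obtained.

```r
# Reported Poisson coefficient on GDP per capita (log link), with interval bounds.
beta <- c(estimate = 0.017, lower = 0.00, upper = 0.039)

gdp_poor <- 60
gdp_rich <- 120

# Mortality in the poorer country relative to the richer one:
# exp(beta * (gdp_poor - gdp_rich)) = exp(-60 * beta).
rel_mortality <- exp(beta * (gdp_poor - gdp_rich))

# The transformation is decreasing in beta, so the bounds swap order.
cat("Relative mortality:", round(rel_mortality[["estimate"]], 2), "\n")
cat("Interval:", round(range(rel_mortality), 2), "\n")
```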

Study 38
Pasco 2012 Deprivation measure: Ecological (area level index of relative socioeconomic advantage and disadvantage)

Study 39
Pearce 2011 This study uses data on mortality during the 1918 pandemic (different waves) using an index of deprivation based on 2000 data. I could not find estimates allowing for comparison of high to low SES groups. The closest I could find were correlations with p-values.

Notes:
From SEM: The average of Ward Scores from the Indices of Deprivation 2000: District level Presentations for England. It combines a number of indicators which cover a range of domains (Income, Employment, Health Deprivation and Disability, Education, Skills and Training, Housing and Geographical Access to Services) into a single deprivation score for each area.
They write that this index correlated with pre-pandemic mortality, "showing that geographical predictors of social disadvantage and all-cause mortality have been stable over many decades."
The study results were initially excluded from the quantitative meta-analysis over concerns regarding the use of a control variable measured 80 years after an event. SEM correctly noted that this was not an exclusion criterion discussed in the pre-analysis plan, but examining the results we were unable to find results that allowed for a comparison of high to low SES with confidence intervals.

Period: 2009
Country/region: USA
Type of estimate reported: relative risk
The estimates are found in table 3. This is a machine learning paper that has a set of correlations early on. They report correlations between log mortality rate and different indicators of SES: personal and household income, educational attainment, poverty rate. Although not entirely clear, it seems that the information is all at the ecological level.

Study 43
Quinn 2011 Not included.

Study 46
Sloan 2017 Abstract -paper included elsewhere

Case-criterion: Mortality
Control-criterion: General population (that is: at risk population, military personnel on same boat)

Estimate 2: Mortality relative to infection
Deprivation measure: Individual ("economic condition" of household as judged at first impression by enumerator with no pre-specified criteria)
Case-criterion: Death (household stated influenza, pneumonia or indefinitely diagnosed illness suspected to be influenza)

Country/region: US
Type of estimate reported: odds ratio
These numbers are more shaky, as the raw numbers are not included in the paper. For mortality, the mortality rate by SES is given as rates after correcting for age. Age-correction seems to have been as follows: A separate (not included) estimate of mortality rate at the SES-age_group level, averaged using age-group weights similar to overall continental US in 1910.

Study 51
Tam 2014 Studies regular influenza - not pandemic.
In the text they write: "Influenza-related hospitalization of adults associated with low census tract socioeconomic status and female sex in New Haven County, Connecticut, 2007-2011 The incidence increased as the percent of persons living below poverty in a census tract increased, as the percent of persons in a census tract with no high school diploma increased, as the percent of crowded households in a census tract increased, as the percent of non-English speaking households in a census tract increased, and as median income in the census tract decreased. These trends were present in each influenza season including the 2009-10 H1N1pdm season (Table S1)."
Given this, we have two pieces of information indicating that flu was over-represented in low SES and under-represented in high SES (table 2 and the text), and one piece indicating the opposite. We take the labels in table S1 to be wrong and adjust for this. This gives us
Low income cases: 43
High income cases: 13
Approximate size of low_income population size in person-years, using age adjusted incidence rates per 100 000 person-years: 13 / (72.1/100000) = 18 030
Approximate size of high_income population size in person-years, using age adjusted incidence rates per 100 000 person-years: 43 / (20.2/100000) = 212 870
cases <- c(13, 43)
controls <- c(18030, 212870) - cases
or <- (cases[1]/controls[1])/(cases[2]/controls[2])
cat("Odds ratio: ", or, "\n")
## Odds ratio: 3.571241
log_or_se <- sqrt(sum(1/c(cases, controls)))
cat("Log_or_se: ", log_or_se)
## Log_or_se: 0.3166056
meta_df <- rbind(meta_df, c(51, or, exp(log(or) - 1.96 * log_or_se), exp(log(or) + 1.96 * log_or_se), log_or_se))
study_df <- rbind(study_df, c(51, "Ecological", "Admitted hospital", "General population", 2009, "USA", "odds ratio"))

Estimate 2 -Mechanical ventilation
Deprivation measure: Ecological (County median household income)

Type of estimate reported: odds ratio
The estimates are found in Table 4 (adjusted RR). I use model 2 estimates (unlike the model 1 estimates extracted to the Excel sheet), as these are based on more cases (they include individuals with missing obesity info, which would otherwise cause substantial attrition).

Country/region: Canada
Type of estimate reported: odds ratio
The estimates are based on the following numbers for two groups distinguished by period (the third group is from the period after vaccination has begun rolling out and is excluded here):
The counts given for different income groups in table 1 and 2 appear to differ. I assume the correct ones are in table 2 where the positivity rates are given.
study_df <- rbind(study_df, c(53, "Individual", "Infection", "Other", 2009, "Australia", "odds ratio"))

Type of estimate reported: odds ratio
The estimates are found in table 3.

Case-criterion: Infected (households with self-quarantined index patient and a secondary case)
Control-criterion: General population (matched households with self-quarantined index patient and a close contact)

Country/region: China
Type of estimate reported: odds ratio
The estimates are found in table 2. This is one of those cases where one could discuss whether some of the controlled for variables are "on the causal path" from poverty to infection (e.g., controlling for "sharing room with index case patient"). However, a simple case/control counts ratio of lowest to highest income (based on counts in
study_df <- rbind(study_df, c(58, "Individual", "Infection", "General population", 2009, "China", "odds ratio"))

Type of estimate reported: relative risk
The estimates are found in table 3.
Deprivation index quantiles -here a higher deprivation index is more deprived.
Only one period fits study selection criteria.

Preparing analysis data
Adding study level information from spreadsheet
We add in author, journal and year of each study using a csv exported from the data extraction Excel sheet.
study_info <- fread("study_information.csv")
setnames(study_info, "Study _index", "study_index_orig")
meta_dt <- data.


Data preparation
We prepare the data for a simple Bayesian hierarchical model.

Model code and prior choices
The model has two substantive priors -for the mean and standard deviation of the effect distribution.
Both are assigned a normal prior with mean zero and standard deviation of 0.4. For the effect mean, this expresses a belief that the true average effect across studies will most likely be in the range of exp(-0.8) to exp(0.8), i.e., 0.45 to 2.23. Put differently, we expect that low socioeconomic status (relative to high) is unlikely to predict a change in flu outcome risks by more than a factor of two on average. The prior for the standard deviation similarly expresses a belief that the effect estimated by a single study is unlikely to differ from the average effect by more than a factor of 2.
y ~ normal(study_re, se);
}
generated quantities {
real mu_exp = exp(mu);
}
base_bayes <- stan("random effect.stan", data = stan_data_base, iter = 10000, refresh = 10000)
The model performs well, with no signs of divergence or other estimation issues.
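The "factor of two" reading of the prior on the effect mean can be checked numerically in base R. This is only a sketch of what a normal(0, 0.4) prior on the log scale implies; it does not reproduce the Stan model, and the variable names are illustrative.

```r
prior_sd <- 0.4

# Two prior standard deviations on either side of zero, back on the ratio scale.
two_sd_range <- exp(c(-2, 2) * prior_sd)
cat("exp(-0.8), exp(0.8):", round(two_sd_range, 2), "\n")

# Central 95% prior interval on the ratio scale.
central_95 <- exp(qnorm(c(0.025, 0.975), mean = 0, sd = prior_sd))
cat("Central 95% interval:", round(central_95, 2), "\n")

# Prior probability that the average effect is within a factor of two of no effect,
# i.e., that the log effect lies between -log(2) and log(2).
p_within_factor_two <- 2 * pnorm(log(2) / prior_sd) - 1
cat("P(within a factor of two):", round(p_within_factor_two, 3), "\n")
```

Most of the prior mass sits within a factor of two, while larger effects remain possible if the data support them.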

Model code and prior choices
This model has two additional sets of parameters expressing the extent to which different study level indicators are associated with the outcome. The priors for the mean and standard deviation of the effect distribution remain as before.
In addition, there is a block of parameters associated with study level indicators (e.g., period, country etc). These are given the same "factor of two" prior, but normalized for each set of indicators, so that the pooled mean expresses the "average" across the countries in the sample, across the two periods, and so on.