Fig 1.
Overview of the transfer function approach to modeling microbiome interventions.
(A) A transfer function model (Eq 1) is trained to forecast future community profiles. This model leverages past community data, past and current intervention information, and static subject-level characteristics. (B) Forecasts on held-out subjects are used to evaluate model performance, potentially guiding model improvements. (C) The trained models are used to simulate counterfactual trajectories, supporting the study of hypothetical interventions. Multiple interventions can be applied concurrently, and they may be real-valued. (D) To identify taxa sensitive to the interventions, partial dependence effects from simulated trajectories are used to calculate mirror statistics.
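The counterfactual comparison in panel (C) can be illustrated with a minimal sketch. It does not implement Eq 1: the one-taxon linear update `toy_step`, its `decay` and `effect` parameters, and the function names are all hypothetical stand-ins for a trained forecaster. The point is only that the same model is rolled forward under two intervention series and the trajectories are subtracted.

```python
def simulate(y0, interventions, step):
    """Roll a one-step forecaster forward from y0 under an intervention series."""
    trajectory = [y0]
    for w in interventions:
        trajectory.append(step(trajectory[-1], w))
    return trajectory


def toy_step(y_prev, w, decay=0.5, effect=2.0):
    """Hypothetical stand-in for a trained model: linear in the past value and
    the current intervention. NOT the paper's Eq 1."""
    return decay * y_prev + effect * w


def counterfactual_difference(y0, w_treated, w_control, step=toy_step):
    """Difference between trajectories simulated under two intervention series."""
    treated = simulate(y0, w_treated, step)
    control = simulate(y0, w_control, step)
    return [a - b for a, b in zip(treated, control)]


# Intervention active at the second and third steps versus never active.
diff = counterfactual_difference(4.0, [0, 1, 1, 0], [0, 0, 0, 0])
print(diff)
```

Because the intervention series are ordinary numeric lists, the same sketch accepts real-valued intervention intensities; handling several concurrent interventions would require a step function that takes a vector rather than a scalar.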
Fig 2.
Transfer functions applied to oscillatory predator-prey dynamics.
Panels give simulated predator and prey populations over 10 time units. The grey rectangles indicate periods during which the prey growth rate is decreased. A transfer function model is trained using data up to time 7, and forecasts are shown as dashed curves. Subjects have been chosen to represent low (top left) to high (bottom right) forecasting error rates. The forecasts accurately reflect the true perturbation effect and predator-prey relationship; however, they can compress time and dampen large peaks.
Fig 3.
Example data used in the simulation study.
Each panel displays one taxon’s trajectories, with rows representing individual subjects. Tile colors encode abundances, which have been quantile transformed to support cross-taxa comparison. Red borders indicate the samples where the intervention is present. Taxa in the first row (tax1–3) have nonnull effects, while those in the bottom row are all null. Note the potentially delayed intervention effects in nonnull taxa.
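A quantile transform of this kind can be sketched as follows. The caption does not say which variant was used for the figure, so the rank-based version below (with tied values sharing their average rank) is an assumption, and `quantile_transform` is a hypothetical helper name.

```python
def quantile_transform(values):
    """Map each value to its empirical quantile in (0, 1].

    Tied values receive the average of the ranks they span. This is one
    common variant, assumed here for illustration.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j to the end of the block of values tied with position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [r / n for r in ranks]


# Two taxa on very different scales land on the same color scale.
rare = quantile_transform([1, 2, 3])
abundant = quantile_transform([10, 200, 3000])
print(rare, abundant)
```

Applying the transform separately within each taxon is what makes tile colors comparable across panels: only the ordering of a taxon's abundances matters, not their absolute magnitude.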
Fig 4.
Simulation forecasting errors for normalized data.
The y-axis shows the average MAE_k across folds. Within panels, the signal strength and number of taxa increase from left to right. Column panels give the proportion of intervention-sensitive taxa and the signal strength. Rows distinguish between settings with different sequencing depth heterogeneity and phylogenetic correlation. Outliers (below 1.5×IQR or above 3×IQR of errors in fido and mbtransfer) are excluded. Runs that did not complete within 72 hours are omitted. The two hyperparameter settings of fido and mbtransfer perform similarly. fido performs comparably to mbtransfer when the intervention is weak, but its accuracy deteriorates when the intervention is strong.
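The fold-averaged error on the y-axis can be sketched as below, assuming MAE_k denotes the mean absolute forecasting error on held-out fold k (the caption does not define it). The function names and the toy numbers are illustrative only, and the outlier exclusion described above is not reproduced.

```python
def mean_absolute_error(truth, forecast):
    """Mean absolute difference between observed and forecast values."""
    return sum(abs(t - f) for t, f in zip(truth, forecast)) / len(truth)


def average_mae_across_folds(folds):
    """folds: list of (truth, forecast) pairs, one per held-out fold.

    Returns the average of the per-fold errors and the per-fold errors
    themselves.
    """
    per_fold = [mean_absolute_error(t, f) for t, f in folds]
    return sum(per_fold) / len(per_fold), per_fold


# Toy example with two folds (made-up values).
folds = [([1, 2, 3], [1, 2, 5]), ([0, 0], [1, -1])]
average, per_fold = average_mae_across_folds(folds)
print(average, per_fold)
```

Averaging per-fold errors, rather than pooling all residuals, gives each fold equal weight regardless of how many samples it contains.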
Fig 5.
Comparison of long-run forecasting residuals.
We average errors across all taxa and truncate those with a magnitude greater than 50. A comparison of forecasting residuals across four folds (rows) in one simulation run suggests that forward integrating the MDSINE2 model can lead to exponentially increasing forecasting errors.
Fig 6.
Inferential performance in the simulation experiment.
Rows encode normalization methods and phylogenetic correlation. Columns correspond to different lags and compare mirror statistics, DESeq2, and a pre-post t-test. Color hue and shade encode the number of taxa and the proportion of nonnull hypotheses, respectively. The target FDR has been set to q = 0.2 (vertical grey line). DESeq2 does not control the FDR for lag-one effects in any simulation context. mbtransfer’s mirror algorithm controls the FDR when given DESeq2-asinh transformed data and sufficiently many taxa.
Fig 7.
mbtransfer forecasting error on the diet intervention dataset.
The y-axis is faceted by quantiles of abundance and the x-axis is faceted by time horizon h. In-sample error refers to errors made at new timepoints for individuals who appeared in the training data, while out-of-sample predictions are made on individuals absent from training. Performance is strongest in shorter time horizons and for more abundant taxa.
Fig 8.
Intervention-sensitive taxa in the diet study.
(A) Counterfactual differences in simulated trajectories for a subset of the selected taxa in the diet study. (B) Subject-level data from the same taxa, with each row representing a subject and each column a timepoint, potentially interpolated from the original, unevenly sampled measurements. These data are consistent with the interpretations from the counterfactual simulation. For example, OTU000006 often shows transient increases (e.g., Animal1 and Animal6), while OTU000065 has more prolonged departures (e.g., Animal3 and Animal9).
Fig 9.
Intervention-sensitive taxa in the pregnancy study.
(A) Counterfactual differences for a subset of selected taxa from the re-analysis of [2]. Counterfactual differences are computed for each subject in the data, and bands represent the first and third quartiles of differences across subjects. Since the bands for birth control reinitiation overlap, we conclude that the model does not learn the interaction effects between the intervention and contraception use. (B) The corresponding subject-level data grouped by birth control reinitiation survey response. Note that these data have been interpolated to the biweekly level to account for uneven sampling times.
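The quartile bands in panel (A) can be sketched as below. The input layout (`differences[s][t]` for subject s at time t), the function names, and the choice of the "inclusive" quantile method are assumptions made for illustration; the overlap check is a simple formalization of the visual comparison described above.

```python
from statistics import quantiles


def quartile_bands(differences):
    """differences[s][t]: counterfactual difference for subject s at time t.

    Returns one (first quartile, median, third quartile) tuple per timepoint,
    computed across subjects.
    """
    n_times = len(differences[0])
    bands = []
    for t in range(n_times):
        across_subjects = [subject[t] for subject in differences]
        q1, median, q3 = quantiles(across_subjects, n=4, method="inclusive")
        bands.append((q1, median, q3))
    return bands


def bands_overlap(bands_a, bands_b):
    """True if the interquartile bands of two groups overlap at every timepoint."""
    return all(
        lo_a <= hi_b and lo_b <= hi_a
        for (lo_a, _, hi_a), (lo_b, _, hi_b) in zip(bands_a, bands_b)
    )


# Five hypothetical subjects, two timepoints.
differences = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
print(quartile_bands(differences))
```

Computing one band per group (for example, by survey response) and comparing them with `bands_overlap` mirrors the reasoning in the caption: overlapping bands give no evidence that the simulated effect differs between groups.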