Fig 1.
Binding model usage and implementation.
(a) In systems serology, antibodies are first captured with antigen-coated beads, resulting in the complexes shown on the far left. These complexes are then separated into wells and incubated with different fluorescently tagged detection reagents, each of which yields bead-associated fluorescence proportional to the amount of binding. The binding model takes these detection signals and infers the abundance of each antibody Fc species immobilized in the complexes. (b) Each detection has a known or fit binding affinity to each antibody Fc species, which can be used directly to quantify the equilibrium constant for the initial monovalent binding event. (c) To model multivalent binding, we consider all binding events that lead to a particular binding configuration. The initial binding event is quantified with the monovalent binding affinity, and subsequent binding events for the same detection are quantified using the monovalent binding affinity multiplied by a crosslinking constant, which encapsulates steric and local concentration effects.
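The stepwise accounting described in panel (c) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `stepwise_constants` and the parameter names `ka`, `kx`, and `n_bound` are hypothetical, the numeric values are arbitrary, and the sketch deliberately ignores any combinatorial factors the full model may apply.

```python
def stepwise_constants(ka, kx, n_bound):
    """Equilibrium constant for each successive binding event of one detection.

    ka      -- monovalent binding affinity (hypothetical parameter name)
    kx      -- crosslinking constant (hypothetical parameter name)
    n_bound -- number of binding events in the configuration
    """
    if n_bound < 1:
        raise ValueError("a bound configuration needs at least one binding event")
    # First event: the plain monovalent affinity.
    # Every later event: the monovalent affinity scaled by the crosslinking constant.
    return [ka] + [ka * kx] * (n_bound - 1)


# Arbitrary illustrative values, not taken from the paper.
constants = stepwise_constants(ka=8.0, kx=0.5, n_bound=3)
```

With these arbitrary inputs, the first entry is the monovalent affinity and the remaining two entries are that affinity scaled by the crosslinking constant.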
Fig 2.
The binding model accurately infers antibody Fc abundances from synthetic data.
(a) Each Fc species was assigned a random abundance. These abundances were used to generate synthetic detection signals, which were then used to infer the original antibody abundances. (b), (c) Initial versus inferred antibody abundance (b) without and (c) with added detection signal noise. (d), (e) The coefficient of determination between the initial and inferred antibody abundances versus the amount of noise added to the (d) detection signals or (e) binding affinities. (f) Each binding affinity was individually perturbed by 30% up and down. The set of affinities containing the perturbed affinity was used to compute synthetic signals for randomly generated initial antibody abundances, and the unperturbed set was used to infer antibody abundances from those synthetic signals. The sensitivity of the inferences to perturbations of this binding affinity was computed as one minus the agreement (R²) between the inferred and initial antibody abundances, shown separately for each species.
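The sensitivity metric in panel (f) is one minus R². A minimal sketch follows, assuming R² is the usual coefficient of determination computed with the initial abundances as the reference; the function names are hypothetical, and the scale on which the paper computes R² (linear or logarithmic) is not stated in this caption.

```python
def r_squared(initial, inferred):
    """Coefficient of determination, with `initial` as the reference values."""
    mean_initial = sum(initial) / len(initial)
    ss_res = sum((a - b) ** 2 for a, b in zip(initial, inferred))
    ss_tot = sum((a - mean_initial) ** 2 for a in initial)
    return 1.0 - ss_res / ss_tot


def sensitivity(initial, inferred):
    """One minus the agreement (R^2) between inferred and initial abundances."""
    return 1.0 - r_squared(initial, inferred)


# Arbitrary illustrative abundances, not taken from the paper.
perfect = sensitivity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # inference unaffected
degraded = sensitivity([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # inference collapses to the mean
```

Under this definition, a sensitivity of 0 means the perturbation left the inference unchanged, and larger values indicate greater dependence on that affinity.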
Table 1.
Binding affinities (Ka in units of M⁻¹) between each antibody Fc species and detection [22].
Fig 3.
The binding model effectively imputes unseen measurements.
(a) To measure the binding model’s imputation performance, we start with a real systems serology dataset, mask the measurements corresponding to a particular detection, and use the model to impute them. (b) The model imputes detection signals by using the incomplete data to infer the antibody abundances, which are then used to infer the left-out signals. (c) Imputed versus actual measurements. The metrics shown on the plot are computed on the log₁₀ of the plotted values. ‘r’ signifies the Pearson correlation and ‘R²’ signifies the coefficient of determination. (d) Pearson correlation and (e) coefficient of determination between actual and imputed values for the binding model and PCA when 10% of values are dropped. (f) Dataset schematic for imputation at 100% missingness for a single detection. (g) Pearson correlation and (h) coefficient of determination between actual and imputed values for the binding model at various percentages of missing values.
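The two-step imputation in panel (b) (infer abundances from the observed detections, then predict the masked detection) can be illustrated with a toy example. The sketch below swaps in a linear surrogate, signal = sum of affinity times abundance, in place of the multivalent binding model, and uses two Fc species so the inference reduces to a 2x2 linear solve. All names and numbers are hypothetical.

```python
def infer_abundances(affinities, signals):
    """Solve a 2x2 linear system (Cramer's rule) for two species abundances.

    affinities -- two rows, one per observed detection: (affinity to species 1,
                  affinity to species 2)
    signals    -- the two observed detection signals
    """
    (a, b), (c, d) = affinities
    s1, s2 = signals
    det = a * d - b * c
    if det == 0:
        raise ValueError("observed detections cannot distinguish the two species")
    return ((s1 * d - b * s2) / det, (a * s2 - c * s1) / det)


def predict_signal(affinity_row, abundances):
    """Linear surrogate signal for one detection (an assumption of this sketch)."""
    return sum(k * x for k, x in zip(affinity_row, abundances))


# Toy data: two observed detections and one masked detection.
observed_affinities = [(2.0, 1.0), (1.0, 3.0)]
observed_signals = (8.0, 9.0)
masked_affinity = (4.0, 1.0)

abundances = infer_abundances(observed_affinities, observed_signals)
imputed = predict_signal(masked_affinity, abundances)
```

Here the inferred abundances are (3.0, 2.0) and the imputed signal for the masked detection is 14.0. In the real setting the inference step is a fit of the binding model rather than an exact solve.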
Fig 4.
Binding model infers IgG fucosylation, improving prediction of downstream effector response.
(a) IgG fucosylation blocks binding to FcγRIIIA with little effect on binding to the other FcRs, such as FcγRIIA. (b) Higher fucosylation of bead-bound antibodies leads to a lower FcγRIIIA signal relative to FcγRIIA signal. The binding model uses this information to infer how many antibodies are fucosylated. (c) Inferred IgG fucosylation versus the ratio of measured FcγRIIIA signal to FcγRIIA signal. (d), (e) Inferred abundance of afucosylated IgG versus antibody-dependent natural killer cell activation (ADNKA), measured by two markers: (d) CD107a and (e) MIP1b. (f), (g) Effector function measurements were predicted with two sets of regressors: (i) binding model-predicted abundances of the antibody Fc species IgG1, IgG1f (fucosylated IgG1), IgG3, and IgG3f, and (ii) subclass detection measurements (with the set of subclasses varying by dataset). Repeated 8-fold cross-validation with 10 repeats was used, and the coefficient of determination (R²) on the validation sets is shown on the y-axis. (f) The regression performance for each set of regressors for the Zohar et al. SARS-CoV-2 data [11]. In this dataset, the available subclass detections used as regressors were α-hIgG1 and α-hIgG3 IgG. (g) The regression performance for the Alter et al. HIV dataset [10]. The available subclass detections used as regressors were α-hIgG1, α-hIgG2, α-hIgG3, and α-hIgG4 IgG. The Mann–Whitney U-test was used to define differences and the Benjamini–Hochberg method was used to adjust for multiple comparisons, with an adjusted .
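The resampling scheme in panels (f) and (g), repeated 8-fold cross-validation with 10 repeats, produces 80 train/validation splits per set of regressors. The following standard-library sketch only generates the index splits; the function name, the seed, and the sample count of 40 are hypothetical, and the regression model itself is not specified in this caption.

```python
import random


def repeated_kfold(n_samples, n_splits=8, n_repeats=10, seed=0):
    """Yield (train_indices, validation_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        # Reshuffle the samples at the start of every repeat.
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(n_splits):
            # Every n_splits-th sample of the shuffled order forms one fold.
            validation = order[fold::n_splits]
            held_out = set(validation)
            train = [i for i in order if i not in held_out]
            yield train, validation


# Hypothetical dataset size, for illustration only.
splits = list(repeated_kfold(n_samples=40))
```

Each validation score (for example R²) would be computed once per split, giving one distribution of 80 values per set of regressors to compare.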
Fig 5.
In COVID-19, inferred IgG fucosylation varies by target antigen, symptom severity, and vaccine efficacy.
Inferred fucosylation of IgG by (a) target antigen, (b) target antigen type, (c) target antigen and presence of ARDS. (d) SARS-CoV-2 vaccine protection is associated with inferred afucosylation of IgG targeting spike protein antigens. The Mann–Whitney U-test was used to define differences and the Benjamini–Hochberg method was used to adjust for multiple comparisons, with an adjusted .
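The Benjamini–Hochberg adjustment mentioned above can be written compactly. The sketch below is a generic step-up implementation in plain Python, not the authors' code; the raw p-values are arbitrary placeholders standing in for the outputs of the Mann–Whitney U-tests (which could be obtained, for instance, from `scipy.stats.mannwhitneyu`).

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Placeholder p-values, not results from the paper.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
```

Each adjusted value is then compared against the chosen significance threshold to decide which group differences to mark.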
Fig 6.
Inferred IgG fucosylation correlates with HIV severity and is lower for membrane-associated antigens.
(a), (b) Inferred IgG fucosylation by target antigen and patient status for HIV-infected subjects. (c) Inferred IgG fucosylation by antigen type. (d) Inferred IgG fucosylation by antigen type and patient status. (e) IgG fucosylation was inferred for each sample and antigen. The fucosylation inferences for each antigen were compared across samples and used to compute a Pearson correlation coefficient. The pairwise correlation in IgG fucosylation between each antigen is shown. A Mann–Whitney U-test was used to define differences and the Benjamini–Hochberg method was used to adjust for multiple comparisons, with an adjusted .
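The pairwise comparison in panel (e) amounts to computing a Pearson correlation between the per-sample fucosylation inferences of every pair of antigens. A minimal standard-library sketch, with hypothetical antigen labels and made-up values:

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_matrix(by_antigen):
    """Map every ordered antigen pair to the correlation across samples."""
    names = list(by_antigen)
    return {(p, q): pearson(by_antigen[p], by_antigen[q])
            for p in names for q in names}


# Made-up inferred fucosylation per sample for three hypothetical antigens.
inferred = {
    "antigen_A": [0.70, 0.80, 0.90],
    "antigen_B": [0.40, 0.60, 0.80],
    "antigen_C": [0.90, 0.80, 0.70],
}
matrix = correlation_matrix(inferred)
```

In this toy input, antigens A and B rise together across samples while antigen C falls, so the A-B entry is close to 1 and the A-C entry is close to -1.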
Fig 7.
Identifying a minimal set of detections.
Combinations of two FcRs are dropped, and their signals are imputed. The (a) Pearson correlation and (c) coefficient of determination between the log₁₀ of the imputed and measured values. (e), (f) Imputed versus measured values for FcγRIIB and FcγRIIIB, respectively, when they are left out together. ‘r’ signifies the Pearson correlation and ‘R²’ signifies the coefficient of determination. (g), (h) Imputed versus measured values for FcγRIIIA and FcγRIIIB, respectively, when they are left out together. (b), (d) The same metrics as (a) and (c) when three FcRs are imputed. An ‘x’ indicates that one of the specified values falls below the lower y-axis limit.
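Enumerating which detections to drop together is a plain combinations problem. The sketch below lists every pair and every triple of left-out detections alongside the detections that remain. The list of FcRs is illustrative only (it uses the receptor names that appear in these captions and may not match the full panel in the dataset), and the function name is hypothetical.

```python
from itertools import combinations


def leave_out_sets(detections, n_left_out):
    """List (left_out, kept) tuples for every combination of dropped detections."""
    return [
        (left_out, tuple(d for d in detections if d not in left_out))
        for left_out in combinations(detections, n_left_out)
    ]


# Illustrative detection panel; the real panel may contain other FcRs.
fc_receptors = ["FcγRIIA", "FcγRIIB", "FcγRIIIA", "FcγRIIIB"]

pairs = leave_out_sets(fc_receptors, 2)    # two FcRs imputed at a time
triples = leave_out_sets(fc_receptors, 3)  # three FcRs imputed at a time
```

For each entry, the kept detections would be used to infer antibody abundances and the left-out detections would be imputed and scored against their measured values.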
Table 2.
Software tools.