Fig 1.
Overview of the approach used in the study. Two survey datasets were used — a Nationally Representative Survey (NRS) conducted by researchers and the Household Pulse Survey (HPS) by the U.S. Census Bureau — along with demographic data for Virginia census tracts from the U.S. Census Bureau. These datasets were integrated with our proposed approach and a null model to synthesize four agent populations representative of Virginia, U.S., which were validated against real-world Virginia census tract vaccination data. The icons within the figure were obtained from open sources, including Openclipart (https://openclipart.org/) and Wikimedia Commons (https://commons.wikimedia.org/wiki/Main_Page). Maps in the figure were created using ArcGIS Pro and openly available boundary files from the U.S. Census Bureau (https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html).
Table 1.
Descriptive statistics summarizing the demographic distribution within Virginia census tracts.
Table 2.
Comparative distribution of respondent characteristics from the individual-level surveys.
Fig 2.
Overview of the synthetic population generation approach for public health applications. Individual-level survey data containing public health characteristics are combined with spatially aggregated demographic data and processed through an iterative proportional fitting (IPF) method. This generates a weights matrix that is then integerized and expanded to create the final synthetic population dataset. In this dataset, survey respondents are replicated together with their health-related attributes and behaviors, such as vaccine uptake. The icons within the figure were obtained from open sources, including Wikimedia Commons (https://commons.wikimedia.org/wiki/Main_Page). Maps in the figure were created using ArcGIS Pro and openly available boundary files from the U.S. Census Bureau (https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html).
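The pipeline summarized in this caption (IPF weighting, integerization, expansion) can be illustrated with a minimal sketch for a single census tract. This is not the study's implementation: the function names, the toy respondents, and the tract totals are hypothetical, and the largest-remainder integerization is an assumption, since the caption does not state which integerization method was used.

```python
import numpy as np

def ipf_weights(ind_cats, zone_targets, n_iter=50):
    """Fit one weight per survey respondent for a single zone.

    ind_cats: list of integer arrays, one per constraint variable,
              giving each respondent's category index.
    zone_targets: list of arrays, one per variable, giving the zone's
                  aggregated count in each category.
    """
    w = np.ones(len(ind_cats[0]), dtype=float)
    for _ in range(n_iter):
        for cats, target in zip(ind_cats, zone_targets):
            target = np.asarray(target, dtype=float)
            # Current weighted total in each category of this variable.
            totals = np.bincount(cats, weights=w, minlength=len(target))
            # Scale factor per category; empty categories get factor 0.
            factors = np.divide(target, totals,
                                out=np.zeros_like(totals),
                                where=totals > 0)
            w = w * factors[cats]
    return w

def integerize(w):
    """Largest-remainder rounding (an assumed method) that preserves
    the rounded total of the fractional weights."""
    total = int(round(w.sum()))
    counts = np.floor(w).astype(int)
    shortfall = total - counts.sum()
    # Give the remaining units to the largest fractional parts.
    order = np.argsort(-(w - counts), kind="stable")
    counts[order[:shortfall]] += 1
    return counts

def expand(counts):
    """Replicate each respondent index according to its integer weight."""
    return np.repeat(np.arange(len(counts)), counts)

# Hypothetical survey: four respondents with two demographic variables
# and one health behavior carried along from the survey.
age = np.array([0, 0, 1, 1])         # 0 = younger, 1 = older
sex = np.array([0, 1, 0, 1])         # 0 = female, 1 = male
vaccinated = np.array([1, 0, 1, 1])  # attribute copied to each replica

# Hypothetical aggregated counts for one tract (10 residents).
tract_age = [6, 4]
tract_sex = [5, 5]

weights = ipf_weights([age, sex], [tract_age, tract_sex])
counts = integerize(weights)
synthetic_ids = expand(counts)

# Each synthetic agent inherits the attributes of its source respondent.
synthetic_vaccinated = vaccinated[synthetic_ids]
```

Repeating the fit for every tract yields one column of weights per tract, which together form the weights matrix mentioned in the caption. Vaccine uptake is never used as a constraint here; it is simply carried over from the replicated respondents.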
Table 3.
Evaluation metrics comparing observed and simulated percentages of demographic and vaccine status variables across Virginia census tracts.
Fig 3.
Scatterplots comparing observed and simulated vaccine uptake within Virginia census tracts: A) percentage from the HPS, B) count from the HPS, C) percentage from the NRS, D) count from the NRS.
Table 4.
Coefficients from logistic regression models describing the relationship between demographic variables and vaccine uptake from the HPS, the synthetic population generated from the HPS, and the null model generated from the HPS. Coefficients significant at the 90% confidence level (p-value < 0.10) are indicated with an asterisk (*).
Table 5.
Coefficients from logistic regression models describing the relationship between demographic variables and vaccine uptake from the NRS, the synthetic population generated from the NRS, and the null model generated from the NRS. Coefficients significant at the 90% confidence level (p-value < 0.10) are indicated with an asterisk (*). Gender was not included because it was not significant (p-value > 0.10) in the NRS dataset.
Fig 4.
Clusters and outliers of vaccination uptake in Virginia census tracts: A) observed, B) null model, C) synthesized population with HPS data, and D) synthesized population with NRS data. Maps in the figure were created using ArcGIS Pro and openly available boundary files from the U.S. Census Bureau (https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html).
Fig 5.
Clusters and outliers of vaccination attitudes, beliefs, and perceptions across Virginia census tracts for the synthetic population generated from the HPS. Maps in the figure were created using ArcGIS Pro and openly available boundary files from the U.S. Census Bureau (https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html).