Safe reopening of university campuses is possible with COVID-19 vaccination

We construct an agent-based SEIR model to simulate COVID-19 spread at a 16000-student mostly non-residential urban university during the Fall 2021 Semester. We find that mRNA vaccine coverage above 80% makes it possible to safely reopen to in-person instruction. If vaccine coverage is 100%, then our model indicates that facemask use is not necessary. Our simulations with vaccine coverage below 70% exhibit a right-skew for total infections over the semester, which suggests that high levels of infection are not exceedingly rare; campus social connections are the main transmission route. Less effective vaccines or the incidence of new variants may require additional interventions, such as screening testing, to reopen safely.

building closures, limited extracurricular activities, and hybridized in-person/remote classroom instruction. As these levels of intervention lacked much precedent, various models were developed to help guide policy and predict outcomes [7,8]. Policy decisions ultimately struck a balance between forecasts, campus safety and comfort, and university resources [9].
With the introduction of apparently effective vaccines [10][11][12][13] and increased natural immunity from earlier exposure [14], more universities are planning to conduct Fall 2021 primarily in-person [15]. As in Fall 2020, it remains uncertain how much intervention is needed to control COVID-19 infections. We develop an agent-based SEIR model to forecast total COVID-19 infections over the course of a semester at a primarily non-residential urban university campus with 16000 students and 800 faculty. Baruch College, part of the City University of New York (CUNY) system of over 275,000 students, is used to inform our framework.
Urban universities, such as those in the CUNY system, are usually located in densely populated areas and serve many students from minority groups. Preliminary surveys indicate that vaccine hesitancy, which is higher than average in minority groups that also present higher COVID-19 infection incidence, will limit vaccine coverage to somewhere between 60% and 80% of the United States population, with African Americans among the most hesitant [16][17][18][19][20]. As many students live with their families, reopening such universities is accompanied by elevated risk to and from their households and communities.
Depending on the vaccine administered, current clinical trials suggest efficacy ranging from 65% to 95% [10][11][12]. As these statistics are derived by comparing the symptomatic case incidence in the vaccination group to that in the placebo group, it is likely that asymptomatic cases are missed in these statistics. Preliminary data suggest that vaccination reduces asymptomatic cases as well [21][22][23][24]. The data in [23] were obtained from biweekly testing in healthcare workers, and the authors found the BNT162b2 vaccine 86% effective at preventing symptomatic and asymptomatic spread. Another study [24] analyzes biweekly screening tests in a group of employees at St. Jude Children's Research Hospital. A 72% reduction in asymptomatic cases was observed. The authors point out that short follow-up time, small cohort size, and the possibility that individuals choosing not to vaccinate are at higher risk could limit the accuracy of their findings. It remains unclear to what extent vaccinated individuals can spread COVID-19, and how well the vaccines protect against variant strains of COVID-19.

Methods
We model different scenarios with two primary variables: vaccine effectiveness and vaccine coverage of the campus population. We utilize the agent-based campus Susceptible-Exposed-Infected-Removed model from [25]. Similar to [7,8], students and faculty are assigned individualized schedules that they follow throughout a simulated semester. Schedules are organized into common meetings (classroom, broad environment, clubs, residential, socializing) during which COVID-19 is equally likely to be passed from infected agents present to each susceptible agent also present.
The infection rate in our model is set to obtain an average reproduction number $R_0 = 3$, which represents the average number of infections caused by an exposed agent in an entirely susceptible population. Estimates for COVID-19 spread in large communities (such as cities and countries) put $R_0$ in the range [2.0, 3.0] [26][27][28][29][30], but there are some higher estimates [31]. The statistic varies by community contact structure. It has been observed that $R_0$ is larger in reopened universities [4]. For example, [7] sets $R_0 = 3.8$ in their campus COVID-19 model for a mostly residential urban university. See [7] and [8] for more discussion about elevated $R_0$ levels in a university setting. We remark that our model implicitly assumes agents are wearing facemasks while on campus. The assumption is implicit because we base our choice of $R_0$ on what occurred in universities for the 2020-2021 school year in which, to our knowledge, all universities required facemasks in public spaces.
Vaccination and antibody status impact agents' susceptibility and infectiousness. Each agent is assigned the vaccinated attribute independently with probability $V$. Such agents have inward protection factor $r_i$ and outward protection factor $r_o$. Vaccinated agents contribute a factor of $1 - r_o$ of exposure time to susceptible agents in each meeting they are present at. When computing the probability vaccinated agents are infected at the end of a day, the probability is multiplied by $1 - r_i$. All COVID-19 infections in vaccinated agents are classified as asymptomatic. Our 80% vaccine effectiveness scenarios set $r_i = 0.8 = r_o$, while 50% effectiveness has $r_i = 0.5 = r_o$. The model is initiated with 10 randomly selected students in the exposed state. The main statistic we consider is the total number of agents ever in the exposed state. We refer to this as total infections.
Figure 1 displays box-plots with medians for the total number of infections in 1000 simulated semesters given different levels of vaccine coverage $V \in \{0.5, 0.6, \ldots, 1.0\}$. In these simulations, vaccinated individuals are assumed to have an 80% reduction in the probability of becoming infected and an 80% reduction in the probability of infecting others. This is consistent with current data for the mRNA COVID-19 vaccines [21,23]. We see that the infection is well controlled with vaccination levels above 80%. A median 30 total infections occur at $V = 0.8$. Although the median is below 50 for all $V \ge 0.5$, lower levels of vaccine coverage have right-tail events with relatively high numbers of total infections. We see simulations with 350 total infections when $V = 0.5$ and over 150 total infections when $V = 0.7$.
[Figure 2: The same scenarios as in Figure 1, but with 50% effectiveness for the vaccine.]

Results
The plots in Figure 2 have the same response and independent variables as in Figure 1, but display total infections with 50% vaccine effectiveness. While increasing vaccine coverage reduces total infections, a striking feature of the data is that all scenarios exhibit right-skew.  We see multiple outliers; some simulations in Figure 2 have over 1200 total infections. This indicates nearly 6% of the campus population becoming infected, which is much higher than current national and local COVID-19 incidence. Even with 100% of the campus population vaccinated, there are simulations with nearly 200 total infections (over 1% of the campus population).
Weekly screening of 25% of the campus population reduces total infections. Figure 3 displays total infections in the scenarios from Figure 2, but with 25% of the campus population screened weekly for COVID-19. Individuals who test positive quarantine until recovering. A right-skew is still present, but it is less extreme than with no testing. Moreover, the medians for total infections in Figure 3 are lower than in the analogous scenarios without testing (Figure 2).
Figure 4 shows simulations concerning the necessity of facemasks. We assume that the vaccine is 80% effective and coverage is 100%. Recall that our usual model assumes that facemasks are worn at all times in non-social settings. It has been estimated that facemasks, when worn by both infected and susceptible agents, reduce a susceptible agent's exposure to the virus by at least 50% [32,33]. To model no facemask use, we increase the duration of time spent in classrooms, the broad environment, clubs, and residence halls by factors of 2, 3, and 4. Tripling the duration, for example, triples the probability agents become infected in each interaction in these spaces. We observe that these increases do not cause many more total infections. The median total infections remains in the interval [15,20] and right-tail events do not amount to many additional infections. In fact, over 1000 simulations we observed similar maximum total infections (34 versus 35) in the model with no multiplying factor and in the model with the risk multiplied by 4.

Discussion
On May 10, 2021, shortly before completing the analysis in this article, New York State Governor Andrew Cuomo announced that the City University of New York and State University of New York campuses will require proof of vaccination for all students attending in-person classes [34]. Our study suggests that, so long as vaccine efficacy is at or above 80% for preventing both symptomatic and asymptomatic infections, this policy allows for a safe university reopening with minimal extra precautions needed. Our model typically assumes that individuals use facemasks while participating in campus activities. However, additional simulations suggest that facemasks are not necessary with 80% vaccine effectiveness and 100% vaccine coverage.
If the vaccines are less effective, then our results suggest that caution is needed. Even with high levels of vaccine coverage, vaccinated members of the campus population may still become infected. Moreover, the right skew we observe for total infections implies that the risk of many agents in the model becoming infected is non-negligible. Most infections in our model are asymptomatic cases. The reason for this is twofold. First, asymptomatic cases are more common in young people [35,36]. Second, COVID-19 vaccines have been demonstrated to reduce symptomatic cases more effectively [10][11][12][37]. CUNY campuses have a large proportion of minority groups. The 2019 CUNY Student Data Book reports a 25% Black and 30% Hispanic undergraduate population across all 25 CUNY campuses. Minority groups exhibit more vaccine hesitancy [19] and neighborhoods with more minority residents have higher COVID-19 incidence [38]. As many students at a non-residential urban university live at home, high levels of asymptomatic cases pose a silent risk to their households and communities.
Ensuring that a large proportion of the campus population is vaccinated and administering screening testing are effective ways to control total infections. Most of the infections in our model are from socializing. Accordingly, students should be encouraged to practice safe social contact during the semester, such as distancing and wearing facemasks in the presence of unvaccinated students. This is similar to what was suggested in [8]. We further comment that infections will decrease at least in proportion to the level of de-densification employed by the university. For example, if half as many students are regularly on campus, we expect that total infections will reduce by at least half.
A limitation of our model is the difficulty of setting parameters. Another is in designing the contact structure and relative risk of different meeting types. Socializing plays a dominant role for infection spread in our model, but it is difficult to create a realistic contact structure. A novel aspect of our approach compared to other agent-based COVID-19 university models [7,8] is that we use a Markov chain to create social groups matching students with similar characteristics (year and area of study). Our sensitivity analysis also includes scenarios with less socializing. On a different note, if screening testing is present on campus, then there is the opportunity for college administrators to respond in real time to rising case counts, for example by moving to all remote instruction when a certain threshold is reached. For that reason, testing may be even more effective than our simulations suggest.

Conclusion
We constructed an agent-based SEIR model made to resemble a mostly non-residential urban university campus. We then ran different scenarios for vaccine effectiveness and coverage by the campus population. If the vaccines are 80% effective at preventing both symptomatic and asymptomatic COVID-19 infections, then our study suggests that no extra precautions are needed for a safe reopening. Lower vaccine effectiveness and coverage may lead to undesirable levels of COVID-19 infections. A right-skew for total infections suggests that rare but extreme events could have particularly bad outcomes. Screening testing helps control total cases.

Disclaimer
The contents of this report reflect the views of the authors, who are responsible for the facts and the accuracy of the information presented herein. This document is disseminated in the interest of information exchange. The report is funded, partially or entirely, by a grant from the U.S. Department of Transportation's University Transportation Centers Program. However, the U.S. Government assumes no liability for the contents or use thereof.

Model specifics
We utilize the agent-based campus Susceptible-Exposed-Infected-Removed model from [25]. Similar to [7,8], students and faculty are assigned individualized schedules that they follow throughout a simulated semester. Schedules are organized into common meetings (classroom, broad environment, clubs, residential, socializing) during which COVID-19 is equally likely to be passed from infected agents present to each susceptible agent also present.
All agents in the model start in either the susceptible state or with antibody protection. At the onset of the model, we independently assign each agent the antibody attribute with probability 0.20. A proportion of the agents with this attribute have antibody protection which prevents infection. Those without antibody protection act as normal susceptible agents. If a susceptible agent becomes exposed to COVID-19 then, after a random incubation period, the agent progresses to the asymptomatic or symptomatic infected state with equal probability [35,36]. Such agents occupy this state for a random infectious period and subsequently transition to the recovered state. Symptomatic individuals decide after a random observation period to self-quarantine, either voluntarily or from seeking independent testing, until recovered. Recovered agents cannot become infected again. Except for the quarantine period, periods are modelled with independent geometric random variables. We write Geo(1/p) to denote the geometric distribution $P(X = k) = (1 - p)^{k-1} p$ for integers $k \ge 1$ and $0 < p < 1$, which has mean $1/p$.
Duration is a measure of risk-intensity rather than time elapsed during a meeting. We scale it by the number of agents in the meeting and the risk of infection spread in the meeting type. For example, socializing has a longer duration than class time, not because more time is spent doing so, but because it is a riskier setting for infection spread [4,39]. Complete details regarding meeting structure are in Section 7.2. Each individual in a meeting in the susceptible state acquires exposure time equal to the meeting duration times the number of attendees in the infected state also at the meeting. At the end of each day, the total number of exposure minutes for each susceptible agent is tallied. This total is scaled by the infection rate, which results in the probability the agent becomes infected on that day.
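The daily infection step and the geometric periods described above can be sketched in Python as follows. This is a minimal illustration and not the code from [25]: the function names, the cap of the probability at 1, and all numerical values in the example are assumptions made here.

```python
import random

def sample_geometric(p, rng):
    """Sample k >= 1 with P(X = k) = (1 - p)**(k - 1) * p, which has mean 1/p."""
    k = 1
    while rng.random() >= p:  # each day the period ends with probability p
        k += 1
    return k

def exposure_minutes(meetings):
    """Total exposure of one susceptible agent over a day.

    `meetings` is a list of (duration, infected_present) pairs for the
    meetings the agent attended; each meeting contributes
    duration * infected_present.
    """
    return sum(duration * infected for duration, infected in meetings)

def daily_infection_probability(meetings, infection_rate):
    """Scale the day's exposure by the infection rate.

    Capping at 1 is an assumption of this sketch, not stated in the text.
    """
    return min(1.0, infection_rate * exposure_minutes(meetings))

rng = random.Random(0)
# Hypothetical day: one class with 1 infected attendee, one social
# meeting with 2 infected attendees. Durations and rate are made up.
day = [(0.5, 1), (20.0, 2)]
p_infect = daily_infection_probability(day, infection_rate=0.001)
incubation_days = sample_geometric(1 / 5, rng)  # hypothetical mean of 5 days
```

In this example the agent accumulates 0.5 + 40 = 40.5 exposure minutes, so the assumed rate of 0.001 gives a daily infection probability of 0.0405.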
The other manner in which infections occur in our model is through exogenous exposure in the non-campus community. We set the average exogenous exposures per week by applying a fixed (small) probability of becoming infected to each agent at the end of each day. Our model has 2 exogenous infections per week on average. Given our model's population of 16800 agents, this corresponds to 1.7 positive cases per 100,000 agents per day. At the time of writing, New York City has a rate roughly 10 times this, but we expect the rate to drop by Fall 2021 [38].
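The conversion from 2 exogenous infections per week to a daily rate per 100,000 agents can be checked with a few lines of Python. The variable names are ours, and the sketch assumes the fixed probability is applied on all seven days of the week.

```python
AGENTS = 16000 + 800          # students plus faculty
EXOGENOUS_PER_WEEK = 2        # average exogenous infections per week

# Per-agent probability of an exogenous infection on a given day,
# assuming the same probability is applied on all 7 days of the week.
per_agent_daily_probability = EXOGENOUS_PER_WEEK / 7 / AGENTS
cases_per_100k_per_day = per_agent_daily_probability * 100_000

print(f"{per_agent_daily_probability:.2e}")  # prints 1.70e-05
print(f"{cases_per_100k_per_day:.1f}")       # prints 1.7
```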
The infection rate in our model is set to obtain an average reproduction number $R_0 = 3$, which represents the average number of infections caused by an exposed agent in an entirely susceptible population. Estimates for COVID-19 spread in large communities (such as cities and countries) put $R_0$ in the range [2.0, 3.0] [26][27][28][29][30], but there are some higher estimates [31]. The statistic varies by community contact structure. It has been observed that $R_0$ is larger in reopened universities [4]. For example, [7] sets $R_0 = 3.8$ in their campus COVID-19 model for a mostly residential urban university. See [7] and [8] for more discussion about elevated $R_0$ levels in a university setting.
The main statistic we consider is the total number of agents ever in the exposed state over the course of a 15-week semester. We refer to this as total infections. The model is initiated with 10 randomly selected students in the exposed state. Our base model represents a reopening with the full population present on campus and antibodies present in 20% of the population, but no vaccination and no screening testing. We assume that facemasks are used except when socializing in private. This is accounted for by lowering the risk of infection spread in public spaces such as classrooms and the broad environment. In the base model there is no active monitoring of the number of cases, so no adaptive policies (such as temporary suspension of in-person instruction) are ever implemented. With $R_0 = 3$, we find, on average, 1200 total infections in the base model with no vaccination and 20% antibody incidence. Figure 6 gives a sense of how the number of infections evolves over time. Note that we do not include a "Thanksgiving Effect" with a November rise in infections in our model. Figure 7 shows the average number of infections occurring in each setting. As mentioned previously, socializing is the main venue for infection spread in our model. Note that we have 400 students living in the residential dorms. This is in alignment with Baruch College and amounts to less than 2% of the student population.
Vaccination and antibody status impact agents' susceptibility and infectiousness. Each agent is assigned the vaccinated attribute independently with probability $V$. Such agents have inward protection factor $r_i$ and outward protection factor $r_o$. Vaccinated agents contribute a factor of $1 - r_o$ of exposure time to susceptible agents in each meeting they are present at. When computing the probability vaccinated agents are infected at the end of a day, the probability is multiplied by $1 - r_i$. All COVID-19 infections in vaccinated agents are classified as asymptomatic. Our medium-effectiveness scenarios set $r_i = 0.5 = r_o$, while low-effectiveness has $r_i = 0.2 = r_o$ and high-effectiveness has $r_i = 0.8 = r_o$. We perform a sensitivity analysis to setting $r_i = r_o$ in Section 7.3. Table 1 shows the relevant infection parameters. Once exposed, vaccinated agents progress through the stages of COVID-19 infection (exposed, infectious, recovered) as a normal susceptible agent would. See Figure 5 for a schematic.
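A short Python sketch may help make the roles of the two protection factors concrete. It is an illustration under our own assumptions rather than the implementation in [25]: the function names, the cap of the probability at 1, and the numbers in the example are hypothetical.

```python
def meeting_exposure(duration, infected_is_vaccinated, r_o):
    """Exposure time one susceptible agent receives from a single meeting.

    `infected_is_vaccinated` holds one boolean per infected attendee.
    An unvaccinated infected attendee contributes `duration`; a vaccinated
    one contributes only (1 - r_o) * duration (outward protection).
    """
    return sum(
        duration * ((1.0 - r_o) if vaccinated else 1.0)
        for vaccinated in infected_is_vaccinated
    )

def infection_probability(total_exposure, infection_rate, vaccinated, r_i):
    """End-of-day infection probability for one susceptible agent.

    The base probability (capped at 1, an assumption of this sketch) is
    multiplied by (1 - r_i) when the susceptible agent is vaccinated
    (inward protection).
    """
    base = min(1.0, infection_rate * total_exposure)
    return base * (1.0 - r_i) if vaccinated else base

# Hypothetical example with the high-effectiveness setting r_i = r_o = 0.8:
# a meeting of duration 10 with one vaccinated and one unvaccinated
# infected attendee.
exposure = meeting_exposure(10.0, [True, False], r_o=0.8)
p_vaccinated = infection_probability(exposure, 0.01, vaccinated=True, r_i=0.8)
p_unvaccinated = infection_probability(exposure, 0.01, vaccinated=False, r_i=0.8)
```

Here the vaccinated infected attendee contributes 2 of the 12 exposure minutes, and a vaccinated susceptible agent faces one fifth of the infection probability of an unvaccinated one.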

Meeting Structure
Each of the 16000 students is assigned a Year in 1, 2, 3, 4 (in equal proportions) and an Area in Business, STEM, and Humanities. The proportion in each area is 75%, 15%, and 10%, respectively. The 800 faculty are divided in the same proportions as students to each of the three areas. The student/faculty designation, year, and area of an agent play a role in the meetings they attend. Broadly, there are five types of meetings: class, broad, club, social, and residential. Students and faculty interact in class time and broad meetings. Only students interact in club, social, and residential meetings.

Courses
Courses meet twice per week, either MW or TuTh, for $c \cdot 100/L$ minutes each class, where $c = 1/10$ is a scaling parameter to account for reduced transmission probability in classrooms and $L$ is the number of students enrolled. Courses are either General Interest (G), Business (B), STEM (S), or Humanities (H). Each class is independently designated as either a MW or TuTh meeting class with probability 1/2 each. The numbers of classes of various sizes in Table 2 are chosen so that 20% of all classes are General Interest, and the proportions of classes of each size align with the counts provided by the Baruch College Common Data Set. We draw inspiration from [7] for how class meetings are generated, using enrollment histogram data and order statistics to create correlations among the courses taken by students in different years. Let $C_{X,y}$ be the total number of classes of size $y$ in area $X \in \{B, S, H, G\}$.
Let $T_X = \sum_{i=1}^{C_X} X_i$ be the total number of seats offered across all of the courses in area $X$, where $C_X$ is the number of courses in area $X$ and $X_i$ is the number of seats in the $i$th such course. Form the vector $p_X = (p_1(X), \ldots, p_{C_X}(X))$ with $p_i(X) = X_i / T_X$.
Index the courses in $G \oplus X := (G_1, \ldots, G_{C_G}, X_1, \ldots, X_{C_X})$ as $\Omega_X = \{1, 2, \ldots, C_G + C_X\}$. Define the random variable $Y(X)$ that takes values in $\Omega_X$ where, with probability 1/5, $Y(X)$ is drawn from a multinomial with distribution $p_G$ on $1, \ldots, C_G$ and, with probability 4/5, is drawn from a multinomial with distribution $p_X$ on $C_G + 1, \ldots, C_G + C_X$. We then assign classes to four students in area $X$, one of each year, simultaneously by sampling four independent copies $Y_1(X), \ldots, Y_4(X)$ of $Y(X)$. Let $Y_{(1)}(X) \le \cdots \le Y_{(4)}(X)$ be the arrangement of the $Y_k(X)$ from least to greatest. The student in year $k$ is assigned class $Y_{(k)}(X)$. Each student is assigned four classes in this manner.
This construction ensures that the number of classes of each size in each area is proportional to the ratios in Table 2. Using order statistics ensures that students in an earlier year are more likely to take large, general interest classes. Faculty in the corresponding area are assigned to teach two uniformly sampled courses in their area.
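The order-statistic assignment of classes can be sketched in Python as below. This is a hedged illustration, not the code from the project repository: the function names and the example class sizes are hypothetical, and the sketch does not prevent a student from drawing the same course twice, a case the description above does not address.

```python
import random

def seat_distribution(sizes):
    """Probability of each course, proportional to its number of seats."""
    total = sum(sizes)
    return [size / total for size in sizes]

def sample_course(general_sizes, area_sizes, rng):
    """Draw one course index (1-based) from the general interest courses
    followed by the area courses: general with probability 1/5,
    area with probability 4/5."""
    if rng.random() < 1 / 5:
        indices = range(1, len(general_sizes) + 1)
        weights = seat_distribution(general_sizes)
    else:
        offset = len(general_sizes)
        indices = range(offset + 1, offset + len(area_sizes) + 1)
        weights = seat_distribution(area_sizes)
    return rng.choices(indices, weights=weights, k=1)[0]

def assign_four_students(general_sizes, area_sizes, rng, classes_per_student=4):
    """Assign classes to four students of one area, one student per year.

    For each class slot, four independent indices are drawn and sorted;
    the student in year k receives the k-th smallest index.
    """
    schedules = {year: [] for year in (1, 2, 3, 4)}
    for _ in range(classes_per_student):
        draws = sorted(
            sample_course(general_sizes, area_sizes, rng) for _ in range(4)
        )
        for year, course in zip((1, 2, 3, 4), draws):
            schedules[year].append(course)
    return schedules

rng = random.Random(0)
general = [200, 100, 100]    # hypothetical general interest class sizes
business = [60, 40, 40, 30]  # hypothetical Business class sizes
schedules = assign_four_students(general, business, rng)
```

Because the general interest courses occupy the lowest indices, the first-year student (who receives the smallest draw) lands in a general interest class far more often than the fourth-year student.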

Broad environment
All agents spend 20/L minutes per M, T, W, Th meeting with the L students and faculty in their area, and 10/16800 total minutes per week meeting with all agents in the model. This represents ambient environmental contacts (hallways, elevators, lobbies, gym, library) that occur on campus.

Clubs
Clubs meet 100/L minutes on Thursday where L is the size of the club. There are 50 General Interest, 30 Business, 20 STEM, and 10 Humanities clubs. Each student joins a uniformly random general interest club with probability 1/5 and a uniformly random club in their area with probability 1/5. The probability a student does not participate in any clubs is (4/5) * (4/5) = 16/25 = 0.64. This is in line with the participation rates for clubs at Baruch College according to the 2018 Student Experience Survey.

Residence Hall
Pick 400 total students uniformly at random from years 1 and 2 to live in residence halls. Pair these students up into 200 groups of two students each representing roommates. Each roommate group meets 300 minutes per day. The entire group of 400 students in the residence hall spend 100/400 minutes together per week.

Social
Small and large social groups are formed via a Markov process. All students are labeled as low, medium, or high socializers. In line with socializing surveys from [40], the probability a student is a low socializer is 0.15, medium is 0.45, and high is 0.40.
Let $L$, $M$, and $H$ be the sets of low, medium, and high socializers. Furthermore, let $X_k(Y)$ be the set of level $Y$ socializers from area $X$ in year $k$. For example, $H_3(M)$ are medium-socializers in their third year of humanities. Whenever a student is sampled from a group $Z$, the sampling is done so that the student is uniformly sampled from $Z \cap L$ with probability 0.10, from $Z \cap M$ with probability 0.30, and from $Z \cap H$ with probability 0.60. Call this method (*).
A small social group is formed according to the following algorithm.
(i) Select a student from the entire population according to ( * ). Suppose they are from area X and year k.
(ii) The next student is sampled according to ( * ) from:
• The entire student population with probability 1/6.
• All students in year k with probability 1/6.
• All students in area X with probability 1/6.
• All students in area X and year k with probability 1/2.
(iii) With probability 1/2, no more members are added to the group. With probability 1/2 the algorithm continues using the year and area of the newly added member to generate the next choice via (ii) and (iii).
A medium social group is formed by replacing the probability of adding an additional member to the group at step (iii) with 9/10. Every Friday there is a large social group consisting of five uniformly randomly selected medium social groups. The duration is 2000/L minutes, where L is the total number of people in the meeting. The long duration of large social groups captures the "superspreader phenomenon" observed on campuses during the 2020-2021 school year [4]. Small social groups have expected size 3. These model close friends who study, eat, and pass time together. Medium social groups have expected size 11 and large social groups have expected size 55. These model larger social gatherings such as parties or events.
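The group-formation algorithm and the expected sizes quoted above can be illustrated with the following Python sketch. It is not taken from the project code: the `Student` record, the helper names, and the handling of a pool that lacks one of the socializing levels are assumptions of this sketch, and repeated draws of the same student are not removed. Since a group starts with one student and then adds members until a stop occurs with probability 1 - q, its size is 1 plus a geometric variable with mean 1/(1 - q), which gives 3 for q = 1/2 and 11 for q = 9/10.

```python
import random
from collections import namedtuple

Student = namedtuple("Student", "sid area year level")

# Sampling weights of method (*) for low, medium, and high socializers.
STAR_WEIGHTS = {"low": 0.10, "medium": 0.30, "high": 0.60}

def make_population(n, rng):
    """Hypothetical helper: n students with the area and socializing-level
    proportions stated in the text and years in equal proportions."""
    areas = rng.choices(["B", "S", "H"], weights=[0.75, 0.15, 0.10], k=n)
    levels = rng.choices(["low", "medium", "high"],
                         weights=[0.15, 0.45, 0.40], k=n)
    return [Student(i, areas[i], 1 + i % 4, levels[i]) for i in range(n)]

def sample_star(pool, rng):
    """Method (*): choose a socializing level by weight, then a uniform
    student of that level from the pool."""
    by_level = {}
    for student in pool:
        by_level.setdefault(student.level, []).append(student)
    # Assumption: levels absent from the pool are skipped and the remaining
    # weights are renormalised (rng.choices normalises the weights).
    levels = [level for level in STAR_WEIGHTS if level in by_level]
    weights = [STAR_WEIGHTS[level] for level in levels]
    chosen = rng.choices(levels, weights=weights, k=1)[0]
    return rng.choice(by_level[chosen])

def next_member(students, last, rng):
    """Step (ii): pick the pool from the year and area of the last member."""
    u = rng.random()
    if u < 1 / 6:
        pool = students
    elif u < 2 / 6:
        pool = [s for s in students if s.year == last.year]
    elif u < 3 / 6:
        pool = [s for s in students if s.area == last.area]
    else:
        pool = [s for s in students
                if s.area == last.area and s.year == last.year]
    return sample_star(pool, rng)

def form_group(students, continue_prob, rng):
    """Steps (i)-(iii); continue_prob is 1/2 for small groups and 9/10
    for medium groups."""
    group = [sample_star(students, rng)]
    while True:
        group.append(next_member(students, group[-1], rng))
        if rng.random() >= continue_prob:
            return group

rng = random.Random(0)
students = make_population(200, rng)
small_group = form_group(students, 1 / 2, rng)
medium_group = form_group(students, 9 / 10, rng)
```

A large social group would then be the union of five medium groups, for an expected size of about 5 × 11 = 55.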
Each small group meets with probability 1/2 on each weekday M, Tu, W, Th for 1000/L minutes, where L is the size of the group. This makes a minute of socializing ten times higher risk than a usual minute. Each medium group meets with probability 1/2 on Th and F for 1000/L minutes, where L is the number of people in the meeting. Large social groups meet for 1000/L minutes on F (with probability 1). These random choices are made for the first week and repeated for all weeks thereafter. Writing the duration as s · 100/L with s = 10, the parameter s scales for the higher risk of infection transmission during socializing, since facemasks and social distancing are less likely to be employed. The base value 100 is chosen so that the scaling is relative to the meeting time of a course.
We form 3000 small social groups, 300 medium social groups, 50 large social groups for the base model.

Additional Sensitivity Analysis
The reproduction number is a phenomenological output of the infection biology and contact structure in the model. Thus, it is difficult to calibrate in heterogeneous populations (see the discussion in [8]). For this reason, we additionally run our base model with $R_0 = 2$ and $R_0 = 4$. Since socialization is a major source of infection spread, we also include a version with $R_0 = 3$ and half as many social interactions. These variations are displayed in Figure 8. As expected, total infections are greatly reduced by decreasing socializing. Moreover, we see that total infections are sensitive to our choice of $R_0$. This is more reason for administrators to exercise caution in their reopening plans. Lastly, Figure 9 shows box plots for total infections with unequal vaccine effectiveness parameters $(r_i, r_o) \in \{0.3, 0.7\}^2$. We find that the impact from each is roughly the same. This suggests that our choices of setting $r_i = 0.5 = r_o$ in our main analysis and also $r_i = r_o$ in our low-, medium-, and high-effectiveness vaccine scenarios are reasonable simplifications to make.

Code Access
The code for the project is publicly available on GitHub at https://github.com/MAS-Research/SEIR-Campus as an extension of [25]. There are two files associated with this paper: CunyCovid.ipynb replicates the simulations discussed in this paper and writes the data to a file, and ImageRendering.ipynb loads the simulation data to produce the graphics shown in this paper.