Taking account of asymptomatic infections: A modeling study of the COVID-19 outbreak on the Diamond Princess cruise ship

The COVID-19 outbreak on the Diamond Princess (DP) cruise ship has provided empirical data to study the transmission potential of COVID-19 in the presence of pre/asymptomatic cases. We studied the changes in R0 on the DP from January 21 to February 19, 2020 based on chain binomial models under two scenarios: no quarantine, assuming a random mixing condition; and quarantine of passengers in cabins, where passengers may get infected either by an infectious case in a shared cabin or by pre/asymptomatic crew who continued to work. Estimates of R0 at the beginning of the epidemic were 3.27 (95% CI, 3.02–3.54) and 3.78 (95% CI, 3.49–4.09) for serial intervals of 5 and 6 days respectively. When quarantine started, with the reported asymptomatic ratio of 0.505, R0 rose to 4.18 (95% CI, 3.86–4.52) and 4.73 (95% CI, 4.37–5.12) respectively for passengers who might have been exposed to the virus through pre/asymptomatic crew. The results confirm that the higher the asymptomatic ratio is, the more infectious contacts occur. We find evidence to support a US CDC report that “a high proportion of asymptomatic infections could partially explain the high attack rate among cruise ship passengers and crew.” Our study suggests that if the asymptomatic ratio is high, the conventional quarantine procedure may not be effective in stopping the spread of the virus.


Introduction
The COVID-19 outbreak has developed into an international public health emergency. The basic reproductive number (R0) of COVID-19 is a key piece of information for understanding the epidemic. Current interventions focus on quarantine, with either mitigation or suppression strategies aimed at reducing R0 and flattening the curve [1]. Asymptomatic or pre-symptomatic infectious cases are less likely to seek medical care or to be tested and quarantined, contributing to the infectious potential of a respiratory virus [2,3]. Clinical findings have suggested that the viral load in asymptomatic patients is similar to that in symptomatic patients [4]. Evidence suggests that these pre/asymptomatic patients can infect others before they manifest any symptoms [5][6][7]. In an earlier study [8] in Wuhan, China, 200 (83%) individuals out of 240 reported no exposure to an individual with respiratory symptoms, which suggests pre/asymptomatic infection is common [9]. Little is known about the implications of asymptomatic COVID-19 transmission for disease dynamics [10]. The Diamond Princess (DP) data [11][12][13][14], with reported asymptomatic cases, may be considered an "accidental" trial in an isolated environment. Based on the DP data [11][12][13], we estimate R0 as a function of time, and our approach takes explicit account of possibly infectious contacts between quarantined passengers in cabins and pre/asymptomatic crew, which has not been explored in the literature. A US CDC report states that "a high proportion of asymptomatic infections could partially explain the high attack rate among cruise ship passengers and crew" [15]. On January 20, 2020, the DP departed Yokohama, Japan, making stops in Hong Kong, Vietnam, Taiwan, and Japan, and was scheduled to return to Yokohama on February 4 [12].
The DP [11][12][13][14][15], with 3,711 people (2,666 passengers and 1,045 crew members) on board as of February 5, 2020, was found to have an outbreak of COVID-19 from one traceable passenger from Hong Kong. This passenger became symptomatic on January 23 and disembarked on January 25 in Hong Kong. On February 1, six days after leaving the ship, he tested positive for SARS-CoV-2 at a Hong Kong hospital [12]. Japanese authorities were informed about this test result. Group activities continued on board through February 4, when the authorities announced positive test results for SARS-CoV-2 for another ten people on board. The ship was quarantined by the Japanese Ministry of Health, Labour and Welfare for what was expected to be a 14-day period (February 5-19), off the Port of Yokohama [12]. Initially, passengers were quarantined in their cabins while the crew continued to work [14]. Only symptomatic cases and close contacts were tested for COVID-19, and PCR-confirmed positive passengers were removed and isolated in Japanese hospitals. As reported [11,15], phased attempts were made to test all passengers, including asymptomatic cases, starting on February 11. As of February 20, 619 cases had been confirmed (16.7% of the population on board), including 82 crew and 537 passengers [11]; and 50.5% of the COVID-19 cases on the DP were asymptomatic (consisting of both true asymptomatic and pre-symptomatic infections) [13], while an estimated proportion of 17.9% (95% credible interval: 15.5-20.2%) never developed symptoms [13]. Of 66 SARS-CoV-2 positive American DP travelers with complete symptom information, 14 (21%) were pre-symptomatic while on the ship [16]. Overall, 712 (19.2%) of the crew and passengers tested positive; of these, 331 (46.5%) were asymptomatic at the time of testing [15].
The R0 of COVID-19 on the DP has been estimated previously [17]; that study estimated R0 as 14.8 initially, declining to a stable 1.78 after the quarantine and removal interventions, assuming a 70% reduction in contact rate. That research does not take account of asymptomatic infections. Other researchers using the DP data up to February 16 have estimated the median R0 as 2.28 [18]. They found that R0 remained high despite quarantine measures, while concluding that estimating R0 was challenging due to the difficulty in identifying the exact number of infected cases. The R0 values have important implications for predicting the effects of interventions. The threshold for combined vaccine efficacy and herd immunity needed for disease extinction is $1 - 1/R_0$. At R0 = 2, the threshold is 50%, while at R0 = 4, this threshold increases to 75%.
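The threshold formula above is simple enough to check directly. The following minimal Python sketch (the function name is ours, not from the paper or its S1 File) reproduces the two values quoted in the text.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Return 1 - 1/R0, the immune fraction needed for disease extinction."""
    if r0 <= 1.0:
        # With R0 <= 1 each case infects at most one other on average,
        # so no additional immunity is required.
        return 0.0
    return 1.0 - 1.0 / r0


for r0 in (2.0, 4.0):
    print(f"R0 = {r0}: threshold = {herd_immunity_threshold(r0):.0%}")
```

Doubling R0 from 2 to 4 raises the threshold from one half to three quarters of the population, which is why the estimate of R0 matters for planning interventions.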
We investigated the changes in R0 for COVID-19 on the DP from January 21 to February 19 with a chain binomial model [19][20][21] at different times under two scenarios: no quarantine, assuming a random mixing condition before February 5; and quarantine of passengers in cabins from February 5 to 19, during which passengers may get infected either by an infectious case in a shared cabin or by pre/asymptomatic crew who continued to work. This work adds to the growing knowledge gained from the DP data in that we estimate R0 by (1) mimicking the quarantine conditions in practice, and (2) taking explicit account of the presence of the pre/asymptomatic crew and phased removal of infectious cases. The chain binomial model, originally proposed in [19], belongs to the broader class of stochastic discrete-time SIR models and has been commonly used in the analysis of infectious disease spread [20,21], such as measles [20,22] and Ebola [20]. The chain binomial model occurs naturally in cohort studies such as the DP case considered here, where the individuals at risk in one serial interval are the survivors from the previous serial interval, so that the conditional distributions are binomial. Each binomial probability mass function is conditioned on the previous one, and "products of these probability mass functions give the probabilities of particular sequences of binomial realizations" [21]. "The simple chain structure allows for statistical inference based on likelihood theory" [21]. In our analysis of the DP data, the novelty of the work is that we incorporate the asymptomatic ratio in estimating R0 based on the chain binomial model, which has not been explored in the literature.

The funders did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. There was no additional external funding received for this study.

Competing interests:
The authors have read the journal's policy. We wish to confirm that there are no known conflicts of interest associated with this paper and there has been no significant financial support for this work that could have influenced its outcome. AT&T provided the salary for LL but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Data
We collected publicly available data on the outbreak on the DP from January 21 to February 19 [11][12][13]. We set January 21 as day 1, since January 20 was the start date (day 0) of the cruise. February 19 (day 30) was the date on which most passengers were allowed to leave the ship. For dates on which Y, the number of new COVID-19 cases, was not reported, linear interpolation was used. For example, 67 new cases were reported on February 15, but no data were reported on February 14; after linear interpolation, the daily values of Y became 33 and 34 for February 14 and 15 respectively. Based on the documented onset dates [11], there were 34 cases with onset dates before February 6, and we further adjusted the number of confirmed cases on February 3, 6, and 7 from 10, 10, and 41 cases to 17, 17, and 27 cases respectively. We chose serial intervals τ of 5 and 6 days because both are factors of 30 and lie between the reported estimates of 4 days [23,24] and 7.5 days (95% CI 5.3-19) [8]. Daily data were then aggregated into 5- and 6-day intervals as binomial realizations. The S1 File contains R code [25] for reproducing our calculations.
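The two preprocessing steps described above, spreading a count over unreported days and aggregating daily counts into serial intervals, can be sketched as follows. This is an illustrative Python sketch rather than the authors' R code from the S1 File. The helper names and the rounding convention (later days receive the extra case) are assumptions, chosen so that the 67-case example from the text gives 33 and 34.

```python
def split_evenly(total: int, days: int) -> list[int]:
    """Spread a count reported on one day over `days` consecutive days.

    Linear interpolation of the cumulative count gives equal daily
    increments; the remainder is assigned to the latest days (assumption).
    """
    base, extra = divmod(total, days)
    return [base] * (days - extra) + [base + 1] * extra


def aggregate(daily: list[int], tau: int) -> list[int]:
    """Sum daily counts into consecutive serial intervals of tau days."""
    if len(daily) % tau != 0:
        raise ValueError("number of days must be a multiple of tau")
    return [sum(daily[i:i + tau]) for i in range(0, len(daily), tau)]


# The example from the text: 67 cases reported on February 15 cover two days.
print(split_evenly(67, 2))  # [33, 34]

# Hypothetical 30-day series (one case per day), NOT the DP data.
toy_daily = [1] * 30
print(aggregate(toy_daily, 5))
print(aggregate(toy_daily, 6))
```

Because 30 is divisible by both 5 and 6, the 30-day observation window splits into whole serial intervals in either case.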

Chain binomial model with asymptomatic ratio
The chain binomial model assumes that an epidemic is formed from a succession of generations of infectious individuals from a binomial distribution [20,21]. For the DP data, the initial population size is $N_{t=0} = 3711$, where time $t$ is the duration measured in units of the serial interval. To model the dynamics on the ship for the case τ = 6, we make the following assumptions, stated in chronological order.
(a) From January 21 to 26 ($t = 1$, the first serial interval), infectious contacts happened at random following the random mixing assumption. Let $I_t$ be the number of persons infected at time $t$. Then $I_{t=1}$ is a binomial random variable $B(N_{t=0}, p_1)$ with binomial transmission probability $p_1 = 1 - \exp(-\beta \times I_{t=0}/N_{t=0})$, where β is the transmission rate and $I_{t=0} = 1$ (the first case, who disembarked on January 25). As in the SIR model, the probability that a subject escapes infectious contact is assumed to be $\exp(-\beta \times I_{t=0}/N_{t=0})$.
(b) From January 27 to February 1 ($t = 2$), infectious contacts again followed the random mixing assumption. Hence $I_{t=2}$ is a binomial random variable $B(N_{t=1}, p_2)$ with $p_2 = 1 - \exp(-\beta \times I_{t=1}/N_{t=0})$, and the number of persons at risk of infection is $N_{t=1} = 3711 - I_{t=1}$.
(c) For the period February 2 to 7 ($t = 3$), $I_{t=3}$ is a binomial random variable $B(N_{t=2}, p_3)$ with $N_{t=2} = N_{t=1} - I_{t=2}$ and $p_3 = 1 - \exp(-\beta \times (I_{t=1} + I_{t=2})/N_{t=0})$. As the quarantine started on February 5 and confirmed cases were removed, the numbers of persons infected, removed, and at risk of infection at the end of the $t = 3$ period were $I_{t=3}$, $(I_{t=1} + I_{t=2})$, and $N_{t=3} = N_{t=2} - I_{t=3}$ respectively.
(d) For $t = 4$ and $t = 5$, during the quarantine of passengers, $N_t = N_{t-1} - I_t$, and infectious cases were removed. We further make the following assumptions.
(i) Of all infected cases, 86.8% (= 537/619) were passengers and 13.2% (= 82/619) crew [11]. We use these proportions to calculate the number of infected persons in each group, $Ip_t$ and $Ic_t$ respectively, with $I_t = Ip_t + Ic_t$. This assumption is imposed because no public data are available for the time course of $Ip_t$ and $Ic_t$.
(ii) Crew members continued to work unless showing symptoms; hence the binomial transmission probability for crew remained $1 - \exp(-\beta \times I_{t-1}/N_{t-1})$ for $t = 4$ and 5. In other words, crew were randomly mixing in the population on board.
(iii) Passengers stayed in cabins most of the time. Assume that among infected passengers $Ip_t$, the proportion of infections that occurred in cabins is $r_p$, and that the average occupancy per cabin is 2. For those $Ip_{t-1}$ cases, $t = 4, 5$, the binomial transmission probability to infect $Ip_t \times r_p$ passengers in cabins is $1 - \exp(-\beta/2)$. In [11], $r_p = 0.2$ (= 23/115), while we assume $r_p = 0.2$ and 0.3 when τ = 6. Under this assumption, $r_p \le Ip_{t-1}/Ip_t$, and thus $r_p$ cannot be made arbitrarily large.
(iv) The other $(1 - r_p)$ proportion of infected passengers' cases was possibly due to pre/asymptomatic crew members who continued to perform service [11], and their binomial transmission probability is assumed to be $p_t = 1 - \exp(-\beta \times \text{aratio} \times Ic_{t-1}/C_{t-1})$, where aratio is the pre/asymptomatic ratio and $C_{t-1}$ is the number of crew members on board at time $t-1$. That is to say, these passengers were randomly mixing in the crew population with possible infectious contact with pre/asymptomatic crew. In our calculations, aratio = 0.4, 0.465 [15], 0.505 [13], and 0.6.
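The transmission probabilities in assumptions (a)-(d) all share the form "one minus the probability of escaping infectious contact". The following Python sketch collects them in one place. It assumes the escape probability is exp(-β × exposure), as in the standard chain binomial and SIR formulation. The function names are hypothetical and this is not the authors' R implementation.

```python
import math


def p_random_mixing(beta: float, infectious: float, population: float) -> float:
    """Assumptions (a)-(c) and (d)(ii): random mixing in the whole population."""
    return 1.0 - math.exp(-beta * infectious / population)


def p_cabin(beta: float, occupancy: int = 2) -> float:
    """Assumption (d)(iii): infection from an infectious cabinmate."""
    return 1.0 - math.exp(-beta / occupancy)


def p_crew_to_passenger(beta: float, aratio: float,
                        infected_crew: float, crew_on_board: float) -> float:
    """Assumption (d)(iv): infection from pre/asymptomatic crew."""
    return 1.0 - math.exp(-beta * aratio * infected_crew / crew_on_board)


def split_cases(total_cases: float) -> tuple[float, float]:
    """Assumption (d)(i): constant passenger/crew shares 537/619 and 82/619."""
    return total_cases * 537 / 619, total_cases * 82 / 619


# First serial interval: one infectious case among 3711 people.
print(p_random_mixing(3.78, 1, 3711))
print(p_cabin(3.78))
print(split_cases(100))
```

With β = 3.78 the per-person probability in the first interval is about 0.1%, whereas the within-cabin probability under (d)(iii) is much larger because the exposure is concentrated on a single cabinmate.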
Assumptions (a)-(c) correspond to no quarantine under a random mixing condition, and assumption (d) to quarantine of passengers in cabins, in which passengers may get infected either (d)(iii) by an infectious case in a shared cabin or (d)(iv) by pre/asymptomatic crew who continued to work. The maximum likelihood (ML) approach was used to estimate β. We developed R [25] code by modifying R functions in [20] for the chain binomial model, and the ML step was carried out using the R package bbmle [26]. The associated R code is provided in the S1 File. For t = 4, 5, R0 is the number of persons (passengers or crew) at risk (at time t) times either (d)(iv) $1 - \exp(-\beta \times \text{aratio}/C_{t-1})$ for passengers potentially infected by pre/asymptomatic crew, or (d)(ii) $1 - \exp(-\beta/N_{t-1})$ for crew. In the case of τ = 6, $r_p = 0.2$ [11], and aratio = 0.505 [13], 100 stochastic simulations of the chain binomial model with N = 3,700 and the estimated β were performed to visually compare the simulated uncontrolled epidemics with the observed DP data.
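To illustrate the ML step, the sketch below writes the chain binomial log-likelihood as a sum of binomial log-probabilities and maximizes it over β by a simple grid search. It is a deliberately simplified stand-in for the authors' R code and the bbmle optimizer: it covers only random-mixing steps, the function names are hypothetical, and the single data step used in the example is invented for illustration, not taken from the DP data.

```python
import math


def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the Binomial(n, p) probability of observing k."""
    if p <= 0.0:
        return 0.0 if k == 0 else -math.inf
    if p >= 1.0:
        return 0.0 if k == n else -math.inf
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log1p(-p)


def log_likelihood(beta: float, steps: list[tuple[int, int, int]], n0: int) -> float:
    """Sum of log-probabilities over steps of (at_risk, infectious, new_cases)."""
    total = 0.0
    for at_risk, infectious, new_cases in steps:
        p = 1.0 - math.exp(-beta * infectious / n0)
        total += log_binom_pmf(new_cases, at_risk, p)
    return total


def fit_beta(steps: list[tuple[int, int, int]], n0: int) -> float:
    """Grid-search ML estimate of beta on 0.01, 0.02, ..., 10.00."""
    grid = [i / 100 for i in range(1, 1001)]
    return max(grid, key=lambda b: log_likelihood(b, steps, n0))


# Hypothetical single step: 1000 at risk, 10 infectious, 30 new cases.
toy_steps = [(1000, 10, 30)]
print(fit_beta(toy_steps, n0=1000))
```

For a single step the estimate can be checked by hand: the observed proportion 30/1000 equals 1 - exp(-β × 10/1000) when β = -100 ln(0.97), roughly 3.05, and the grid search lands next to that value.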
The calculation for the case τ = 5 is analogous: the period January 21 to 25 follows (a); the period January 26 to 30 follows (b); the period January 31 to February 4 follows (c); and the periods February 5 to 9, February 10 to 14, and February 15 to 19 follow (d), with $r_p = 0.2$, which is the maximum value given the constraint $r_p \le Ip_{t-1}/Ip_t$.

Estimation of R 0
For the DP COVID-19 outbreak, Table 1 gives the estimates of β and their 95% confidence intervals. Since β is the basic reproductive number R0 at the beginning of the epidemic (t = 1, 2, 3 when τ = 5, and t = 1, 2 when τ = 6), we observe from Table 1 that the estimated R0 for the initial period is greater than 3 in every one of the scenarios that we considered. In addition, given τ and $r_p$, the estimated β increases as aratio decreases. Thus if aratio is smaller than 40%, the estimated β would be larger than those in Table 1. With $r_p = 0.2$ in Table 1, comparing the estimated R0 between τ = 5 and τ = 6, a longer interval leads to a larger estimate of R0 and vice versa.
When aratio = 0.505 [13], the estimated R0 as a function of t and its 95% CI are given in Table 2 and illustrated in Fig 1 for the case of $r_p = 0.2$. We observe that when τ = 6 and $r_p = 0.2$, the R0 for passengers in (d)(iv) increases from 3.78 at t = 3 to 4.73 and 4.39 at t = 4 and 5 respectively, while the R0 for crew in (d)(ii) decreases to 1.06 and 1.05 respectively. This shows that R0 for some passengers increased from t = 3 to t = 4, 5 if they were in contact with pre/asymptomatic crew. On the other hand, the R0 for crew at t = 4, 5 is small and close to 1, since infected passengers were removed and crew were exposed to fewer cases. With a higher $r_p = 0.3$ and the same τ = 6 and aratio = 0.505, the estimated β = 4.20 (Table 1) is larger than in the case with $r_p = 0.2$; the R0 for passengers in (d)(iv) increases to 5.26 and 4.88 at t = 4 and 5 respectively (Table 2), and the R0 for crew in (d)(ii) decreases to 1.18 and 1.17 respectively. For the case of τ = 5, $r_p = 0.2$, aratio = 0.505, and estimated β = 3.27 (Table 1), Table 2 shows that the R0 for passengers in (d)(iv) again increases, to 4.18, 4.08, and 3.74 at t = 4, 5, and 6 respectively, and the R0 for crew in (d)(ii) is below 1 (0.92, 0.92, and 0.91 respectively). Excluding infections between passengers sharing the same cabin, the combined R0 for passengers and crew is also given in Table 2: 2.90 and 2.73 for t = 4 and 5 respectively when τ = 6, $r_p = 0.2$, and aratio = 0.505, decreasing from the initial R0 = 3.78, which illustrates the limited effects of quarantine if pre/asymptomatic cases were present. Similarly, when τ = 5, $r_p = 0.2$, and aratio = 0.505, the combined R0 for passengers and crew is 2.55, 2.50, and 2.32 for t = 4, 5, and 6 respectively.
To understand the dynamics of no quarantine with a high R0, Fig 2 shows 100 stochastic simulations of the chain binomial model based on N = 3,700 and β = 3.78, assuming no quarantine, infected cases removed, τ = 6, and extrapolation to 90 days, together with the observed epidemic (red line). It suggests that the quarantine on the DP did prevent a more serious outbreak. Had there been no quarantine, the cumulative number of cases at the end of 30 days would have a mean of 856 (SD = 440) and a median of 791 (IQR = 538), while the observed DP data of 621 cases [11] is at the 35th percentile. In 99 of the 100 simulations, the entire population is infected by the end of 54 days, and in the remaining simulation, the entire population is immune, not infected at all.
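A no-quarantine chain binomial simulation of this kind can be sketched in a few lines of Python. The version below is a simplified illustration, not the code behind Fig 2: it starts from a single infectious case, removes each generation of cases after one serial interval, and uses the standard library's random generator. The values N = 3,700, β = 3.78, and 15 generations (90 days at τ = 6) follow the text; everything else, including the function name, is an assumption.

```python
import math
import random


def simulate_chain_binomial(beta: float, population: int, initial_infectious: int,
                            generations: int, seed=None) -> list[int]:
    """Return the number of new cases in each generation (serial interval)."""
    rng = random.Random(seed)
    susceptible = population - initial_infectious
    infectious = initial_infectious
    new_cases = []
    for _ in range(generations):
        p = 1.0 - math.exp(-beta * infectious / population)
        # Binomial(susceptible, p) draw as a sum of Bernoulli trials.
        infected = sum(1 for _ in range(susceptible) if rng.random() < p)
        susceptible -= infected
        infectious = infected  # the previous generation is removed
        new_cases.append(infected)
    return new_cases


# 100 runs, 15 generations of 6 days each (90 days).
runs = [simulate_chain_binomial(3.78, 3700, 1, 15, seed=s) for s in range(100)]
cumulative_day_30 = sorted(sum(run[:5]) for run in runs)
print("median cumulative cases after 30 days:", cumulative_day_30[50])
print("runs that died out immediately:", sum(1 for run in runs if sum(run) == 0))
```

Because this sketch differs in detail from the authors' simulation (for example in the seed and in how removal is timed), its summary statistics should not be expected to match the figures quoted above. It is meant only to show the mechanics of drawing successive binomial generations.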

Mathematical explanation
Let us explain mathematically why the R0 increased for passengers in contact with pre/asymptomatic crew. Denote the number of passengers in assumption (d)(iv) as $Pa_t$. To reduce their R0 to a number smaller than the initial β, a sufficient condition is $\text{aratio} \le C_{t-1}/Pa_t$, and the DP crew-passenger ratio at t = 0 is 1045/2666 = 0.39 [12]. This suggests that as long as aratio is ≥40%, R0 for passengers in (d)(iv) would remain high due to pre/asymptomatic crew. The higher the aratio is, the more infectious contacts would happen. Thus, for a virus with a high aratio, the conventional quarantine procedure may not be effective in stopping the spread of the virus, highlighting the importance of early and effective surveillance.
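The role of the crew-passenger ratio can be checked numerically. The Python sketch below evaluates the passenger R0 from (d)(iv), the number of passengers at risk times 1 - exp(-β × aratio / crew), for several values of aratio. Using the t = 0 totals (2,666 passengers, 1,045 crew) for every value is an illustrative simplification, and the function name is ours.

```python
import math


def passenger_r0(beta: float, aratio: float,
                 passengers_at_risk: float, crew_on_board: float) -> float:
    """R0 for passengers exposed to pre/asymptomatic crew, as in (d)(iv)."""
    return passengers_at_risk * (1.0 - math.exp(-beta * aratio / crew_on_board))


BETA = 3.78                      # initial estimate for tau = 6
CREW, PASSENGERS = 1045, 2666    # totals on board at t = 0 (simplification)

print("crew-passenger ratio:", round(CREW / PASSENGERS, 2))
for aratio in (0.3, 0.39, 0.4, 0.505, 0.6):
    r0 = passenger_r0(BETA, aratio, PASSENGERS, CREW)
    status = "above" if r0 > BETA else "below"
    print(f"aratio = {aratio}: passenger R0 is {status} the initial beta")
```

Values of aratio below the crew-passenger ratio keep the passenger R0 under the initial β, while the reported values of 0.465 and 0.505 lie above it, consistent with the increase seen at t = 4 and 5.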
We then derived a sufficient condition to achieve R0 ≤ 1 for both passengers and crew: $\beta \times \text{aratio} \times Pa_t \le C_{t-1}$ and $C_t \le N_{t-1}/\beta$. We attempt to interpret this condition as follows. For a β of about 3, $C_t \le N_{t-1}/\beta$ means that the number of people who continued to work during quarantine is less than one-third of the population, which is generally the case during quarantine. However, $\beta \times \text{aratio} \times Pa_t \le C_{t-1}$ may not be satisfied, depending on the aratio value. When aratio is close to 0, this condition is satisfied, but when aratio is high, the pre/asymptomatic crew continue to spread the virus and passengers staying in cabins cannot escape infectious contacts. This suggests that if the true aratio value is high, the current "stay-at-home" quarantine procedure may not be sufficient to reduce R0 to ≤ 1 and to eliminate the virus completely.
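One way to see where these sufficient conditions come from is the elementary bound $1 - e^{-x} \le x$ for $x \ge 0$, applied to the group-specific R0 defined for t = 4, 5 in the chain binomial section. The following is a sketch of that reasoning under the assumption that the escape probability has the form $e^{-x}$. It is not a derivation quoted from the authors.

```latex
\begin{align*}
R_0^{\text{passengers}}
  &= Pa_t \left(1 - e^{-\beta \cdot \text{aratio}/C_{t-1}}\right)
   \le \frac{\beta \cdot \text{aratio} \cdot Pa_t}{C_{t-1}}, \\
R_0^{\text{crew}}
  &= C_t \left(1 - e^{-\beta/N_{t-1}}\right)
   \le \frac{\beta \cdot C_t}{N_{t-1}}.
\end{align*}
% The passenger bound is at most \beta when aratio <= C_{t-1}/Pa_t,
% and at most 1 when \beta * aratio * Pa_t <= C_{t-1}.
% The crew bound is at most 1 when C_t <= N_{t-1}/\beta.
```

Because the bound $1 - e^{-x} \le x$ is nearly tight for the small exposures involved here, the sufficient conditions are also close to necessary, which is why an aratio slightly above the crew-passenger ratio already pushes the passenger R0 above β.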

Discussion
The effects of the pre/asymptomatic population on the spread of COVID-19 during quarantine continue to provide a research case with great possibilities for gaining a better understanding of the pandemic. Clinical observations and lab tests have confirmed the existence of a pre/asymptomatic population infecting others [5][6][7]. It is not easy to estimate the size of this population, yet the DP outbreak provides useful real-world data for this purpose. Of the COVID-19 cases among passengers and crew members on the DP, 50.5% were pre/asymptomatic, and an estimated 17.9% of the infected individuals never developed symptoms [13]. In a subsample of American DP travelers [16], 14 (21%) cases were pre-symptomatic while on the ship. In a retrospective study [27] of 104 DP COVID-19 cases, asymptomatic cases showed milder CT severity scores than symptomatic cases.
In this study, the estimates of R0 for the initial period (Table 1) are all greater than 3, consistent with most estimates of R0 reported earlier, showing that the COVID-19 virus is highly contagious [28][29][30]. The novelty of our approach has been to incorporate the pre/asymptomatic ratio into the chain binomial model to account for the possibly infectious contacts between quarantined passengers and pre/asymptomatic crew. The results show that with a serial interval of 6 days, R0 is similar for t = 1-3, yet R0 for some passengers in assumption (d)(iv) is higher for t = 4, 5. The results suggest that the observed proportions of infections, 86.8% (= 537/619) for passengers and 13.2% (= 82/619) for crew [11], are possible, and we find evidence to support a US CDC report that "a high proportion of asymptomatic infections could partially explain the high attack rate among cruise ship passengers and crew" [15].
Some research has suggested that the pre/asymptomatic population, the "silent carriers," is the main driving force behind this pandemic. A group [3] has estimated that the proportion of undocumented infections in China (including those who experience mild, limited or no symptoms and go undiagnosed) could be as high as 86% prior to January 23, 2020. They estimated the transmission rate of undocumented infections as 55% of the rate for documented infections, and yet that undocumented infections contributed to 79% of documented cases. Another group of researchers found that the total contribution from the pre/asymptomatic population is more than that of symptomatic patients [9]. Future studies to estimate the pre/asymptomatic ratio and the time when asymptomatic persons become infectious are needed to evaluate the effects of various control strategies [10,31,32].
The strength of this analysis is that it incorporates pre/asymptomatic infections in the DP data in a way not explored earlier. However, there are also limitations. First, due to inadequate data on the time course of infections among crew and passengers, assumption (d)(i) assumes a constant proportion, which may vary with time in practice. Second, the values of the parameter $r_p$ assumed in the present study may not be sufficiently large due to mathematical constraints. A study [16] based on a subsample of American passengers on the DP reported a high attack rate of 63% (27/43) for those sharing a cabin with an asymptomatic infected cabinmate. Third, assumptions (a)-(d) under chain binomial models may not be sufficient to capture the complexity of the COVID-19 epidemic. Fuller data reporting is important for researchers to develop statistical methodology to help combat this pandemic.
Almost all of the passengers on the DP were tested before they were evacuated. However, it is impractical to test everyone in the real world, especially pre/asymptomatic cases. On the DP, crew members continued to perform service unless they showed symptoms. This provides a parallel to people doing "essential work" in society who are thus exempt from shelter-in-place rules. Our study suggests that if the pre/asymptomatic ratio is high, the conventional quarantine might not be sufficient to reduce R0 to below 1, implying that a combination of preventive measures is needed to stop the spread of the virus [32]. The DP was docked in Vietnam and Taiwan on January 27-28 and January 31, 2020, respectively, and yet both reported low COVID-19 incidence rates [33], suggesting that the virus can be contained with early and appropriate measures.
Supporting information
S1 File. File containing data and R-code related to this article.