Methodology for predicting hospital admissions and evaluating recovery rates for coronavirus disease in Japan

Abstract

In this study, we aimed to propose a method to predict the number of patients needing hospitalization using a combination of available technologies. We developed a method to predict the number of hospital admissions by combining a simple susceptible-infected-recovered (SIR) model with the relationship between the number of new positive cases and the number of hospital admissions, increasing the reliability of each prediction. The accuracy of the concordance between the actual number of patients and the predicted number of hospitalized patients was 99%. Owing to the high accuracy, we were also able to establish a method to evaluate recovery rates. This facilitated determination of the effectiveness of measures implemented throughout Japan to reduce the number of treatment days. The model developed in this study facilitates immediate estimation of the maximum number and timing of hospitalizations based on the peak of new positive cases. Moreover, it provides a statistically true value of the recovery rate required by the mathematical model for investigating countermeasures.

Introduction

The coronavirus disease (COVID-19) outbreak, which was declared a pandemic by the World Health Organization in March 2020, disrupted healthcare services and presented challenges for healthcare professionals worldwide [1–4]. In Japan, the insufficiency of hospital beds to accommodate the rapidly increasing number of patients requiring treatment became a pertinent concern [5,6]. To surmount these issues in the future, a predictive model for patient hospitalization is needed. Mathematical models and artificial intelligence have been explored for developing methods to predict the spread of infection [7–9]. The existing methods for predicting the spread of COVID-19 offer advantages such as good accuracy and good visualization, but they also have limitations such as the incorporation of incomplete or inaccurate data [10].

Applying predictive models of COVID-19 in clinical practice remains a cumbersome endeavor for medical practitioners [11]. Studies using mathematical models are limited by small sample sizes and insufficient testing of hypotheses, which can lead to overfitting [9,12]. To curtail these drawbacks, samples should be as large as possible, such as a community of several millions of people or more. The utilization of artificial intelligence currently requires the combination of various types of big data with intelligent tools such as machine learning to build predictive models [13]. Predicting the number of infected individuals is challenging with both methods, primarily because several factors are included as parameters for the prediction, and the amount and type of data required increase accordingly.

Forecasting the shortage of beds in a ward entails prediction of the number of hospital admissions, which corresponds to the quarantined persons (Q) in the susceptible-infected-quarantined-recovered (SIQR) mathematical model [14]. The function Q(t) is obtained by solving simultaneous differential equations involving the susceptible (S), infected (I), and recovered (R) variables. However, this function has the same issues as those of the mathematical model described above. We previously proposed that infections can be decomposed into a linear combination of chain reactions [15]. Thus, if the spread of infection occurs within a community where the mean-field approximation holds, it can be approximated using the SIR model function. This function can be fitted to a single local peak and used to predict the future within the period of that peak.

Therefore, this study aimed to devise a method for estimating the number of hospital admissions based on the number of new positive cases. Determining the number of beds needed several days in advance will help maintain a functional and effective medical system.

As a corollary of this objective, we postulated that it would be possible to compare the actual treatment recovery rate of quarantined persons with the estimated recovery rate (γ = 0.1) used in the SIR model [15] of community-acquired infection spread. Since the time trend of the average recovery rate for hospitalized patients can be obtained as measured data, the mathematical model can be used to examine various infection control factors that require consideration.

Materials and methods

Data sources for the number of daily positive cases and hospital admissions

The open data used in this study included the number of daily positive cases and number of patients requiring treatment published by the Ministry of Health, Labour and Welfare in Japan. The data for daily positive cases are available at https://covid19.mhlw.go.jp/public/opendata/newly_confirmed_cases_daily.csv. The data for the number of people requiring treatment are available at https://covid19.mhlw.go.jp/public/opendata/requiring_inpatient_care_etc_daily.csv. Ethics approval and patient consent were not required as this study dealt only with published data on the number of positives and hospital admissions, and does not contain any personal, identifiable information.

The tabulation of patients requiring treatment was interrupted in September 2022 in some regions; thus, the period of investigation in this study was adjusted to coincide with the valid data range for each prefecture. Data compilation was suspended on September 26, 2022, in accordance with the government notification revising the reporting of all cases [16,17]. The analysis used two tabulated series: the daily number of new positives and the daily number of patients requiring treatment. The latter includes a few patients with severe illness.

Conceptual approach to estimating inpatient admissions

To predict the number of individuals requiring treatment, we derived an equation (described in the following section and shown in Fig 1) that relates the number of new positive cases to the number of hospital admissions. The core procedure used this equation to convert either published values or solutions of the mathematical model into prediction results. The inputs were the daily aggregates of new positives and of patients requiring treatment, or the solution of a mathematical model that conforms to community-acquired infections. Accordingly, the prediction system consisted of daily data, the solution of the mathematical model, and the prediction tool, as shown in Fig 1. The simplest SIR model, with a solution fitted using the infection rate β, recovery rate γ, and community population N, was adopted [15]. The prediction tool used two methods. The first converted the number of new positives into the number of hospital admissions; the second verified the concordance between the estimates and the ground truth. The actual recovery rate was verified using the daily values of new positives and patients requiring treatment converted from the daily data. The new recovery rate obtained after this verification was the reciprocal of the number of treatment days of the quarantined patients. By contrast, the recovery rate employed in mathematical models that reproduce the spread of infection applies to community-acquired infections and is not the reciprocal of the number of inpatient treatment days. Nevertheless, the various factors that cause the recovery rates of hospitalized patients to vary over time can be used in mathematical models to predict the recovery rates of patients with community-acquired infections and in the data analysis [15] of the spread of infection. In that case, the recovery rate measured for hospitalized patients serves as feedback for γ in the mathematical model.

Fig 1. Diagram of relationships for the prediction.

The prediction is characterized by independent routines for calculating the number of hospital admissions. The system is composed of daily data, prediction tools, and a simple mathematical model.

https://doi.org/10.1371/journal.pone.0334643.g001
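
For reference, the simple SIR model mentioned above is conventionally written as shown below. This is the textbook form, given here as an assumption about notation and normalization rather than as a quotation from [15]: S, I, and R are the susceptible, infected, and recovered populations, β is the infection rate, γ is the recovery rate, N is the community population, and the daily number of new positives P(t) is taken to be the outflow from S.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta S I}{N}, \\
\frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I, \\
P(t) &= -\frac{dS}{dt}.
\end{aligned}
```

Under this form, S + I + R = N is conserved, and while S ≈ N the infected population grows approximately as exp((β − γ)t).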

The recovery rates of individuals are influenced by several factors, including the characteristics of the mutant strain, improvements in hospital treatment techniques, and government guidelines on control measures [18]. Consequently, a small sample leads to substantial uncertainty due to fluctuations, which renders the validation results unreliable. However, the dataset used in this study encompassed the entire Japanese population and was aggregated from all the prefectures. It is the largest such dataset, and all influential factors were averaged out to visualize only nationally representative trends.

The number of hospital admissions was predicted as follows. The solution of the SIR model was obtained by fitting it to the portion of the peak waveform observed up to the present day. In the model, γ represented a value known at that time, β was used as a fitting parameter, and N was an estimated parameter, as it provided the peak value of the spread of infection but was indeterminate at the start of the peak rise. The function P(t), which indicates the number of new positives, is typically defined as the daily decrease in the number of susceptible cases, which was one of the solutions of the SIR model. The value and timing of the peak number of hospital admissions were estimated from H(t) adjusted for N. The value of N was obtained by predicting P(t), and H(t) was transformed from P(t) as follows.

Mathematical procedures for estimation methods

As shown in Fig 1, patients with infections detected by polymerase chain reaction testing were quarantined in the hospital or at home as persons requiring treatment. The number of individuals requiring treatment increased with each new positive case but decreased with recovery after treatment. Accordingly, the differential equation for the number of individuals requiring treatment, H(t), is expressed as follows:

$$\frac{dH(t)}{dt} = P(t) - \gamma H(t) \tag{1}$$

where P(t) is the number of new positives and γ represents the recovery rate, or more precisely, the treatment completion rate; its reciprocal is the number of days required for treatment. For constant γ, the solution of this differential equation is equation (2):

$$H(t) = H(0)\,e^{-\gamma t} + \int_{0}^{t} P(\tau)\,e^{-\gamma (t-\tau)}\,d\tau \tag{2}$$

which can be expressed as a recurrence formula in equation (3), since P and H have discrete values for each day of unit time:

$$H_{n+1} = H_n + P_n - \gamma_n H_n \tag{3}$$

where H_n, P_n, and γ_n refer to H(t_n), P(t_n), and γ(t_n), respectively. Iterating equation (3) yields the time variation of the number of people requiring treatment (H) from the daily number of people who have tested positive (P). The time dependence of γ was determined to ensure the consistency of the calculated H with the number reported by the hospitals. The rate γ should be a constant or a step function that is constant for an appropriate duration so that it is not affected by daily data fluctuations. The number of admissions in equation (3) can be estimated using discrete values, such as the daily data or a function of the solution of the mathematical model.
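
The iteration described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: it assumes the recurrence takes the forward-difference form H[n+1] = H[n] + P[n] − γ[n]·H[n] with a one-day time step, and the function name `estimate_requiring_treatment` and its parameters are hypothetical.

```python
from typing import List, Sequence, Union


def estimate_requiring_treatment(
    new_positives: Sequence[float],
    gamma: Union[float, Sequence[float]],
    h0: float = 0.0,
) -> List[float]:
    """Iterate H[n+1] = H[n] + P[n] - gamma[n] * H[n] (assumed form of the recurrence).

    new_positives: daily new positive cases P[n].
    gamma: a constant recovery rate, or one value per day (e.g. a step function).
    h0: number of people requiring treatment before the first day.
    Returns H[0..len(new_positives)], i.e. one more element than the input.
    """
    n_days = len(new_positives)
    if isinstance(gamma, (int, float)):
        gammas = [float(gamma)] * n_days
    else:
        gammas = list(gamma)
    if len(gammas) != n_days:
        raise ValueError("gamma must be a scalar or have one value per day")

    h = [h0]
    for p, g in zip(new_positives, gammas):
        # Inflow of new positives minus outflow of patients completing treatment.
        h.append(h[-1] + p - g * h[-1])
    return h


if __name__ == "__main__":
    # Illustrative numbers only: a constant 100 new positives per day with gamma = 0.1
    # approaches the steady state P / gamma = 1000.
    curve = estimate_requiring_treatment([100.0] * 200, 0.1)
    print(round(curve[-1], 1))
```

With a constant inflow the sequence settles at P/γ, which is the discrete counterpart of setting dH/dt = 0 in equation (1).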

Results

Calculation of the number of hospitalized patients and estimation of the recovery rate

Fig 2 shows the actual and calculated numbers of hospitalized patients in Tokyo; the calculated values were obtained from equation (1). The recovery rate γ was 0.1 for the entire period, which corresponds to 10 days of treatment [19]. The figure shows several periods where the calculated number deviated from the actual number. To make them consistent, γ was approximated as a step function to make it as insensitive as possible to the fluctuations in daily data. The recovery rate of 0.1 was adequate for the period through December 2021, excluding the periods when variants emerged, namely those corresponding to the Alpha and Delta variants. The Omicron variant, including the BA.1, BA.2, and BA.5 sub-variants, emerged after January 2022. The same recovery rate of 0.1 was used in the mathematical model for community-acquired infections. This indicates that the mean hospital recovery rate can be employed in the actuarial model as the recovery rate for community-acquired infections. It was difficult to confirm γ = 0.1 for the Omicron period. The variant genomes were identified by the National Institute of Infectious Diseases (NIID) in Japan [20].

Fig 2. Number of inpatients and patients requiring treatment calculated from the new positive cases in Tokyo at a recovery rate of γ = 0.1. The solid line shows the data for inpatients in Tokyo, and the dashed line shows the calculated number of inpatients.

https://doi.org/10.1371/journal.pone.0334643.g002

The upper graph in Fig 3 shows the step function of γ, and the lower graph shows both curves and their overlap. Values of γ less than 0.1 indicated more days of treatment, which was observed for two events of worsening infections. The first lasted from January to February 2021, while the second lasted from February to May 2022; γ was 0.08 in both instances. Conversely, values of γ greater than 0.1 indicated fewer days of treatment, which were observed during two periods of declining infections. The first period lasted from October to November 2021 (γ = 0.13), while the second lasted from August to September 2022 (γ = 0.14). The concordance between the number of reported persons requiring treatment and the number of hospital admissions estimated from the reported new positive cases was approximately 99% or better. The coefficient of determination between the two was R² = 0.993. The test period was 985 days, from January 16, 2020, when the data became available, to September 26, 2022.

Fig 3. Number of inpatients and patients requiring treatment calculated from the number of new positive cases in Tokyo by introducing the recovery rate of the step function.

The step function for γ is shown at the top of the figure. The solid line shows the data for inpatients in Tokyo, and the dashed line shows the calculated number of inpatients.

https://doi.org/10.1371/journal.pone.0334643.g003
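
A step-function recovery rate and a concordance check of the kind described for Tokyo can be sketched as follows. The sketch rests on several assumptions: the period boundaries are rounded to whole months because the text gives only months, R² is computed as 1 − SS_res/SS_tot (the text does not state which definition was used), and the names `TOKYO_SCHEDULE`, `step_gamma`, and `r_squared` are hypothetical.

```python
from datetime import date
from typing import List, Sequence, Tuple

# (first day, last day, gamma); month boundaries are assumed, values are those quoted for Tokyo.
TOKYO_SCHEDULE: List[Tuple[date, date, float]] = [
    (date(2021, 1, 1), date(2021, 2, 28), 0.08),
    (date(2021, 10, 1), date(2021, 11, 30), 0.13),
    (date(2022, 2, 1), date(2022, 5, 31), 0.08),
    (date(2022, 8, 1), date(2022, 9, 26), 0.14),
]


def step_gamma(
    days: Sequence[date],
    schedule: Sequence[Tuple[date, date, float]],
    default: float = 0.1,
) -> List[float]:
    """Return one recovery rate per day: the scheduled value inside a period, else the default."""
    rates = []
    for day in days:
        rate = default
        for start, end, value in schedule:
            if start <= day <= end:
                rate = value
        rates.append(rate)
    return rates


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted):
        raise ValueError("series must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

The per-day rates returned by `step_gamma` can be fed into the daily recurrence for the number of people requiring treatment, and `r_squared` then compares the calculated series with the reported one.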

Recovery rates for all prefectures in Japan

A status diagram of γ values for all prefectures in Japan is shown in Fig 4, to facilitate the visualization of the systematic changes with common factors. In terms of the national trends of recovery rates, γ was greater than 0.1 for two periods, which is the same as the result for Tokyo. In contrast, there was no period in which γ was less than 0.1 across all prefectures, and the national trend was not necessarily the same as that of Tokyo. Therefore, the low values of γ, indicating that patients required longer treatment during a surge in hospital admissions, can be attributed to the treatment process and environment at hospitals in each region. It is difficult to determine the cause of a phenomenon that is not common throughout the country because it requires meticulous investigation using detailed data. The two periods with larger γ, corresponding to fewer days of treatment, are roughly estimated to be September–November 2021 and July–October 2022, as indicated by γlarge-ex. In July–August 2022, several prefectures had overlaps of the two states of γ > 0.1 and γ < 0.1 due to a steep surge in hospital admissions.

Fig 4. The γ distribution for each prefecture in Japan. Two periods with large recovery rates γlarge-ex were identified. The vertical axis shows the prefectures in order of the largest population from top to bottom, and the horizontal axis shows time on the same scale as that shown in Fig 2. To convey whether γ is greater or smaller than 0.1, the figure is color-coded in black and gray, respectively.

https://doi.org/10.1371/journal.pone.0334643.g004

Reflecting recovery rates in mathematical models

The γlarge-ex for the two periods highlighted in the previous section can be considered as the recovery rate of community-acquired infections, since the same value, irrespective of prefecture, implies a universal phenomenon. As the recovery rate increases, the infection rate apparently declines. Studies using the equivalent SIR model of the chain reaction [15,19], which closely reproduces the Tokyo data, have reported an anomaly, i.e., a sudden reduction in the infection rate during the same two periods as γlarge-ex. Reflecting the increased recovery rate of γlarge-ex during these periods resolved this anomaly in the new positive case data proportional to community-acquired infections.

These two periods of greater γ values were close to the timing of the nationwide vaccinations [21,22]. Specifically, the two periods correspond to the second half of the peak of the Delta mutant infection and the final tail of the BA.2 mutant infection. Additional statistical verification is required before establishing definitive causality between the recovery rate and vaccination.

Architecture for estimating the maximum number of inpatients from the daily positive cases

After the appropriate recovery rate for community-acquired infections was determined, the SIR model was used to predict future trends. The objective of the prediction method was to estimate the number of hospitalized patients as far in advance as possible at the time of infection spread. This method could not predict future outbreaks. Given that the initial components of the SIR model function can be approximated using an exponential function, the infection rate β can be determined by fitting a linear upward slope to the daily data of the P curve, as illustrated in the logarithmic representation in Fig 5a. The upward slope was β − γ, which did not permit determination of N with certainty. Consequently, we estimated N by applying the principle that the peak of new positives is contingent upon the size of the basic community of infection. Fig 5b shows the logarithmic representation of the function P with the infection rate β varied over 0.5, 0.4, 0.3, 0.2, 0.15, and 0.12. The initial condition was that there had to be one infected person at time t = 0.

Fig 5. Prediction procedure for new positive cases.

a) Estimation method with a single peak. b) Logarithmic representation of function P with varying infection rate β. N denotes the population of the community.

https://doi.org/10.1371/journal.pone.0334643.g005
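
The slope-fitting step can be illustrated with synthetic data. The sketch below is an assumption-laden example rather than the procedure used in the study: it integrates the textbook SIR equations with an explicit Euler scheme, starts from one infected person at t = 0, and estimates the early logarithmic slope by ordinary least squares; the function names and the population size are hypothetical.

```python
import math
from typing import List, Sequence


def simulate_sir_new_positives(
    beta: float,
    gamma: float,
    population: float,
    days: int,
    steps_per_day: int = 100,
) -> List[float]:
    """Return daily new positives, i.e. the decrease in S over each day (explicit Euler)."""
    s, i = population - 1.0, 1.0  # one infected person at t = 0
    dt = 1.0 / steps_per_day
    daily = []
    for _ in range(days):
        s_start = s
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            s -= new_infections
            i += new_infections - gamma * i * dt
        daily.append(s_start - s)
    return daily


def log_slope(values: Sequence[float]) -> float:
    """Least-squares slope of ln(values) against the day index."""
    n = len(values)
    xs = list(range(n))
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


if __name__ == "__main__":
    # Synthetic example: the early slope should be close to beta - gamma.
    p = simulate_sir_new_positives(beta=0.3, gamma=0.1, population=1e7, days=30)
    print(log_slope(p))
```

Because the early slope fixes only the difference β − γ, knowing γ gives β, but the community size N stays undetermined until the peak of new positives is observed.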

The value of N was determined when the data of new positives reached a maximum. At this time, the relationship between the maxima of P and H based on equation (2) was as follows: Hmax was approximately Pmax multiplied by 1/γ, and H reached its maximum value approximately 1/γ days after P. These formulae were used to calculate the peak of patient hospitalizations as soon as the peak of new positive results could be estimated. Fig 6a shows the relationship between the maximum number of patients requiring treatment (Hmax) and the time to reach it, as well as the maximum number of new positive results (Pmax) and the time to reach it. In more precise terms, the dependence of the maximum values of P and H on the infection rate β was calculated from equation (2), as illustrated in Fig 6b. Consequently, the maximum number of hospitalized patients was reached sooner as the infection rate increased. An increase in β from 0.1 to 0.3 indicated that the maximum number of hospitalizations (Hmax) declined from 10 to 8 times the maximum number of new positive cases (Pmax), and the difference in the number of days between the two maxima declined from 10 to 6, with γ fixed at 0.1 in both cases.

Fig 6. Prediction procedure for number of hospitalized patients.

a) Relationship between the peak values of P and H. b) Dependence of the relationship between both peak values on the infection rate β. Hmax represents the maximum number of admissions and Pmax is the maximum number of new positives.

https://doi.org/10.1371/journal.pone.0334643.g006
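
The dependence of the peak relationship on the infection rate can be explored numerically. The following sketch is illustrative only and makes its own assumptions: new positives come from a textbook SIR model integrated with an explicit Euler scheme, the number of people requiring treatment follows the daily update H ← H + P − γH, and the population size and function names are hypothetical. Since dH/dt = 0 at the hospitalization peak, γ·Hmax equals the value of P at that moment, so the ratio Hmax/Pmax cannot exceed 1/γ.

```python
from typing import List, Sequence, Tuple


def sir_daily_positives(
    beta: float, gamma: float, population: float, days: int, steps_per_day: int = 20
) -> List[float]:
    """Daily new positives from a textbook SIR model, starting with one infected person."""
    s, i = population - 1.0, 1.0
    dt = 1.0 / steps_per_day
    daily = []
    for _ in range(days):
        s_start = s
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            s -= new_infections
            i += new_infections - gamma * i * dt
        daily.append(s_start - s)
    return daily


def requiring_treatment(new_positives: Sequence[float], gamma: float) -> List[float]:
    """Apply H <- H + P - gamma * H once per day; entry n is the count after day n."""
    curve, current = [], 0.0
    for p in new_positives:
        current = current + p - gamma * current
        curve.append(current)
    return curve


def peak_relation(
    beta: float, gamma: float = 0.1, population: float = 1_000_000, days: int = 600
) -> Tuple[float, int]:
    """Return (Hmax / Pmax, days between the peak of P and the peak of H)."""
    p = sir_daily_positives(beta, gamma, population, days)
    h = requiring_treatment(p, gamma)
    p_max, h_max = max(p), max(h)
    return h_max / p_max, h.index(h_max) - p.index(p_max)


if __name__ == "__main__":
    for beta in (0.15, 0.2, 0.3, 0.5):
        ratio, lag = peak_relation(beta)
        print(f"beta={beta}: Hmax/Pmax={ratio:.2f}, lag={lag} days")
```

In this toy setting a faster epidemic produces a narrower peak of new positives, so both the ratio and the lag fall below their slow-epidemic limits of 1/γ and about 1/γ days, which is the qualitative trend described for Fig 6b.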

Discussion

In this study, we devised a method for predicting the number of hospital admissions based on the number of new positive cases. This method facilitated estimation of the number of individuals who would be quarantined using a simple SIR model. In addition, comparison of the predictions based on the number of new positive results and number of persons requiring treatment revealed variations in the number of days required for treatment. The recovery rate for community-acquired infections was derived from the number of treatment days required and can be incorporated into the mathematical model. This cyclical system can serve as a tool for forecasting the risk of hospital bed insufficiency. Concurrently, the temporal variation in the recovery rate can be used to discern alterations in the characteristics of the mutant strains and fluctuations in the efficacy of treatment in the hospital and national preventive measures.

Vaccination was the most probable cause of the increased recovery rates observed during the spread of the Delta strain, despite its tendency to cause severe disease [23]. The recovery rate during the second increase, in the BA.5 mutant period, was γ = 0.14, which is slightly higher than the recovery rate of γ = 0.13 during the period of predominance of the Delta mutant. This may be attributed to the overlap of the higher recovery rates over a longer period since March 2022. This is consistent with reports that the Omicron strain is less likely to cause severe disease [24]. It is also not inconsistent with reports of high vaccine efficacy for a period of less than 4 months [25,26]. Conversely, there were periods where the recovery rate was below 0.1, as shown in Figs 3 and 4. Since this value was not the same for all prefectures, it can be assumed that it depends on the situation of the region and hospitals, as mentioned above. The recovery rate could be determined for each hospital, and the results would be useful in initiating supportive measures for the hospital.

In light of these findings, this study focused on verifying whether the number of new positives and the number of hospital admissions can be used to accurately predict the recovery rate. In previous studies, the number of hospital admissions predicted using the SIQR model corresponds to the number of patients under quarantine [27–29]. Their main focus was investigating the formulation and solution of coupled differential equations, and their scope did not extend to the reproduction of the number of hospital admissions over a long period. Given these circumstances, it would be beneficial to apply the recovery rates determined by this method to mathematical models such as SIR, SIQR, and SEIR, which require the recovery rate parameter. Equation (1) is a “conservation equation” for the number of hospitalized patients. This relationship should hold true for any closed system, irrespective of the complexity of the model employed.

A limitation of this method is that the scenario on which the relationship in equation (3) and this estimation method are based is no longer valid. In other words, new positive cases are no longer all managed in quarantine as persons needing treatment. In such circumstances, it is necessary to model and validate the relationship on which the estimation method is based when predicting hospital admissions, considering the conditions of the newly admitted patients (e.g., the severity of illness).

However, applying this to each hospital is expected to reveal differences in factors specific to each hospital, such as disease severity, age, and comorbidities of the patients, and the healthcare system. If the data collection system utilized in this study can be established as a countermeasure in the next pandemic, it could be widely adopted internationally to determine recovery rates in mathematical models and predict bed shortages.

Conclusions

To address the risks of insufficient hospital beds, it is important not only to predict hospital admissions but also to leverage artificial intelligence tools to address the needs of patients [30] according to their medical conditions, in the event of a surge in hospital admissions. It is hoped that these efforts will curtail human intervention in the medical field. To mitigate the risk of a shortage of hospital beds during a pandemic emergency, it is imperative that strategies such as tools to predict the number of patients requiring treatment within a few days in response to trends in new positive cases and measures to secure sickbeds be prepared on a global scale.

Acknowledgments

I would like to thank Editage (www.editage.com) for English language editing.

References

  1. Berlin G, Singhal S, Lapointe M, Schulz J. Challenges emerge for the US healthcare system as COVID-19 cases rise. https://www.mckinsey.com/industries/healthcare/our-insights/challenges-emerge-for-the-us-healthcare-system-as-covid-19-cases-rise. 2020. Accessed 2024 June 10.
  2. Nakahara S, Inada H, Ichikawa M, Tomio J. Japan’s Slow Response to Improve Access to Inpatient Care for COVID-19 Patients. Front Public Health. 2022;9:791182. pmid:35141187
  3. World Health Organization WHO. COVID-19 continues to disrupt essential health services in 90% of countries. https://www.who.int/news/item/23-04-2021-covid-19-continues-to-disrupt-essential-health-services-in-90-of-countries. 2021. Accessed 2024 June 10.
  4. World Health Organization WHO. The COVID-19 pandemic and continuing challenges to global health. A healthy return: investment case for a sustainably financed. World Health Organization.
  5. Doi T. Weaknesses of medical care in Japan that may collapse due to corona: why is the number of days of hospitalization in Japan longer than the OECD average? https://toyokeizai.net/articles/-/342168. 2020. Accessed 2024 June 10.
  6. Hayashi M. Accelerate structural reform of the healthcare delivery system. https://www.yomiuri.co.jp/choken/kijironko/ckmedical/20210728-OYT8T50092/. 2021. Accessed 2024 June 10.
  7. Ghafouri-Fard S, Mohammad-Rahimi H, Motie P, Minabi MAS, Taheri M, Nateghinia S. Application of machine learning in the prediction of COVID-19 daily new cases: A scoping review. Heliyon. 2021;7(10):e08143. pmid:34660935
  8. Odagaki T. Analysis of the outbreak of COVID-19 in Japan by SIQR model. Infect Dis Model. 2020;5:691–8. pmid:32935071
  9. Zhao Y-F, Shou M-H, Wang Z-X. Prediction of the Number of Patients Infected with COVID-19 Based on Rolling Grey Verhulst Models. Int J Environ Res Public Health. 2020;17(12):4582. pmid:32630565
  10. Shah S, Mulahuwaish A, Ghafoor KZ, Maghdid HS. Prediction of global spread of COVID-19 pandemic: a review and research challenges. Artif Intell Rev. 2022;55(3):1607–28. pmid:34305251
  11. Shamsoddin E. Can medical practitioners rely on prediction models for COVID-19? A systematic review. Evid Based Dent. 2020;21(3):84–6. pmid:32978532
  12. Vadyala SR, Betgeri SN, Sherer EA, Amritphale A. Prediction of the number of COVID-19 confirmed cases based on K-means-LSTM. Array (N Y). 2021;11:100085. pmid:35083430
  13. Pham Q-V, Nguyen DC, Huynh-The T, Hwang W-J, Pathirana PN. Artificial Intelligence (AI) and Big Data for Coronavirus (COVID-19) Pandemic: A Survey on the State-of-the-Arts. IEEE Access. 2020;8:130820–39. pmid:34812339
  14. Bandekar SR, Ghosh M. Mathematical modeling of COVID-19 in India and its states with optimal control. Model Earth Syst Environ. 2022;8(2):2019–34. pmid:34127946
  15. Maki K. Analytical tool for COVID-19 using an SIR model equivalent to the chain reaction equation of infection. Biosystems. 2023;233:105029. pmid:37690531
  16. Ministry of Health, Labour and Welfare. Headquarters for promotion of countermeasures against infectious diseases of new-type coronaviruses, revision of the notification of all cases for the transition to a new phase of coronavirus. https://www.mhlw.go.jp/content/000993000.pdf
  17. Ministry of Health, Labour and Welfare. Headquarters for promotion of countermeasures against infectious diseases of new-type coronaviruses, revision of the period of treatment, etc. for patients with COVID-19. https://www.mhlw.go.jp/content/000987473.pdf
  18. Ministry of Health, Labour and Welfare. Headquarters for Promotion of Countermeasures against Infectious Diseases of New-type Coronaviruses, Handling of Admissions, Discharges, Concentrated Contacts, and Public Announcements Concerning Patients with Confirmed Infection with the B.1.1.529 Strain (Omicron Strain). 2024. https://www.mhlw.go.jp/content/000876461.pdf
  19. Maki K. An interpretation of COVID-19 in Tokyo using a combination of SIR models. Proc Jpn Acad Ser B Phys Biol Sci. 2022;98(2):87–92. pmid:35153271
  20. Ministry of Health, Labour and Welfare. PANGO lineage change of new Corona genome, Detection of novel coronaviruses by strain based on genome surveillance. 2022. https://www.mhlw.go.jp/stf/seisakunitsuite/newpage_00061.html
  21. Ministry of Health, Labour and Welfare. Summary of the number of vaccinations by vaccination date. https://www.mhlw.go.jp/content/001243481.csv
  22. Research and training institute. Changes in the emergency declaration periods, etc., for novel coronavirus infections. Japan: Ministry of Justice. 2024. https://hakusyo1.moj.go.jp/jp/69/nfm/n69_2_7_2_0_3.html
  23. Fisman DN, Tuite AR. Evaluation of the relative virulence of novel SARS-CoV-2 variants: a retrospective cohort study in Ontario, Canada. CMAJ. 2021;193(42):E1619–25. pmid:34610919
  24. Kumar A, Asghar A, Singh HN, Faiq MA, Kumar S, Narayan RK, et al. SARS-CoV-2 Omicron Variant Genomic Sequences and Their Epidemiological Correlates Regarding the End of the Pandemic: In Silico Analysis. JMIR Bioinform Biotechnol. 2023;4:e42700. pmid:36688013
  25. Thomas SJ, Moreira ED Jr, Kitchin N, Absalon J, Gurtman A, Lockhart S, et al. Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine through 6 Months. N Engl J Med. 2021;385(19):1761–73. pmid:34525277
  26. Tseng HF, Ackerson BK, Luo Y, Sy LS, Talarico CA, Tian Y, et al. Effectiveness of mRNA-1273 against SARS-CoV-2 Omicron and Delta variants. Nat Med. 2022;28(5):1063–71. pmid:35189624
  27. Atkins S, Malisoff M. Robustness of feedback control for SIQR epidemic model under measurement uncertainty. MCRF. 2025;15(1):68–100.
  28. Odagaki T. Exact properties of SIQR model for COVID-19. Physica A. 2021;564:125564. pmid:33250562
  29. Öz Y. Analytical investigation of compartmental models and measure for reactions of governments. Eur Phys J E Soft Matter. 2022;45(8):68. pmid:35978210
  30. Lalmuanawma S, Hussain J, Chhakchhuak L. Applications of machine learning and artificial intelligence for Covid-19 (SARS-CoV-2) pandemic: A review. Chaos Solitons Fractals. 2020;139:110059. pmid:32834612