Dynamical variations, impact factors, and prediction of echinococcosis in Xinjiang by ARIMA-Random Forest Hybrid Model

  • Fenghan Wang,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Software, Writing – original draft

    Affiliation Shanghai 411 Hospital, China RongTong Medical Healthcare Group Co.Ltd./411 Hospital, Shanghai University, Shanghai, China

  • Xuedong Yang,

    Roles Conceptualization, Data curation, Formal analysis, Methodology

    Affiliation Shanghai 411 Hospital, China RongTong Medical Healthcare Group Co.Ltd./411 Hospital, Shanghai University, Shanghai, China

  • Qianqian Zhang,

    Roles Software, Visualization

    Affiliation School of Global Health, Chinese Center for Tropical Diseases Research, Shanghai Jiao Tong University School of Medicine, Shanghai, China

  • Zengyun Hu,

    Roles Conceptualization, Validation, Writing – original draft, Writing – review & editing

    huzengyun@ms.xjb.ac.cn

    Affiliation School of Global Health, Chinese Center for Tropical Diseases Research, Shanghai Jiao Tong University School of Medicine, Shanghai, China

  • Xian Zhang,

    Roles Investigation, Validation

    Affiliation School of Public Health, Zhengzhou University, Zhengzhou, China

  • Jiangshan Zhao,

    Roles Investigation, Supervision, Writing – review & editing

    Affiliation Center for Disease Control and Prevention of Xinjiang Uygur Autonomous Region, Urumqi, China

  • Nazrullozoda Sulaimon

    Roles Investigation, Visualization

    Affiliation Institute of Veterinary Medicine of the Tajik Academy of Agricultural Sciences, Dushanbe, Republic of Tajikistan

Abstract

Background

Xinjiang is the second largest pastoral area in China and one of the country's main arid and semi-arid regions. Echinococcosis in Xinjiang poses a serious challenge to disease control and prevention and places heavy pressure on it.

Methods

We comprehensively investigated the temporal variations of echinococcosis in Xinjiang at multiple time scales during the period of 2004–2020. We examined the relationships between echinococcosis and five impact factors: four climate factors (Tmp: temperature, Pre: precipitation, RH: relative humidity, and SD: sunshine duration) and MR (medicine rate, the share of health expenditure in gross domestic product). Echinococcosis was then predicted by a hybrid of the ARIMA (autoregressive integrated moving average) and RF (random forest) models using the five factors.

Results

Echinococcosis shows a significant increasing trend in both confirmed cases and incidence rates, with annual trend values of 94.48 cases per year and 0.339 new cases per 100,000 population per year. It also shows nonlinear characteristics, with multiple periods of 3, 6, 13, 40, and 67 months for the confirmed cases, and 3, 6, 12, 34, and 73 months for the incidence rates. Among the impact factors, Tmp has a positive impact on echinococcosis and SD has a negative impact at the annual and seasonal scales. Pre has a positive impact on echinococcosis at the annual scale and in June, July, and August (JJA) and September, October, and November (SON). RH has a positive relationship in JJA. MR has a significant positive relationship with echinococcosis. The ARIMA-RF hybrid model performs well in predicting the echinococcosis variations.

Conclusions

Echinococcosis in Xinjiang shows a significant increasing trend during the period of 2004–2020. Tmp and MR have a positive impact on echinococcosis. The ARIMA-RF hybrid model predicts the disease variations well. Our findings describe the echinococcosis variations in Xinjiang in more detail, which is basic and important information for disease control and prevention.

1. Introduction

With ongoing global climate change and intensified human activities, the risk of zoonotic spillover has increased markedly [1]. It has been found that 58% of known human pathogenic diseases have been aggravated by climate change [2]. Zoonoses seriously threaten public health and global security and have caused the majority of recent global pandemics in humans [3]. China and Southeast Asia have high population densities and abundant wildlife, and their ecosystems are seriously affected by climate change and human activities. Therefore, China and Southeast Asia are zoonosis hotspot regions [4]. Recent studies show that the living environment and behavioral patterns of wildlife have been significantly affected by climate change and human activities, which will create new contact between humans and wildlife and eventually favor new zoonotic epidemics [1].

As one of the globally distributed zoonoses prevailing in humans and animals, echinococcosis is caused by adult or larval stages of tapeworms (cestodes) belonging to the genus Echinococcus (family Taeniidae) [5]. It can be found on all continents, with the highest prevalence in parts of Eurasia (especially Mediterranean countries, the Russian Federation and adjacent independent states, and China), north and east Africa, Australia, and South America [6]. Alveolar echinococcosis (AE) and cystic echinococcosis (CE) are listed as neglected tropical diseases and neglected zoonoses; they affect 2–3 million people, with 200,000 new cases diagnosed annually, and CE accounts for more than 1 million disability-adjusted life years (DALYs) [6–8]. AE and CE are highlighted as the second and third most important foodborne parasitic diseases by the WHO and the Food and Agriculture Organization of the UN, respectively [9]. Moreover, the costs of treatment for humans and the economic losses to the livestock industry have been estimated at more than 2 billion dollars [10]. The annual global cost of CE alone is more than 750 million dollars, which greatly exacerbates the economic burden of already low-income regions [11].

To control and reduce CE prevalence, many effective measures have been employed. For example, health education can effectively reduce the transmission of echinococcosis [12]. Improved control of stray dogs, echinococcidal treatment of working sheep dogs, and the provision of means for safe disposal of slaughtered sheep offal can lead to a decline in the prevalence of E. granulosus in dogs [13]. In 2006, the Chinese government launched the national control program for echinococcosis, and the human echinococcosis prevalence decreased from 1.08% in 2004 to 0.28% in 2016 after its completion [14–16].

The living environment of wild animals is largely controlled by climate change and by land use and land cover, so echinococcosis control depends on them [17]. Temperature (Tmp), precipitation (Pre), and relative humidity (RH) are the main climate risk factors for echinococcosis, and high RH can result in high echinococcosis risk [18]. Land surface Tmp in spring has a negative impact on the prevalence of human CE in western China [19]. Echinococcus granulosus eggs are sensitive to changes in Tmp and humidity and are more likely to survive in low-Tmp, high-humidity environments [20]. The annual average precipitation has a significant nonlinear relationship with the prevalence of CE [21]. Economic conditions also play a key role in the prevalence of CE: a better economy is associated with lower prevalence [21,22].

Hence, it is important to explore the characteristics of echinococcosis and their relationships with climate change. As one of the high-prevalence regions for echinococcosis in China, Xinjiang faces heavy disease control and prevention pressure because its medical resources are scarce owing to its low economic level. For efficient control and prevention of echinococcosis in Xinjiang, three questions urgently need to be answered: (1) What are the variation characteristics of echinococcosis? (2) What are the relationships between echinococcosis and the impact factors (climate factors and medical resources)? (3) Can the echinococcosis variations be predicted from the impact factors using artificial intelligence approaches?

To address these questions, we explore the multiple time scale characteristics of echinococcosis in Xinjiang during the period of 2004–2020 and detect the relationships between echinococcosis and the impact factors.

2. Study area, datasets, and methods

2.1 Study area and datasets

As the core area of the Silk Road, Xinjiang is located in northwestern China, far from the sea, and covers an area of 1.66 × 10⁶ km² (Fig S1 in S1 File). Controlled by the westerly circulation, Xinjiang has an arid and semi-arid climate, with an annual total precipitation of 150 mm and an annual mean air temperature of 8°C [23]. Xinjiang is the second largest grazing land in China; its grassland covers 5.73 × 10⁷ hm², accounting for 36% of the whole area of Xinjiang. At the end of 2022, the total number of livestock exceeded 59.85 million. As the major zoonosis in Xinjiang, echinococcosis seriously threatens the health of livestock and residents.

The monthly echinococcosis data of Xinjiang are national surveillance data for the period of 2004–2020, downloaded from the Public Health Science Data Center of China CDC (https://www.phsciencedata.cn/Share/), and include two variables: the echinococcosis confirmed cases (EC) and the incidence rate (IR). The climate factors are temperature (Tmp), precipitation (Pre), relative humidity (RH), and sunshine duration (SD), downloaded from the China Meteorological Administration (https://data.cma.cn/). The medicine rate (MR), the share of total health expenses in the gross domestic product (GDP), is from the National Bureau of Statistics (https://www.stats.gov.cn/sj/ndsj/).

2.2 Methods

The temporal variations of echinococcosis in Xinjiang are illustrated at the annual and seasonal time scales. The four seasons are defined as spring (MAM: March, April, and May), summer (JJA: June, July, and August), fall (SON: September, October, and November), and winter (DJF: December, January, and February). The annual and seasonal data are obtained by summing the monthly data. The linear trend of echinococcosis is obtained by the linear least squares method, and its significance is tested by Student's t-test at the 95% confidence level (P < 0.05).
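The trend estimate and its significance test described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `linear_trend`, the synthetic annual series, and the hard-coded critical value (two-sided 5% level, 15 degrees of freedom for 17 annual values) are assumptions made for the example.

```python
import math

def linear_trend(x, y):
    """Ordinary least-squares slope, intercept, and t statistic of the slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual variance uses n - 2 degrees of freedom
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope, intercept, slope / se_slope

# Synthetic annual series (NOT the surveillance data): 17 years, 2004-2020
years = list(range(2004, 2021))
cases = [200 + 95 * i + (150 if i % 2 else -150) for i in range(17)]

slope, intercept, t_stat = linear_trend(years, cases)
T_CRIT_DF15 = 2.131  # two-sided 5% critical value of Student's t, 15 d.o.f.
significant = abs(t_stat) > T_CRIT_DF15
```

The slope is the trend per year, and the trend is called significant when the absolute t statistic exceeds the critical value for n − 2 degrees of freedom.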

In this study, ensemble empirical mode decomposition (EEMD) is used to explore the multiple periods of echinococcosis. The ARIMA model and the random forest model are used to predict the echinococcosis variations, and statistical metrics are used to measure the models' performance. 80% of the data are used to train the models, and the remaining 20% are used to test them.

The research framework of this study is illustrated in Fig 1.

2.2.1 Ensemble empirical mode decomposition (EEMD) method.

The multiple periods of the temporal variations are analyzed by the ensemble empirical mode decomposition (EEMD) method. EEMD can extract the multi-period characteristics of an original time series well, including nonlinear and non-stationary time series. Because of its effectiveness, EEMD has been widely used in many fields [24]. The EEMD method consists of the following steps. For a time series $x(t)$, a white noise $w(t)$ with finite amplitude is added, which gives

$$X(t) = x(t) + w(t) \tag{1}$$

$X(t)$ is decomposed into the intrinsic mode functions (IMFs), $c_j(t)$, as

$$X(t) = \sum_{j=1}^{n} c_j(t) + r_n(t) \tag{2}$$

where $r_n(t)$ is the residue of $X(t)$ after $n$ IMFs with different periods are extracted. The nonlinear trend is reflected by the residue term $r_n(t)$. After a large number of decompositions with different noise realizations are averaged as an ensemble, the added noise cancels out, which gives the final oscillation components ($c_j(t)$) and the residual.

2.2.2 ARIMA model and Random Forest model.

Constructed by Box and Jenkins, the Autoregressive Integrated Moving Average (ARIMA) models are among the most widely used time series models in simulation and prediction. The basic form of the ARIMA model is ARIMA (p, d, q), where the non-negative integers p and q are the orders of the autoregressive and moving average polynomials, respectively, and d is the order of non-seasonal differencing required to make the data stationary. An ARIMA (p, d, q) model can be expressed using lag polynomials as

$$\left(1 - \sum_{i=1}^{p} \phi_i L^i\right)(1 - L)^d y_t = \left(1 + \sum_{j=1}^{q} \theta_j L^j\right)\varepsilon_t \tag{3}$$

where $L$ is the lag operator, $y_t$ is the time series, $\varepsilon_t$ is a random error at time $t$, and $\phi_i$ and $\theta_j$ are the coefficients.

Generally, the ARIMA model can capture both nonseasonal and seasonal patterns of a time series. Forecasting a time series involves three steps: model identification, parameter estimation, and diagnostic checking of the model. In the model identification step, the stationarity and seasonality of the time series are determined, because they need to be modeled before parameter estimation. The augmented Dickey-Fuller (ADF) test is used to detect whether the time series is stationary: a P value of the ADF test below 0.05 indicates that the time series is stationary. If the time series is non-stationary, it is differenced, an autocorrelation function (ACF) plot is used to judge whether the differenced series is stationary, and the parameter d is determined accordingly. Seasonality can be handled by taking seasonal differences and regenerating the ACF and partial autocorrelation function (PACF) plots.
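The differencing transformations mentioned above can be illustrated with a short sketch. The helper names `difference` and `difference_d` and the synthetic monthly series are assumptions for the example; the ADF test itself is not reproduced here.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def difference_d(series, d):
    """Apply first-order (lag-1) differencing d times, as in ARIMA(p, d, q)."""
    out = list(series)
    for _ in range(d):
        out = difference(out)
    return out

# Synthetic monthly series: linear trend plus a repeating 12-month pattern
pattern = [5, 3, 2, 1, 0, 0, 1, 2, 3, 4, 6, 8]
series = [2 * t + pattern[t % 12] for t in range(48)]

first = difference_d(series, 1)         # removes the linear trend, keeps the pattern
seasonal = difference(series, lag=12)   # removes the pattern; the trend becomes a constant
```

In this toy series the seasonal difference is constant, so no further differencing would be needed before fitting the autoregressive and moving average terms.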

For the ARIMA model identification, the ACF and PACF plots also help to determine the values of the parameters p and q. The commonly used maximum likelihood method is employed to estimate the parameters of the selected model. Finally, the overall adequacy of the model is checked by the Ljung–Box test. In this study, R version 4.0.5 is used to construct the ARIMA model for simulating and predicting the echinococcosis time series (the confirmed cases and the incidence rates) during the period of 2004–2020. The parameters p and q are set to 5 and 0 based on the Akaike information criterion (AIC).

The fundamental concept of random forest model involves creating multiple independent decision tree models by randomly sampling data from the dataset. By averaging their prediction results, this technique mitigates errors and overfitting issues inherent in single models, thus improving the accuracy and robustness of predictions. In epidemiology, the supervised machine learning method of random forest regression has been used to predict the recent spatiotemporal spread of COVID-19 globally, yielding promising results. Studies have demonstrated that random forest outperforms other algorithms in predicting the relationships between cases, deaths, and infections of diseases such as influenza and dengue, as well as their transmission in relation to climatic factors, landscape elements, and human behaviors.

In this study, echinococcosis is first predicted by the ARIMA model and the random forest model separately. It is then predicted by the combined ARIMA-RF hybrid model. In the combined model, echinococcosis is first predicted by the ARIMA model, and the residual time series is then predicted by the random forest model.
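The two-stage structure of the hybrid model, in which a time series model is fitted first and a random forest is then trained on its residuals (as stated in Section 3.4), can be sketched as below. This is a toy illustration under loud assumptions: an AR(1) fit by least squares stands in for the ARIMA model, a small forest of one-split trees with bootstrap samples and random feature choice stands in for a full random forest, and the data are synthetic. All function names are hypothetical, and the study itself used R.

```python
import math
import random
import statistics

def fit_ar1(y):
    """Least-squares fit of y[t] = c + phi * y[t-1] (stand-in for ARIMA)."""
    lagged, target = y[:-1], y[1:]
    m_lag, m_tar = statistics.fmean(lagged), statistics.fmean(target)
    sxx = sum((a - m_lag) ** 2 for a in lagged)
    phi = sum((a - m_lag) * (b - m_tar) for a, b in zip(lagged, target)) / sxx
    return m_tar - phi * m_lag, phi

def fit_stump(X, r, feature):
    """One-split regression tree on a single feature, minimising squared error."""
    values = sorted({row[feature] for row in X})
    mean_r = statistics.fmean(r)
    best = (float("inf"), float("inf"), mean_r, mean_r)  # fallback: predict the mean
    for thr in values[:-1]:  # split as "<= thr" versus "> thr"; both sides non-empty
        left = [ri for row, ri in zip(X, r) if row[feature] <= thr]
        right = [ri for row, ri in zip(X, r) if row[feature] > thr]
        m_l, m_r = statistics.fmean(left), statistics.fmean(right)
        sse = sum((v - m_l) ** 2 for v in left) + sum((v - m_r) ** 2 for v in right)
        if sse < best[0]:
            best = (sse, thr, m_l, m_r)
    return feature, best[1], best[2], best[3]

def fit_forest(X, r, n_trees=100, seed=0):
    """Bagged stumps: bootstrap rows and pick one random feature per tree."""
    rng = random.Random(seed)
    n, k = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        feature = rng.randrange(k)
        forest.append(fit_stump([X[i] for i in idx], [r[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return statistics.fmean([m_l if row[f] <= thr else m_r for f, thr, m_l, m_r in forest])

def rmse(obs, sim):
    return math.sqrt(statistics.fmean([(o - s) ** 2 for o, s in zip(obs, sim)]))

# Synthetic monthly data (NOT the surveillance data): x1 drives y, x2 is irrelevant
rng = random.Random(42)
n = 204
x1 = [rng.choice([0.0, 1.0]) for _ in range(n)]
x2 = [rng.random() for _ in range(n)]
y = [10.0]
for t in range(1, n):
    y.append(4.0 + 0.6 * y[-1] + 5.0 * x1[t] + rng.gauss(0.0, 0.5))

split = int(0.8 * n)                      # 80% training, 20% testing, in time order
c, phi = fit_ar1(y[:split])               # stage 1: time series model
train_res = [y[t] - (c + phi * y[t - 1]) for t in range(1, split)]
forest = fit_forest([[x1[t], x2[t]] for t in range(1, split)], train_res)  # stage 2

obs = y[split:]
ar_pred = [c + phi * y[t - 1] for t in range(split, n)]
hybrid_pred = [p + predict_forest(forest, [x1[t], x2[t]])
               for p, t in zip(ar_pred, range(split, n))]
rmse_ar, rmse_hybrid = rmse(obs, ar_pred), rmse(obs, hybrid_pred)
```

The hybrid forecast is the sum of the stage-1 forecast and the stage-2 residual forecast, so the covariates only need to explain what the time series model leaves unexplained.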

2.2.3 Statistical metrics in measuring the model’s performance.

The relationship between echinococcosis and the climate factors is measured by the correlation coefficient (CC) at the 95% confidence level. After the relationships are obtained, the echinococcosis variations are predicted by the models described in Section 2.2.2.

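To quantify the simulation performance of the models, several statistical metrics are employed, including the correlation coefficient (CC), absolute error (AE), root mean square error (RMSE), and Distance between Indices of Simulation and Observation (DISO) [25–27]. They are expressed as follows:

$$CC = \frac{\sum_{i=1}^{n}(a_i - \bar{a})(b_i - \bar{b})}{\sqrt{\sum_{i=1}^{n}(a_i - \bar{a})^2}\,\sqrt{\sum_{i=1}^{n}(b_i - \bar{b})^2}} \tag{4}$$

$$AE = \frac{1}{n}\sum_{i=1}^{n}\left|b_i - a_i\right| \tag{5}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(b_i - a_i)^2} \tag{6}$$

$$DISO = \sqrt{(CC - 1)^2 + NAE^2 + NRMSE^2} \tag{7}$$

where $a_i$ and $b_i$ ($i = 1, 2, \ldots, n$) represent the observed and simulated data, respectively, and $\bar{a}$ and $\bar{b}$ are their mean values. NAE and NRMSE are AE and RMSE normalized by the average value of the observed time series.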
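The four metrics can be computed together as in the sketch below. It assumes that AE is the mean absolute difference between simulation and observation and that DISO is the Euclidean distance of (CC, NAE, NRMSE) from the ideal point (1, 0, 0); the function name `diso_metrics` and the toy data are illustrative only.

```python
import math

def diso_metrics(obs, sim):
    """Return CC, AE, RMSE, and DISO for observed and simulated series."""
    n = len(obs)
    mean_obs, mean_sim = sum(obs) / n, sum(sim) / n
    cov = sum((a - mean_obs) * (b - mean_sim) for a, b in zip(obs, sim))
    var_obs = sum((a - mean_obs) ** 2 for a in obs)
    var_sim = sum((b - mean_sim) ** 2 for b in sim)
    cc = cov / math.sqrt(var_obs * var_sim)
    ae = sum(abs(b - a) for a, b in zip(obs, sim)) / n        # assumed: mean absolute error
    rmse = math.sqrt(sum((b - a) ** 2 for a, b in zip(obs, sim)) / n)
    nae, nrmse = ae / mean_obs, rmse / mean_obs               # normalised by observed mean
    diso = math.sqrt((cc - 1) ** 2 + nae ** 2 + nrmse ** 2)
    return {"CC": cc, "AE": ae, "RMSE": rmse, "DISO": diso}

# Toy data: the simulation is the observation shifted by +1
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
metrics = diso_metrics(observed, simulated)
```

A perfect simulation gives CC = 1 and AE = RMSE = 0, hence DISO = 0; larger DISO values mean the simulation is further from the observations.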

3. Results

3.1 Linear temporal characteristics of the echinococcosis in Xinjiang at multiple time scales

This section presents the temporal characteristics of echinococcosis in Xinjiang at the annual, seasonal, and monthly scales (Fig 2). For the annual echinococcosis, the average value of EC is 1288 during the period of 2004–2020, with a maximum of 2362 in 2017 and a minimum of 173 in 2004; the average value of IR is 5.35%, with a corresponding maximum of 9.76% and a minimum of 0.93% (Fig 2, Table S1 in S1 File). EC shows a significant increasing linear trend (P < 0.05) of 94.79 per year in 2004–2020. In particular, the annual EC increased persistently from 173 in 2004 to 1525 in 2012 and decreased in 2013–2015 to 1502, 1440, and 1370, respectively. It increased again in 2016 and 2017, to 1915 and 2488, and then decreased in 2018, 2019, and 2020 to 1747, 1757, and 1046 (Fig 2A). IR has a significant increasing linear trend of 0.34% per year (P < 0.05), and its temporal variation is similar to that of EC (Fig 2B). The largest value in 2017 is mainly caused by mass screening, and the decrease after 2017 results from efficient control measures.

Fig 2. Temporal variations of the annual echinococcosis, (A) for EC (confirmed cases), and (B) for IR (cases per 100,000 population) during the period of 2004-2020, where the black straight line is the linear trend obtained by the linear least square method, the blue area is the 95% confidence interval.

https://doi.org/10.1371/journal.pone.0326433.g002

MAM, JJA, SON, and DJF show significant increasing linear trends of 24.32, 23.93, 20.19, and 25.95 per year for EC, and 0.09%, 0.08%, 0.06%, and 0.13% per year for IR (Fig S2 in S1 File). Within the year, echinococcosis decreases from January to October and then increases through December (Fig S3 in S1 File). The monthly EC and IR have significant positive linear trends of 0.648 per month and 0.002 per month during the period of 2004–2020 (Fig S4 in S1 File, Text S1).

3.2 Multiple periods of the echinococcosis variations obtained by EEMD

The confirmed cases and incidence rates of echinococcosis in Xinjiang during the period of 2004–2020 are decomposed by EEMD in Fig 3. The confirmed cases have multiple periods of 3, 6, 13, 40, 67, and 190 months for IMF1, IMF2, IMF3, IMF4, IMF5, and IMF6, and the corresponding contributions are 21%, 23%, 16%, 14%, 21%, and 5% (Fig 3A, Table S2 in S1 File). For the incidence rates, the EEMD results are similar to those of the confirmed cases. The multiple periods of the incidence rate are 3, 6, 12, 34, 73, and 190 months for IMF1, IMF2, IMF3, IMF4, IMF5, and IMF6, with corresponding contributions of 22%, 20%, 14%, 11%, 24%, and 8% (Fig 3B, Table S2 in S1 File).

Fig 3. The decomposition results of the monthly confirmed cases (A) and incidence rates (B) in Xinjiang obtained by EEMD during the period of 2004-2020, where IMF1, IMF2, IMF3, IMF4, IMF5, and IMF6 are the decomposed period time series, and the residual is the nonlinear trend.

https://doi.org/10.1371/journal.pone.0326433.g003

The above results suggest that the echinococcosis variations in Xinjiang during the period of 2004–2020 have multiple periods, which can be used for future prediction and provide important information for disease control and prevention.
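A period such as those reported above can be read off an oscillating component in several ways. The sketch below uses one common convention, the series length divided by the number of local maxima; whether the study used exactly this convention is an assumption, and the function name `mean_period` and the synthetic component are illustrative.

```python
import math

def mean_period(component):
    """Mean period in time steps: series length divided by the count of local maxima."""
    peaks = sum(
        1
        for i in range(1, len(component) - 1)
        if component[i - 1] < component[i] > component[i + 1]
    )
    return len(component) / peaks if peaks else float("inf")

# Synthetic IMF-like component: a 12-month cycle over 204 months (2004-2020)
months = 204
imf_like = [math.sin(2 * math.pi * t / 12) for t in range(months)]
period = mean_period(imf_like)
```

For a slowly varying component with very few maxima, such as the lowest-frequency IMF, this estimate becomes coarse because the count of peaks is small.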

3.3 Relationships between the echinococcosis and the impact factors in Xinjiang during the period of 2004–2020

The temporal characteristics of the impact factors at multiple time scales are provided in Text S2, which shows significant trends of Tmp (0.03 °C per year), RH (−0.17 per year), and SD (−8.53 per year) (Table S3 in S1 File). The relationships between echinococcosis and the impact factors in Xinjiang are explored for the period of 2004−2020. The CC values are used to measure these relationships in Fig 4, and their significance is tested by the t test at the 95% confidence level. At the annual scale, MR has the largest positive impact on echinococcosis among the five factors, with CC values of 0.81 for EC and 0.75 for IR (Fig 4). It is followed by Tmp and Pre, which also have positive impacts on echinococcosis, with corresponding CC values of 0.31 and 0.19 for EC, and 0.24 and 0.24 for IR. SD has a negative impact on EC and IR, with CC values of −0.47 and −0.38.

Fig 4. Correlation coefficient results between the echinococcosis and the impact factors during the period of 2004-2020 at multiple time scales: ANN, MAM, JJA, SON, DJF, and MON, where (A) for EC, (B) for IR.

https://doi.org/10.1371/journal.pone.0326433.g004

At the seasonal scale, Tmp has a positive impact on EC and IR in all four seasons, with the largest impact in JJA (CC = 0.49, 0.4), followed by SON, DJF, and MAM. Pre has a positive impact on EC and IR in JJA (CC = 0.11, 0.14) and SON (CC = 0.29, 0.33), and it is negatively correlated with echinococcosis in DJF, with CC values of −0.21 for EC and −0.17 for IR. RH is positively correlated with echinococcosis in JJA (CC = 0.25 for EC, 0.24 for IR) and has a negative impact in the other three seasons, with the strongest negative impact in MAM (CC = −0.24 for both EC and IR). SD has a negative impact on echinococcosis in all seasons, with the largest magnitudes in JJA. MR has a larger significant positive impact (P < 0.05) on echinococcosis than the other factors in all seasons, with CC values of 0.83, 0.79, 0.76, and 0.92 for EC, and 0.78, 0.72, 0.7, and 0.9 for IR. At the monthly scale, MR has a significant positive impact on echinococcosis, and the other factors have no significant impact (Fig 4).
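The t test for a correlation coefficient can be written out explicitly. The sketch below assumes a two-sided test with n − 2 degrees of freedom and 17 annual values (2004–2020), for which the 5% critical value of Student's t is 2.131; the function names are illustrative.

```python
import math

def corr_t_stat(r, n):
    """t statistic for testing whether a correlation r from n pairs differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def critical_r(t_crit, n):
    """Smallest |r| that reaches the critical t value with n - 2 degrees of freedom."""
    return t_crit / math.sqrt(t_crit ** 2 + (n - 2))

N_YEARS = 17      # annual values, 2004-2020
T_CRIT = 2.131    # two-sided 5% critical value, 15 degrees of freedom

t_mr = corr_t_stat(0.81, N_YEARS)     # annual CC between MR and EC
r_min = critical_r(T_CRIT, N_YEARS)   # threshold on |CC| for significance
```

Under these assumptions, the annual CC of 0.81 between MR and EC gives a t statistic of about 5.35, and an annual |CC| above roughly 0.48 is needed for significance at the 95% level.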

3.4 Prediction of the echinococcosis by ARIMA-RF hybrid forecast model

Building on the above results, this section compares the predictions of the single models (i.e., ARIMA and RF) with those of the combined model (ARIMA-RF hybrid), and shows that the echinococcosis variations in Xinjiang are well predicted by the ARIMA-RF hybrid model. The prediction results are displayed in Figs S5 and S6 in S1 File and in Fig 5, and the models' performances are evaluated by CC, AE, RMSE, and DISO in Table S4 in S1 File. Because the confirmed cases and the incidence rates have similar temporal variations, only the confirmed cases are predicted by the three models.

Fig 5. Prediction results of the echinococcosis confirmed cases obtained by the combined model: ARIMA-RF hybrid model.

https://doi.org/10.1371/journal.pone.0326433.g005

Fig S5 in S1 File shows that the temporal variations of the echinococcosis confirmed cases are well captured by the ARIMA model, and the peak values are also predicted. The corresponding statistical metrics are 19.48, 27.77, 0.87, and 1.42 for AE, RMSE, CC, and DISO, respectively (Table S4 in S1 File).

The ARIMA prediction is based only on the temporal characteristics of the confirmed cases; the four climate factors (i.e., Tmp, Pre, RH, and SD) and MR are not considered in the model. Therefore, the four climate factors and MR are used as input data to predict the confirmed cases by RF. The prediction result of the random forest model is provided in Fig S6 in S1 File.

Fig S6 in S1 File shows that the random forest prediction, which considers the four climate factors and MR, is more accurate than the ARIMA result in Fig S5 in S1 File. The statistical metrics are 12.53, 19.31, 0.95, and 0.95 for AE, RMSE, CC, and DISO, respectively (Table S4 in S1 File). However, the prediction of the peak values still needs improvement. The combined ARIMA-RF model is therefore employed to improve the performance (Fig 5). In the combined model, the confirmed cases are first predicted by ARIMA, and the residual time series is predicted by random forest. The result in Fig 5 indicates that the combined model performs better than both ARIMA and random forest. The corresponding statistical metrics of the combined model are 9.81, 13.99, 0.97, and 0.71 for AE, RMSE, CC, and DISO (Table S4 in S1 File).

4. Discussion

As one of the twenty neglected tropical diseases listed by the WHO and a target for control for several decades, echinococcosis poses a significant threat to public health and the livestock industry [28,29]. More than one million people are affected by echinococcosis globally, and the annual cost of treatment and losses to the livestock industry is estimated to be $760 million [30]. In western China, about 50 million people are at risk of echinococcosis infection [31]. Therefore, taking Xinjiang as a case study, we comprehensively investigate the echinococcosis characteristics and predict its temporal variations with the combined ARIMA-RF model.

The linear trends of the temporal variations and the multiple periods of echinococcosis in Xinjiang are obtained for the period of 2004–2020. The clinical symptoms and diagnosis of echinococcosis are not analyzed. Ultrasonography screening of hepatic cystic echinococcosis in sheep flocks in a county of Xinjiang [32] suggested that culled aged sheep play a key role in the transmission of CE. A deep convolutional neural network model developed to identify echinococcosis and its types performed significantly better than senior radiologists from a high-endemicity area [33]. The epidemic characteristics of human cystic and alveolar echinococcosis in Kyrgyzstan have also been explored.

Our previous studies constructed dynamic models using ordinary difference equations to explore the effects of increasing the sheep number and of health education on echinococcosis control [12,34,35]. This study uses the combined ARIMA-RF model to predict the temporal variations of echinococcosis from the four climate factors and MR, which can provide an important scientific basis for disease control and prevention. Our results suggest that RH has a positive relationship with echinococcosis, which is consistent with some previous works [18,20]. MR has a significant positive relationship with echinococcosis; previous work reports that regions with a better economy have a lower CE prevalence [22].

Besides the four climate factors and MR, other driving factors are also important for CE prevalence. Health education, improvements in sanitation, and interventions targeted at humans (ultrasound screening, surgical and albendazole treatment) and dogs (management and deworming) were the main measures implemented [12,18].

To investigate the dynamics of echinococcosis transmission among impact factors, humans, dogs, and livestock, more datasets should be added, such as dog data, livestock data, and additional impact factor data. Other statistical models have also been applied to predict diseases [36–38]. A novel Bayesian spatio-temporal model was proposed to predict emerging infectious diseases [37]. In another study, a multivariable linear regression model, a stepwise regression model, a multinomial logistic regression model, a naive Bayesian classification model, and a classification and regression tree (CART) model were established; the CART model had the highest accuracy, sensitivity, and specificity, and the multinomial logistic regression model had the highest precision [38].

Overall, further important research topics on echinococcosis in Xinjiang remain to be addressed. For example, the spatial distribution of echinococcosis can be analyzed when the related data become available. A dynamic epidemic model is necessary to explore the dynamic behaviors of echinococcosis transmission among humans and animals. Moreover, an assessment of the echinococcosis disease burden is also urgently needed.

5. Conclusion

In this study, we first investigated the temporal variations (linear trends) of echinococcosis in Xinjiang at the annual, seasonal, and monthly scales during the period of 2004–2020. The nonlinear characteristics (multiple periods) of echinococcosis were explored by the EEMD method. The relationships between echinococcosis and the four climate factors (i.e., Tmp, Pre, RH, and SD) and MR were also detected at the annual, seasonal, and monthly scales. Finally, echinococcosis was predicted by two single models, ARIMA and random forest, and by their combined model. The major results are as follows.

  1. Echinococcosis in Xinjiang has a significant increasing trend in both confirmed cases and incidence rates. The annual linear trends of the confirmed cases and incidence rates are 94.48 per year and 0.339 per year, and the corresponding monthly linear trends are 0.65 per month and 0.0023 per month.
  2. According to the EEMD result, echinococcosis in Xinjiang has nonlinear characteristics, with multiple periods of 3, 6, 13, 40, and 67 months for the confirmed cases, and 3, 6, 12, 34, and 73 months for the incidence rates.
  3. Among the climate factors, Tmp has a positive impact on echinococcosis, and SD has a negative impact at the annual and seasonal scales. Pre has a positive impact on echinococcosis at the annual scale and in JJA and SON. RH has a positive relationship in JJA. MR has a significant positive impact, with CC values larger than 0.6, which indicates that increased medical resources can detect more confirmed cases.
  4. The ARIMA model and the random forest model can capture the temporal variations of echinococcosis in Xinjiang. The combined ARIMA-RF model performs better than the two single models, with a CC value of 0.97 and a DISO value of 0.71.

Echinococcosis is one of the important zoonoses in Xinjiang and strongly affects socio-economic development and local human health. Knowledge of the impact factors, together with accurate simulation and prediction of the echinococcosis variations, can provide basic and important information for disease control and prevention. Our future studies should focus on the One Health concept, which includes the environment, animals, and humans. More environmental data, more animal data (e.g., dogs, cattle, and sheep), and more confirmed case data remain important scientific information for revealing further echinococcosis characteristics and for predicting, or even giving early warning of, echinococcosis transmission. Finally, with more echinococcosis characteristics and more accurate predictions, we can propose more scientific and reasonable response measures to the local government.

Supporting information

S1 File. PLOS ONE supplementary material mouthguard compliance.

https://doi.org/10.1371/journal.pone.0326433.s001

(DOCX)

References

  1. Carlson CJ, Albery GF, Merow C, Trisos CH, Zipfel CM, Eskew EA, et al. Climate change increases cross-species viral transmission risk. Nature. 2022;607(7919):555–62. pmid:35483403
  2. Mora C, McKenzie T, Gaw IM, Dean JM, von Hammerstein H, Knudson TA, et al. Over half of known human pathogenic diseases can be aggravated by climate change. Nat Clim Chang. 2022;12(9):869–75. pmid:35968032
  3. Morse SS, Mazet JAK, Woolhouse M, Parrish CR, Carroll D, Karesh WB, et al. Prediction and prevention of the next pandemic zoonosis. Lancet. 2012;380(9857):1956–65. pmid:23200504
  4. Allen T, Murray KA, Zambrana-Torrelio C, Morse SS, Rondinini C, Di Marco M, et al. Global hotspots and correlates of emerging zoonotic diseases. Nat Commun. 2017;8(1):1124. pmid:29066781
  5. Cenni L, Simoncini A, Massetti L, Rizzoli A, Hauffe HC, Massolo A. Current and future distribution of a parasite with complex life cycle under global change scenarios: Echinococcus multilocularis in Europe. Glob Chang Biol. 2023;29(9):2436–49. pmid:36815401
  6. Eckert J, Schantz P, Gasser R. Geographic distribution and prevalence. In: Eckert J, Gemmell MA, Meslin F-X, Pawlowski ZS, editors. WHO/OIE manual on echinococcosis in humans and animals: a public health problem of global concern. Paris: World Organization for Animal Health; 2001. p. 100–41.
  7. WHO. Global report on neglected tropical diseases [EB/OL]. (2023-01-29) [2023-02-26]; 2023. Available from: https://www.who.int/publications/i/item/9789240067295
  8. Casulli A. Recognising the substantial burden of neglected pandemics cystic and alveolar echinococcosis. Lancet Glob Health. 2020;8(4):e470–1. pmid:32199112
  9. Paternoster G, Boo G, Wang C, Minbaeva G, Usubalieva J, Raimkulov KM, et al. Epidemic cystic and alveolar echinococcosis in Kyrgyzstan: an analysis of national surveillance data. Lancet Glob Health. 2020;8(4):e603–11. pmid:32199126
  10. Atkinson J-AM, Gray DJ, Clements ACA, Barnes TS, McManus DP, Yang YR. Environmental changes impacting Echinococcus transmission: research to support predictive surveillance and control. Glob Chang Biol. 2013;19(3):677–88. pmid:23504826
  11. WHO. Ending the neglect to attain the Sustainable Development Goals: a road map for neglected tropical diseases 2021-2030. World Health Organization; 2021. Available from: https://www.who.int/publications/i/item/9789240010352
  12. Cui Q, Zhang Q, Hu Z. Modeling and analysis of Cystic Echinococcosis epidemic model with health education. AIMS Math. 2024;9(2):3592–612.
  13. Jiménez S, Pérez A, Gil H, Schantz P, Ramalle E, Juste R. Progress in control of cystic echinococcosis in La Rioja, Spain: decline in infection prevalences in human and animal hosts and economic costs and benefits. Acta Trop. 2002;83(3):213–21. pmid:12204394
  14. 14. National Ministry of Health of the People’s Republic of China. National control plan on the prevention and control of key parasitic diseases (2006-2015). Available from: http://www.gov.cn/gzdt/2006-03/30/content_240456.htm
  15. 15. Wu W, Wang H, Wang Q, Zhou X, Wang L, Zheng C. A nationwide sampling survey on echinococcosis in China during 2012-2016. Chin J Parasitol Parasitic Dis. 2018;36(1):1–14.
  16. 16. Yu Q, Xiao N, Han S, Tian T, Zhou X-N. Progress on the national echinococcosis control programme in China: analysis of humans and dogs population intervention during 2004-2014. Infect Dis Poverty. 2020;9(1):137. pmid:33008476
  17. 17. Wen H, Vuitton L, Tuxun T, Li J, Vuitton DA, Zhang W, et al. Echinococcosis: advances in the 21st century. Clin Microbiol Rev. 2019;32(2):e00075-18. pmid:30760475
  18. 18. Yin J, Wu X, Li C, Han J, Xiang H. The impact of environmental factors on human echinococcosis epidemics: spatial modelling and risk prediction. Parasit Vectors. 2022;15(1):47. pmid:35130957
  19. 19. Huang D, Li R, Qiu J, Sun X, Yuan R, Shi Y, et al. Geographical environment factors and risk mapping of human cystic echinococcosis in Western China. Int J Environ Res Public Health. 2018;15(8):1729. pmid:30103558
  20. 20. Veit P, Bilger B, Schad V, Schäfer J, Frank W, Lucius R. Influence of environmental factors on the infectivity of Echinococcus multilocularis eggs. Parasitology. 1995;110 ( Pt 1):79–86. pmid:7845716
  21. 21. Ma T, Jiang D, Quzhen G, Xue C, Han S, Wu W, et al. Factors influencing the spatial distribution of cystic echinococcosis in Tibet, China. Sci Total Environ. 2021;754:142229. pmid:33254864
  22. 22. Schurer JM, Rafferty E, Farag M, Zeng W, Jenkins EJ. Echinococcosis: an economic evaluation of a veterinary public health intervention in rural Canada. PLoS Negl Trop Dis. 2015;9(7):e0003883. pmid:26135476
  23. 23. Hu Z, Zhang C, Hu Q, Tian H. Temperature changes in Central Asia from 1979 to 2011 based on multiple datasets*. J Clim. 2014;27(3):1143–67.
  24. 24. Hu Z, Zhou Q, Chen X, Qian C, Wang S, Li J. Variations and changes of annual precipitation in Central Asia over the last century. Int J Climatol. 2017;37(S1):157–70.
  25. 25. Peng Y, Zhang H, Zhang Z, Tang B, Shen D, Yin G, et al. Future challenges of terrestrial water storage over the arid regions of Central Asia. Int J Appl Earth Obs Geoinf. 2024;132:104026.
  26. 26. Hu Z, Chen X, Zhou Q, Chen D, Li J. DISO: a rethink of Taylor diagram. Int J Climatol. 2019;39(5):2825–32.
  27. 27. Zhou Q, Chen D, Hu Z, Chen X. Decompositions of Taylor diagram and DISO performance criteria. Int J Climatol. 2021;41(12):5726–32.
  28. 28. Gu H, Hu Y, Guo S, Jin Y, Chen W, Huang C, et al. China’s prevention and control experience of echinococcosis: a 19-year retrospective. J Helminthol. 2024;98:e16. pmid:38305033
  29. 29. Tian T, Miao L, Wang W, Zhou X. Global, regional and national burden of human cystic echinococcosis from 1990 to 2019: a systematic analysis for the Global Burden of Disease Study 2019. Trop Med Infect Dis. 2024;9(4):87. pmid:38668548
  30. 30. Casulli A, Siles-Lucas M, Cretu CM, Vutova K, Akhan O, Vural G, et al. Achievements of the HERACLES project on cystic echinococcosis. Trends Parasitol. 2020;36(1):1–4. pmid:31753546
  31. 31. Qian M-B, Abela-Ridder B, Wu W-P, Zhou X-N. Combating echinococcosis in China: strengthening the research and development. Infect Dis Poverty. 2017;6(1):161. pmid:29157312
  32. 32. Qi X, Song T, Li Z, Jiang T, Zhang Z, Wu C, et al. Ultrasonography screening of hepatic cystic echinococcosis in sheep flocks used for evaluating control progress in a remote mountain area of Hejing County, Xinjiang. BMC Vet Res. 2024;20(1):207. pmid:38760783
  33. 33. Yang Y, Cairang Y, Jiang T, Zhou J, Zhang L, Qi B, et al. Ultrasound identification of hepatic echinococcosis using a deep convolutional neural network model in China: a retrospective, large-scale, multicentre, diagnostic accuracy study. Lancet Digit Health. 2023;5(8):e503–14. pmid:37507196
  34. 34. Cui Q, Shi Z, Yimamaidi D, Hu B, Zhang Z, Saqib M, et al. Dynamic variations in COVID-19 with the SARS-CoV-2 Omicron variant in Kazakhstan and Pakistan. Infect Dis Poverty. 2023;12(1):18. pmid:36918974
  35. 35. He Y, Cui Q, Hu Z. Modeling and analysis of the transmission dynamics of cystic echinococcosis: effects of increasing the number of sheep. Math Biosci Eng. 2023;20(8):14596–615. pmid:37679150
  36. 36. Abdykerimov KK, Kronenberg PA, Isaev M, Paternoster G, Deplazes P, Torgerson PR. Environmental distribution of Echinococcus- and Taenia spp.-contaminated dog feces in Kyrgyzstan. Parasitology. 2024;151(1):84–92. pmid:38018240
  37. 37. Kim J, Lawson AB, Neelon B, Korte JE, Eberth JM, Chowell G. A novel Bayesian spatio-temporal surveillance metric to predict emerging infectious disease areas of high disease risk. Stat Med. 2024;43(28):5300–15. pmid:39385731
  38. 38. Xue C, Liu B, Kui Y, Wu W, Zhou X, Xiao N, et al. Developing a geographical-meteorological indicator system and evaluating prediction models for alveolar echinococcosis in China. J Expo Sci Environ Epidemiol. 2025;35(2):254–63. pmid:38654145