A regressive analysis of the main environmental risk factors of human echinococcosis in 370 counties in China

Background Echinococcosis is a natural focal, highly prevalent disease in China. Factors influencing the spread of echinococcosis are related not only to personal exposure but also to the environment itself. The purpose of this study was to explore the influence of environmental factors on the prevalence of human echinococcosis and to provide a reference for future prevention and control of echinococcosis. Methods Data were collected from 370 endemic counties in China in 2018. Data on environmental factors, i.e., elevation, land surface temperature (LST) and normalized difference vegetation index (NDVI), were extracted from MODIS, DEM and other remote-sensing images from 2018. Rank correlation analysis was conducted between each environmental factor and the prevalence of echinococcosis at the county level. Negative binomial regression was used to analyze the impact of environmental factors on the prevalence of human echinococcosis at the county level. Results According to rank correlation analysis, the prevalence of human echinococcosis in each county was positively correlated with elevation, negatively correlated with LST, and negatively correlated with NDVI in May, June and July. Negative binomial regression showed that the prevalence of human echinococcosis was negatively correlated with annual LST and summer NDVI, and positively correlated with average elevation and dog infection rate. The prevalence of human cystic echinococcosis was inversely correlated with the annual average LST, and positively correlated with both the average elevation and the prevalence rate of domestic animals. The prevalence of human alveolar echinococcosis was positively correlated with both NDVI in autumn and average elevation, and negatively correlated with NDVI in winter. Conclusion The prevalence of echinococcosis in the population is affected by environmental factors. Environmental risk assessment and prediction can be conducted in order to rationally allocate health resources and improve the efficiency of both prevention and control of echinococcosis.


Introduction
Echinococcosis, also known as hydatidosis, is a zoonotic parasitic disease of worldwide concern caused by the larvae of Echinococcus. Two types of human echinococcosis are currently prevalent in China, namely, cystic echinococcosis (CE), which is caused by the larvae of Echinococcus granulosus, and alveolar echinococcosis (AE), which is caused by the larvae of Echinococcus multilocularis [1]. China is one of the countries with the highest prevalence in the world, mainly in pastoral and semi-pastoral areas in the west and north, leading to detrimental impacts on the health of local populations and on socio-economic development [2]. China accounts for 40% of CE cases and 91% of AE cases worldwide, and its disease burden of CE and AE accounts for 40% and 95% of the global disability-adjusted life years (DALYs), respectively [2][3]. In 2018, 47,278 echinococcosis cases were recorded in China, with 44,730,268 people exposed across 370 endemic counties in 10 provinces or autonomous regions. All 370 counties were endemic for CE, while 115 were also endemic for AE [4].
The prevalence of echinococcosis in China displays a strong spatial autocorrelation, with a spatial distribution that depends upon geographical, meteorological, biological and socio-economic factors [4][5]. Because it is a natural focal disease, its prevalence is closely related to the local natural environment. Environmental and ecological factors play a crucial role in the life cycle of Echinococcus, and these environmental factors can affect the spread of echinococcosis to humans [5]. Landscape factors are an important driving force for the spread of echinococcosis [6]. For example, elevation is positively correlated with the prevalence of CE [7][8]. The transmission cycle of E. multilocularis involves wild animals, with small rodents as intermediate hosts. The distribution of small mammals is related to the natural land cover, and changes in land cover affect their habitat [9]. The risk of human AE is related to the population density of small mammals [10][11]. Environmental factors directly or indirectly influence the survival rate of Echinococcus eggs, the distribution of wild animal populations, the spatial distribution of echinococcosis and the risk of disease in the human population [12]. Most research on influencing factors focuses on human behavior. Environmental factors commonly used in echinococcosis studies include elevation, temperature and vegetation index.
This study covered all endemic counties in China and studied the main environmental factors affecting the epidemic of echinococcosis, including elevation, land surface temperature (LST) and normalized difference vegetation index (NDVI). By using univariate rank correlation analysis and fitting a county-level negative binomial regression model, we aimed to understand the effects of elevation, LST, and NDVI on the prevalence of echinococcosis, providing a reference for better targeted prevention and control measures and for the rational allocation of health resources.

Ethics statement
This survey was approved by the Ethical Review Committee of the National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (No. 20160810). The performed activities were all within the scope of the national project for echinococcosis control. All participants were informed of the content and purpose of the investigation and examinations, and of the potential complications, consequences and benefits, before examination. Those who agreed to participate were required to sign written informed consent forms. All participants were given feedback. All patients diagnosed with echinococcosis provided written agreement to participate and were provided with free drug treatment or subsidized surgical costs.
With the progress of the Central Government's Transfer Payment Project for Echinococcosis Control (CGTPPEC), each county carried out population screening with a coverage rate of over 90%. The prevalence of echinococcosis was assessed in each township (town, street) by abdominal ultrasound examination of local residents aged 2 years and above. The diagnosis was made according to the WS 257 standard [13], based on B-ultrasound assisted by serological examination. The infection rate of Echinococcus in dogs was assessed by necropsy or arecoline hydrobromide catharsis on more than 20 dogs, including domestic and stray dogs, in each administrative village/community. In villages/communities with fewer than 20 dogs, all dogs were tested. Detection in dogs is mainly carried out through fecal antigen detection reagents [14]. Prevalence data and dog infection rates from the 370 endemic counties were obtained from the annual report system of the Annual Task of CGTPPEC in 2018.

Acquisition of environmental data
Elevation, LST and NDVI were extracted as follows. For the average elevation of each county, SRTM3 data with a spatial resolution of 90 m covering the whole area were downloaded. MODIS MOD11A2 data were downloaded from the NASA website (https://search.earthdata.nasa.gov); the spatial range and time period were set and the data were filtered before downloading. Cloud removal was performed on the satellite images and global data were synthesized. Mosaicking, projection, splicing, clipping, data inspection and zonal statistics were then carried out to extract the monthly LST of each endemic county. NDVI is a quantitative index of vegetation coverage and of the characteristics of vegetation change. NDVI data require the MODIS Reprojection Tool (MRT) for format and projection conversion. In order to eliminate the influence of outliers, the maximum value composite (MVC) method was used to synthesize the NDVI data, and the maximum monthly NDVI image was used to characterize the vegetation coverage. Furthermore, seasonal and annual average LST and NDVI were calculated as follows: spring LST and NDVI refer to the average of March, April and May; summer LST and NDVI refer to the average of June, July and August; autumn LST and NDVI refer to the average of September, October and November; and winter LST and NDVI refer to the average of December, January and February. A comprehensive information database was then constructed from these data.
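The seasonal aggregation and the maximum value composite described above can be illustrated with a minimal sketch. It is not the authors' processing chain (which relied on MRT and GIS zonal statistics); the function names and the monthly values are hypothetical, and it assumes that the December, January and February values all come from the same calendar year, which the text does not state.

```python
from statistics import mean

# Seasons as defined in the text (month numbers 1-12).
SEASONS = {
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
    "winter": (12, 1, 2),
}

def max_value_composite(observations):
    """Maximum value composite: keep the largest NDVI among the
    composites available for one county in one month."""
    return max(observations)

def seasonal_and_annual_means(monthly):
    """monthly maps month number (1-12) to a county-level monthly value."""
    result = {
        season: mean(monthly[m] for m in months)
        for season, months in SEASONS.items()
    }
    result["annual"] = mean(monthly[m] for m in range(1, 13))
    return result

# Hypothetical county: made-up monthly LST values in degrees Celsius.
lst = {1: -12.0, 2: -8.0, 3: -2.0, 4: 5.0, 5: 11.0, 6: 16.0,
       7: 19.0, 8: 17.0, 9: 11.0, 10: 3.0, 11: -5.0, 12: -11.0}
summary = seasonal_and_annual_means(lst)

# Hypothetical 16-day NDVI composites falling within one month.
monthly_ndvi = max_value_composite([0.31, 0.42, 0.38])
```

Taking the maximum rather than the mean within a month is what makes the composite robust to cloud-contaminated pixels, which bias NDVI downwards.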

Statistical analysis
The normality of the prevalence distribution of echinococcosis at the county level was assessed with the Shapiro-Wilk test, and the correlation of prevalence with the relevant environmental variables was analyzed. If the distribution was normal, the Pearson correlation coefficient was used to describe the correlation. If the data did not follow a normal distribution, the Spearman rank correlation coefficient was used. The data in this study displayed a variance higher than the mean, non-randomness and spatial autocorrelation of the distribution. Therefore, they were fitted with a negative binomial distribution. The negative binomial model is a generalized linear model with a logarithmic link for negative binomial random variables [15][16]. In order to test the overdispersion hypothesis and the preference for the negative binomial model over the Poisson model, a Lagrange multiplier test was used. The GENMOD procedure (PROC GENMOD) in SAS 9.1 (SAS Institute, Cary, NC) was used to model the prevalence at the county level (natural logarithm) by negative binomial regression. The normality test and rank correlation analyses were performed in IBM SPSS 19.0 (Statistical Package for the Social Sciences). A P value lower than 0.05 was defined as statistically significant.
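For readers unfamiliar with the rank correlation used here, the following is a minimal sketch of the Spearman coefficient: both variables are replaced by their ranks (ties share the mean rank) and the Pearson coefficient is computed on the ranks. The study itself used SPSS; this plain-Python version, its function names and the five made-up county values are illustrative assumptions only.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up example: average elevation (m) and prevalence (1/100000) in five counties.
elevation = [500, 1200, 2500, 3100, 4300]
prevalence = [10, 35, 30, 120, 400]
rho = spearman(elevation, prevalence)
```

Because only ranks enter the calculation, the coefficient is unaffected by the strong skew of the prevalence data, which is why it is preferred over the Pearson coefficient when normality is rejected.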

Normality of the distribution
The Shapiro-Wilk test showed that the prevalence rate of echinococcosis at the county level did not follow a normal distribution. The W value was 0.428 (p<0.05), indicating a skewed distribution, as shown in Fig 1. The highest LST was 39.47°C in Shanshan County, Turpan, Xinjiang in July, and the lowest LST was -27.68°C in Chenbarhu Banner, Hulun Buir, Inner Mongolia in January. The LST range was described in groups. The LST distribution in the 370 endemic counties is shown in Table 1 and Fig 2. The lowest NDVI was 0.00015 in Fuhai County, Altay Prefecture, Xinjiang in January, while the highest NDVI was 0.86 in Daguan County, Zhaotong City, Yunnan in July. The NDVI range was described in groups. The distribution of NDVI in the 370 endemic counties in each province is shown in Table 2 and Fig 3.

Rank correlation between environmental variables and prevalence of echinococcosis at the county level
Rank correlation analysis was carried out between the monthly, seasonal and annual average LST and NDVI and the county-level prevalence in the population, as shown in Table 3. For LST, except for February and December, the LST of all months, of spring, summer, autumn and winter, and the annual average LST were significantly negatively correlated with the prevalence of echinococcosis (P <0.05). In 2018, the average NDVI across the 370 endemic counties was lowest in January (0.1468), gradually increased from January to August to reach a maximum of 0.4577, and then gradually decreased. A positive rank correlation was found in January and February (P <0.05), with a coefficient of 0.115 in both months, while a negative correlation was found in April, May and June (P <0.05).
The lowest elevation among the 370 counties is 178.69 m, in Horqin District, Tongliao City, Inner Mongolia Autonomous Region. The highest elevation is 5155.67 m, in Ritu County, Ali Prefecture, Tibet Autonomous Region. Most endemic counties are located between 2000 and 3000 m (Table 4). Endemic areas in Inner Mongolia, Ningxia and Shaanxi were all at low elevation (less than 3000 m), whereas most endemic counties in Tibet were located in high-elevation areas. All elevation measures, i.e., lowest elevation, average elevation and highest elevation, were positively correlated with prevalence (P <0.05), with rank correlation coefficients of 0.568, 0.649 and 0.600, respectively. With respect to correlation analysis between elevation and dog infection rate,

Fitting negative binomial model
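The mean human echinococcosis prevalence (1/100000) in the 370 counties was 283.9297779, with a variance of 701.6497667². Given that the variance was far higher than the mean, the data were overdispersed. Thus, we chose negative binomial regression to explore the impact of environmental factors on the prevalence of human echinococcosis. Taking into account the consistent trend of LST across all seasons, which decreases with increasing elevation, we included the average annual LST in the negative binomial regression model. In contrast, due to the inconsistent seasonal trends of NDVI, we incorporated the spring NDVI, summer NDVI, autumn NDVI, and winter NDVI separately into the negative binomial regression model. Subsequently, the human echinococcosis prevalence (1/100000) was taken as the dependent variable, and the annual mean LST, spring NDVI, summer NDVI, autumn NDVI, winter NDVI, average elevation, and infection rate of dogs determined from an epidemiological survey were taken as independent variables to fit the negative binomial regression model. Firstly, we fitted a multivariate linear regression to check for multicollinearity among the independent variables. A tolerance of less than 0.1 or a variance inflation factor (VIF) of more than 10 signifies severe multicollinearity. As shown in the left column of Table 5, there was severe multicollinearity among the independent variables. As shown in the right column of Table 5, after eliminating certain variables, we were able to solve the problem of multicollinearity. A negative binomial regression was performed with the human echinococcosis prevalence (1/100000) as the dependent variable and the remaining environmental factors as independent variables (Table 6).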
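The two screening checks used in this section, overdispersion and multicollinearity, can be sketched as follows. This is an illustrative sketch rather than the authors' SAS or SPSS procedure. It assumes that the reported "701.6497667 2" denotes a standard deviation squared, and that VIF is derived from the R² obtained by regressing one predictor on all the others; the function names and the example R² values are hypothetical.

```python
def dispersion_ratio(mean_value, variance):
    """Variance-to-mean ratio; values well above 1 suggest overdispersion
    relative to a Poisson model, for which the ratio is 1."""
    return variance / mean_value

def vif_from_r_squared(r_squared):
    """Variance inflation factor of a predictor, given the R^2 from
    regressing that predictor on all other predictors."""
    return 1.0 / (1.0 - r_squared)

def is_severely_collinear(r_squared):
    """Apply the thresholds used in the study: tolerance < 0.1 or VIF > 10.
    Tolerance is the reciprocal of VIF, so the two rules coincide."""
    vif = vif_from_r_squared(r_squared)
    tolerance = 1.0 / vif
    return tolerance < 0.1 or vif > 10

# Reported county-level summary statistics (prevalence per 100000),
# reading the variance as 701.6497667 squared (an assumption).
ratio = dispersion_ratio(283.9297779, 701.6497667 ** 2)

# Hypothetical R^2 values for two predictors.
strongly_related = is_severely_collinear(0.95)  # VIF = 20
weakly_related = is_severely_collinear(0.50)    # VIF = 2
```

Under that reading the variance exceeds the mean by three orders of magnitude, which is consistent with the choice of a negative binomial rather than a Poisson model.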
Among them, annual average LST, spring NDVI, summer NDVI, average elevation and the infection rate of dogs were statistically significant. Prevalence data of human echinococcosis in the 370 counties were overdispersed, with a dispersion of 2.822, and the Lagrange multiplier test had no statistical significance, indicating that the data did not follow a Poisson distribution. It was thus reasonable to fit the negative binomial regression model. Logarithmic conversions of meaningful variables are shown in Table 7. The fitted negative binomial regression equation is: Assuming that other variables remained constant, the mean prevalence of human echinococcosis (1/100000) decreased by 15.5407% when annual average LST increased by 1°C. Similarly, the mean prevalence of human echinococcosis (1/100000) decreased by 83.541% when NDVI increased by one unit in summer. Conversely, the mean prevalence of human echinococcosis (1/100000) increased by 0.090041% for each increase of 1 m in average elevation. Furthermore, for every 1% increase in the infection rate in dogs, the mean prevalence of human echinococcosis (1/100000) increased by 5.274391%. By repeating the above steps, we fitted separate negative binomial regressions to the prevalence rates of human AE and CE in the 370 counties.
Due to various constraints such as funding, we used the enzyme-linked immunosorbent assay (ELISA) to detect canine Echinococcus infection. However, ELISA can only confirm whether the dogs are infected with Echinococcus spp., without distinguishing between E. granulosus and E. multilocularis infections. Therefore, we did not include the infection rate of dogs in the models when fitting negative binomial regressions for AE and CE. For the prevalence of human cystic echinococcosis, we additionally included the prevalence rate of domestic animals in the model. Negative binomial regression was fitted with the human CE prevalence (1/100000) as the dependent variable, and the annual average LST, spring NDVI, summer NDVI, autumn NDVI, winter NDVI, average elevation, and the prevalence rate of domestic animals as independent variables. The final results are shown in Table 8. The fitted negative binomial regression equation is: The annual average LST was inversely correlated with the prevalence of human CE. When other variables were held constant, for each increase of 1°C in the annual mean LST, the mean prevalence of human CE (1/100000) decreased by 13.40256588%. Conversely, both the average elevation and the prevalence rate of domestic animals were positively correlated with the prevalence of human CE.
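The percentage changes reported for these models follow from the logarithmic link: a coefficient β multiplies the expected prevalence by exp(β) for each one-unit increase in the predictor, so the percentage change is (exp(β) - 1) × 100. The sketch below illustrates this relationship only. The coefficients are back-calculated here from the reported percentages and are not taken from the regression tables, which are not reproduced in this text; the function names are hypothetical.

```python
from math import exp, log

def percent_change(beta, delta=1.0):
    """Percentage change in expected prevalence when a predictor
    increases by `delta` units, under a log-link model."""
    return (exp(beta * delta) - 1.0) * 100.0

def implied_coefficient(pct_per_unit):
    """Coefficient implied by a reported percentage change per unit."""
    return log(1.0 + pct_per_unit / 100.0)

# Back-calculated from the reported effects (an illustration, not Table 7).
beta_lst = implied_coefficient(-15.5407)        # per 1 degree Celsius of annual LST
beta_elevation = implied_coefficient(0.090041)  # per 1 m of average elevation

# Effects compound multiplicatively: 100 m of elevation is not simply
# 100 times the per-metre percentage.
change_per_100m = percent_change(beta_elevation, delta=100)
```

The multiplicative form matters when comparing counties: a small per-metre effect accumulates to a substantial difference over the several thousand metres of elevation spanned by the endemic counties.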
Negative binomial regression was fitted with the human AE prevalence (1/100000) as the dependent variable, and the annual average LST, spring NDVI, summer NDVI, autumn NDVI, winter NDVI, and average elevation as independent variables. Results are shown in Table 9. The fitted negative binomial regression equation is: The average elevation was positively correlated with the prevalence of human AE. When all other variables were kept constant, for each increase of one meter in the average elevation, the mean prevalence of human AE (1/100000) increased by 0.230264703 percent. The NDVI was positively correlated with human AE prevalence in autumn, whereas it was inversely correlated with human AE prevalence in winter.

Discussion
Echinococcus requires two mammalian hosts to complete its life cycle [5]. The definitive hosts, dogs, excrete feces containing Echinococcus eggs, which pollute local water sources, food and pasture. The intermediate hosts of Echinococcus, livestock, are infected during grazing. The viscera of infected livestock are fed to dogs, and the parasite develops into the adult stage in dogs, completing the transmission cycle. Transmission of E. multilocularis occurs between stray dogs or foxes as definitive hosts and small rodents such as voles or pikas as intermediate hosts [17]. The spatial overlap and predation relationship between the definitive host and the intermediate host are related to landscape factors, which can directly affect the spread of Echinococcus. Elevation, LST and NDVI are the main landscape factors affecting the prevalence of echinococcosis [12,18,19].
Temperature and humidity are the main determinants of the survival rate of parasite eggs in the environment [20][21]. Echinococcus eggs are sensitive to high temperature but resistant to cold. Populations in areas with low LST are more likely to be infected with echinococcosis [22]. LST reflects the change of temperature in the environment [12]. Our results indicated a significant negative correlation between the prevalence of human echinococcosis and the mean LST for each season, as well as the annual average LST. In the negative binomial regression model, annual average LST was significantly negatively correlated with the prevalence of human echinococcosis. Similarly, the prevalence of human CE also showed a significant negative correlation with the annual average LST. Some studies have found a negative correlation between spring LST and CE prevalence [23]. This shows that LST is the main environmental factor affecting the prevalence of echinococcosis in the population. The lower the LST, the higher the risk for the population. Temperature also affects the geographical distribution and composition of small mammal communities [24][25]. Climate has been identified as a factor leading to changes in the distribution and number of red foxes, which are definitive hosts of E. multilocularis [26]. In Western China, the elevation is high, with a typical plateau and mountain climate in which the LST stays low all year round. These climatic conditions are very conducive to the survival of Echinococcus eggs [6]. Early winter and early spring are high-incidence seasons for Echinococcus infection in dogs. During the traditional Chinese Spring Festival, the number of slaughtered livestock increases, making domestic dogs more likely to come into contact with viscera [27]. Extreme cold weather often occurs in this season, which may cause intermediate host animals to freeze to death in the wild, directly strengthening the transmission cycle in the field.
In our work, the average elevation was positively correlated with the prevalence of human echinococcosis, whether AE or CE. This confirms previous studies [6,8,18]. The higher the elevation, especially above 3000 meters, the smaller the proportion of agricultural production and the larger the proportion of animal husbandry, with grasslands gradually replacing farmland. An area with a large proportion of grassland not only supports more intermediate hosts and livestock but also provides a suitable environment for them. At the same time, the number of definitive hosts such as dogs also increases sharply, because herdsmen raise dogs to protect livestock during grazing. Together, all these factors contribute to a significant increase in the total number of hosts and strengthen the transmission cycle of Echinococcus.
Previous studies on echinococcosis have confirmed that changes in land cover are related to increases in the population density of the critical intermediate hosts of Echinococcus [28]. NDVI is an important factor affecting the distribution of the animal hosts of Echinococcus [29]. The winter prevalence of echinococcosis in Western China is positively correlated with NDVI [23]. However, in winter, most of the high-elevation areas where cases are concentrated are covered by snow and ice for extended periods, resulting in low NDVI in these areas. Therefore, there is no significant relationship between NDVI in winter or December and the prevalence of human echinococcosis in this study. In summer, pastures lead to an increase in NDVI values. In autumn, as the temperature decreases, pastures and farmlands become barren, which is not conducive to the growth of small rodents or the grazing of livestock. Grassland vegetation coverage directly affects the distribution of intermediate hosts, and livestock play an important ecological role in the spread of CE [28]. Grassland is one of the living conditions for the growth and reproduction of horses, cattle, sheep and other intermediate hosts [29][30]. In pastoral areas, when yaks, sheep, horses and other livestock graze together, the grassland coverage and vegetation height are reduced, which makes the population density of plateau pika higher than in natural grassland [31]. There is a significant positive correlation between the prevalence of AE and the forest, grassland and shrub vegetation near villages, and a negative correlation with the cultivated land area [32][33]. A study in France showed that the population of voles in these areas erupted periodically and the population density increased sharply [30]. It was also reported in China that the prevalence of AE in the human population increased in areas with a high proportion of meadows, while very few cases have been found in areas with poor vegetation (marshland) [6]. Lowland pastures are described as heavily grazed pastures scattered with forest or shrub cover, which is related to a high human incidence rate [10].
In 1999, the project of returning farmland to forests was implemented to restore the previous ecological environment through three types of land transformation, namely, farmland to grassland, farmland to forest and wasteland to forest. These changes in land cover are likely to promote the spread of E. multilocularis by increasing the density and distribution of small mammals [34]. With the process of deforestation, the increase of grasslands or shrubs is conducive to the creation of near-domestic habitats for small mammals and the development of near-domestic cycles involving dogs. The distribution of small mammals in Gansu Province is also due to the short-term increase of grasslands and shrubs after deforestation [27,35]. In eastern France, vole population outbreaks have been reported in areas where cultivated land has been converted to permanent grassland [28]. Reforested lowland pastures are by definition covered with forests or shrubs, leading to a higher prevalence of human AE [10]. A study in Ningxia Hui Autonomous Region showed that the abundance of degraded lowland pastures is related to a higher prevalence of AE [10]. Landscape features may directly or indirectly determine the feeding behavior, growth rate, reproductive efficiency and immune mechanisms of livestock [36]. Grazing or trampling affects the quality of grassland and the length of forage grass, which provides a better habitat for small mammals [37].
The geographical distribution of echinococcosis in China is uneven. The Qinghai Tibet Plateau, with an average elevation of more than 4000 m, is a hot spot of the echinococcosis epidemic in China [4]. The Qinghai Tibet Plateau is generally a dry and cold tundra, but conditions differ between regions. The annual precipitation in the western region is less than 100 mm, making it dry and sparsely populated, while the annual precipitation in the eastern region is 500-700 mm, which can sustain more livestock and a higher population density [30]. The cultivated area in the Qinghai Tibet Plateau accounts for less than 1% of the surface and is restricted to regions lower than 3500 m. The main vegetation is grass and sedge (excluding Carex, Ceratoides, Ferns and Kobresia), and pastures account for most of the area, including alpine meadows and alpine grasslands, with a total surface of 8.7 million hectares [38]. These natural conditions lead to a higher prevalence of echinococcosis in the Qinghai Tibet Plateau. This region is mostly devoted to animal husbandry and its socio-economic development is relatively lagging. Basic healthcare is far from perfect and awareness of disease prevention is relatively poor. Moreover, increased host range and enhanced parasitic transmission between definitive and intermediate hosts, caused by environmental changes, might put humans at risk of increased echinococcosis transmission [39]. Collectively, these factors contribute to the spatial distribution characteristics of echinococcosis. Additional knowledge about how environmental conditions affect E. granulosus transmission would be helpful in planning CE control and management programs [40]. Since dogs are the definitive hosts of E. granulosus and E. multilocularis [41] and humans are accidental intermediate hosts who ingest contaminated food or water [42], domestic dogs and stray dogs in villages are the primary source of infection [43].
This study has several limitations. Firstly, it explored the environmental risk factors of human echinococcosis at the county level, which is a relatively broad scale. In the future, if possible, research could be conducted at the township or even village level. Secondly, the incubation period, as a confounding factor, might have some impact on the results of this study. However, considering that environmental factors do not vary greatly in the same region across different years, our results are still reasonable. Thirdly, fecal ELISA detection lacks specificity, so using ELISA to determine the infection rate in dogs has its limitations. In the future, further PCR testing should be conducted on dogs that test positive with ELISA to determine whether they are infected by E. granulosus or E. multilocularis.

Conclusion
This article addressed the impact of environmental factors on population prevalence in all endemic counties. The prevalence of human hydatid disease is affected by environmental factors, such as annual average LST, spring NDVI, summer NDVI and average elevation. It is necessary to pay special attention to areas where these factors indicate high risk, strengthen environmental risk monitoring, carry out targeted prevention and control, and rationally allocate health resources. We cannot reduce the prevalence of echinococcosis by intervening in environmental factors. However, our research can indicate in which environments the prevalence of echinococcosis will be more severe, so that prevention and control can be focused there.

Table 4. Distribution of different elevations in endemic areas. https://doi.org/10.1371/journal.pntd.0012131.t004