Using zero-inflated and hurdle regression models to analyze schistosomiasis data of school children in the southern areas of Ghana

Kojo Nketia; Dziedzom K. de Souza

doi:10.1371/journal.pone.0304681

Abstract

Background

Schistosomiasis is a neglected disease prevalent in tropical and sub-tropical areas of the world, especially in Africa. Diagnosis is based on the detection of parasite eggs in the stool or urine of children and adults. Data collected in such studies typically include many uninfected individuals, leading to a high degree of zero inflation; count data with excess zeros are therefore common in practice. The purpose of this analysis is to apply statistical models to such count data and evaluate their performance and results.

Methods

This is a secondary analysis of previously collected data. As part of a modelling process, the Poisson regression, negative binomial regression and their associated zero-inflated and hurdle models were compared to determine which offered the best fit to the count data.

Results

Overall, 94.1% of the 1345 study participants tested did not have any schistosomiasis eggs, resulting in high zero inflation. The negative binomial regression models (hurdle negative binomial (HNB), zero-inflated negative binomial (ZINB) and the standard negative binomial) performed better than the Poisson-based regression models (Poisson, zero-inflated Poisson, hurdle Poisson). The best models were the ZINB and HNB, and their performances were indistinguishable according to the information criteria.

Conclusion

The zero-inflated negative binomial and hurdle negative binomial models were found to be the most satisfactory fit for modelling the over-dispersed zero inflated count data and are recommended for use in future statistical modelling analyses.

Citation: Nketia K, de Souza DK (2024) Using zero-inflated and hurdle regression models to analyze schistosomiasis data of school children in the southern areas of Ghana. PLoS ONE 19(7): e0304681. https://doi.org/10.1371/journal.pone.0304681

Editor: Jean Coulibaly, Universite Felix Houphouet-Boigny, SWITZERLAND

Received: November 17, 2023; Accepted: May 16, 2024; Published: July 12, 2024

Copyright: © 2024 Nketia, de Souza. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting information files. The python, R codes and data for model comparison and analysis can be found at https://github.com/Kojo-Nketia/Using-hurdle-models-to-analyze-schistosomiasis-count-data.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Schistosomiasis is a parasitic disease caused by trematode parasites of the genus Schistosoma [1]. It affects approximately 251.4 million people globally across 78 countries, with an estimated 90% of cases in Africa [2, 3]. The main causes of schistosomiasis are Schistosoma haematobium and S. mansoni [4]. S. haematobium causes genitourinary disease, while S. mansoni causes hepatobiliary schistosomiasis. Complications of schistosomiasis may include liver fibrosis, varices, and bladder carcinoma. Additionally, the infection is linked to anemia, nutritional deficiencies, and stunted growth. It also has negative impacts on cognitive development, reducing physical activity, school performance, work capacity, and productivity [5]. Schistosomiasis results in high morbidity and a socio-economic burden for affected individuals and communities, with risk factors including poor sanitation, prolonged exposure to infested freshwater bodies, and occupational hazards due to farming and fishing [6, 7].

In order to show the full burden of infection due to schistosomiasis, it is crucial to understand the epidemiology of the disease. The number of parasitic worms present in a group of people determines the transmission intensity of schistosomiasis, which is indirectly measured by the quantity of discharged eggs in stool or urine [8]. Appropriate statistical models can be used to study the epidemiology of infection. However, while using statistical techniques to analyze count data, methodological issues and the nature of the data must be taken into consideration [9].

Count data are common in health science research, especially when researchers investigate topics such as parasitic infections, COVID-19 among students, and the frequency of serious incidents in medical laboratories [10]. Count data can be assumed to follow a probability distribution such as the Poisson. Count data, however, can also display more variability than the Poisson allows, known as overdispersion (i.e., variance > mean) [11]. Often, a Poisson or negative binomial (NB) model cannot adequately account for the number of zeros in the sample, which may distort test statistics, goodness-of-fit measures and estimated standard errors [12]. Zero-inflated (ZI) and hurdle models are two classes of models that are often used to account for zero-inflated datasets [9]. Further, hurdle models are an extension of the zero-inflated (ZI) models. Hurdle models are also known as two-part models [13], where the first part is a Bernoulli probability (for zero versus non-zero counts) and the second part, which handles positive counts, is a zero-truncated Poisson (ZTP) or zero-truncated negative binomial (ZTNB) distribution [10]. Previous studies have demonstrated that if excessive zeros are not taken into consideration, both the zero and non-zero counts are fitted poorly [14].

The approach used to model zero counts is the primary difference between ZI and hurdle models. Zero observations in a ZI model [15] might come either from “sampling zeros” that are part of the underlying sampling distribution (Poisson or negative binomial) or from “structural zeros” that cannot score anything other than zero [16]. That is to say, the total number of zeros is divided into excess zeros and zeros generated by the distribution. Whereas the ZI models assume that both sources of zeros are involved, hurdle models assume only structural sources [10]. The typical Poisson or negative binomial distribution is used for the sampling zeros, and it is presumed that they occur by chance [17]. This is the nature of the schistosomiasis count data, especially in low prevalence settings.

The purpose of this study is to assess the performance of the Poisson and NB models and their related ZI and hurdle models under different sets of predictors and levels of schistosomiasis infection intensity, in order to inform the effective analysis of such data.

Materials and methods

This is a secondary analysis of de-identified data collected within a larger epidemiological study on allergy and parasitic infections (“GLOFAL GHANA”) [18, 19]. The data comprise 1345 participants (635 male, 710 female) from 12 communities in southern Ghana. These communities were grouped into two areas: rural and urban. The age range of participants was 4–21 years, distributed across nursery and kindergarten (considered as pre-school), primary and junior high school. The ages were grouped into three age groups, with group one consisting of children aged 4–9, group two of children aged 10–15 and group three of children aged 16–21. Parents’ occupations were grouped into the main occupational activities of farming and fishing, and others (including teaching, driving, masonry, trading, etc.). Water sources used by the communities are pipe-borne water, tanker (treated), tanker (untreated), river/stream and well/borehole. The count data are available in S1 Appendix in the supporting information section below.

Demographic covariates

The covariates considered for this analysis are age group, sex, educational level, area (i.e. rural or urban), the occupation of parents, and sources of water. These covariates have been identified in the literature as predictors of schistosomiasis infection [20–22]. Taken as predictors, they formed the regression matrix for the models used. Information about these covariates is listed in Table 1.

Table 1. Data summary.

https://doi.org/10.1371/journal.pone.0304681.t001

Statistical methods

In a generalized linear model (GLM), the response variable is related to a linear combination of the regression variables through a non-linear link function g [23],

$$g\big(E(y \mid x)\big) = b_0 + b_1 x_1 + \dots + b_n x_n. \tag{1}$$

Poisson regression is a type of GLM used to model discrete count data whereby the link function is logarithmic (so that the mean is an exponential function of the linear predictor) and E(y|x) ≔ λ represents the expected number of counts given X and is expressed as

$$\lambda = \exp(b_0 + b_1 x_1 + \dots + b_n x_n). \tag{2}$$

Inverting this, the problem reduces to standard linear regression against the logarithm of the mean [24],

$$\log(\lambda) = b_0 + b_1 x_1 + \dots + b_n x_n, \tag{3}$$

where b₀ is the intercept and b₁, …, b_n are the slopes, which we fit to minimize the residual errors. The probability of observing the count y_i is Poisson distributed as per the following probability mass function:

$$P(Y = y_i) = \frac{e^{-\lambda}\lambda^{y_i}}{y_i!}, \quad y_i = 0, 1, 2, \dots \tag{4}$$
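As an illustration of the log link and the Poisson probability mass function described above, the following minimal sketch evaluates the mean and the probability of a zero count for two covariate patterns. The coefficient values and the single "rural" indicator are hypothetical and are not estimates from this study.

```python
import math

def poisson_mean(coefs, x):
    """Mean under a log link: lambda = exp(b0 + b1*x1 + ... + bn*xn)."""
    b0, slopes = coefs[0], coefs[1:]
    return math.exp(b0 + sum(b * xi for b, xi in zip(slopes, x)))

def poisson_pmf(y, lam):
    """P(Y = y) = exp(-lam) * lam**y / y!, computed on the log scale."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

# Hypothetical coefficients: intercept and one slope for a rural indicator.
coefs = [-1.0, 0.5]
lam_rural = poisson_mean(coefs, [1])
lam_urban = poisson_mean(coefs, [0])

# With a log link, a slope acts multiplicatively on the mean.
rate_ratio = lam_rural / lam_urban        # equals exp(0.5)
p_zero_urban = poisson_pmf(0, lam_urban)  # equals exp(-lam_urban)
```

The multiplicative interpretation of the slopes is the practical consequence of regressing the logarithm of the mean rather than the mean itself.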

For more complex models, where the Poisson assumption does not hold, there may be more parameters, such as the overdispersion parameter α or the zero-inflation probability π, each of which can be fitted with its own link function in a similar way [17].

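For a given mean, the negative binomial distribution has a higher variance than the Poisson distribution [25]. Its probability mass function is expressed as

$$P(Y = y_i) = \frac{\Gamma(y_i + 1/\alpha)}{\Gamma(y_i + 1)\,\Gamma(1/\alpha)} \left(\frac{1}{1 + \alpha\lambda}\right)^{1/\alpha} \left(\frac{\alpha\lambda}{1 + \alpha\lambda}\right)^{y_i}. \tag{5}$$

The mean and variance of the negative binomial distribution are E(y) = λ and Var(y) = λ(1 + αλ), where α is the dispersion parameter; as α → 0, the negative binomial approaches the Poisson [24].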
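The stated mean and variance can be checked numerically. The sketch below assumes the common parameterization of the negative binomial in terms of the mean λ and dispersion α (with 1/α playing the role of the size parameter); the parameter values are illustrative only.

```python
import math

def nb_pmf(y, lam, alpha):
    """Negative binomial pmf with mean lam and variance lam * (1 + alpha * lam)."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
             + r * math.log(1.0 / (1.0 + alpha * lam))
             + y * math.log(alpha * lam / (1.0 + alpha * lam)))
    return math.exp(log_p)

lam, alpha = 2.0, 0.5  # illustrative values, not estimates from the study
probs = [nb_pmf(y, lam, alpha) for y in range(200)]

# Moments computed directly from the pmf (the tail beyond 200 is negligible here).
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))

# For very small alpha the pmf is close to the Poisson pmf with the same mean.
nb_near_poisson = nb_pmf(3, lam, 1e-6)
poisson_at_3 = math.exp(-lam) * lam ** 3 / math.factorial(3)
```

Here the variance (4) is twice the mean (2), which is the extra spread that α contributes relative to the Poisson.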

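Zero-inflated models introduce extra probability mass to account for the excess zeros in the outcome. This results in a two-state mixture distribution, with the PMF of the ZIP model defined by

$$P(Y = y_i) = \begin{cases} \pi + (1 - \pi)\,e^{-\lambda}, & y_i = 0, \\ (1 - \pi)\,\dfrac{e^{-\lambda}\lambda^{y_i}}{y_i!}, & y_i > 0. \end{cases} \tag{6}$$

Where both overdispersion and excess zeros occur in the data, an alternative to the ZIP model is recommended. The zero-inflated negative binomial (ZINB) model is given by the equation

$$P(Y = y_i) = \begin{cases} \pi + (1 - \pi)\,(1 + \alpha\lambda)^{-1/\alpha}, & y_i = 0, \\ (1 - \pi)\,\dfrac{\Gamma(y_i + 1/\alpha)}{\Gamma(y_i + 1)\,\Gamma(1/\alpha)} \left(\dfrac{1}{1 + \alpha\lambda}\right)^{1/\alpha} \left(\dfrac{\alpha\lambda}{1 + \alpha\lambda}\right)^{y_i}, & y_i > 0, \end{cases} \tag{7}$$

where y_i is the observed count and π is the probability of non-infection. Eq (7) models two processes: the first generates the excess zeros, with probability π, and the second uses the negative binomial to generate counts (including zeros).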
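The two-process structure described above can be sketched as a generic mixture around any base count distribution. This is a minimal illustration using a Poisson base; the function names and parameter values are assumptions made for the example, not part of the original analysis.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability mass function."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def zero_inflated_pmf(y, pi, base_pmf):
    """Two-state mixture: an excess zero with probability pi,
    otherwise a count (possibly zero) drawn from base_pmf."""
    if y == 0:
        return pi + (1.0 - pi) * base_pmf(0)
    return (1.0 - pi) * base_pmf(y)

# Illustrative parameter values only (not estimates from the study).
pi, lam = 0.9, 3.0

def zip_pmf(y):
    return zero_inflated_pmf(y, pi, lambda v: poisson_pmf(v, lam))

p_zero = zip_pmf(0)                          # excess zeros plus Poisson zeros
total = sum(zip_pmf(y) for y in range(200))  # sums to 1 up to rounding
```

Passing a negative binomial pmf as `base_pmf` instead of the Poisson one would give the corresponding ZINB probabilities.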

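In contrast, the hurdle models also have two components, the zero counts and the positive counts, for both the Poisson and negative binomial models [9]. The first component handles the zero counts and the other deals with the positive counts, using the zero-truncated Poisson or NB distribution [17]. In Eqs (8) and (9) below, (1 − π) is the probability of overcoming the “hurdle” and generating a non-zero count [26]. The hurdle Poisson model is expressed as

$$P(Y = y_i) = \begin{cases} \pi, & y_i = 0, \\ (1 - \pi)\,\dfrac{e^{-\lambda}\lambda^{y_i}}{(1 - e^{-\lambda})\,y_i!}, & y_i > 0. \end{cases} \tag{8}$$

For the negative binomial, it is

$$P(Y = y_i) = \begin{cases} \pi, & y_i = 0, \\ \dfrac{1 - \pi}{1 - (1 + \alpha\lambda)^{-1/\alpha}}\,\dfrac{\Gamma(y_i + 1/\alpha)}{\Gamma(y_i + 1)\,\Gamma(1/\alpha)} \left(\dfrac{1}{1 + \alpha\lambda}\right)^{1/\alpha} \left(\dfrac{\alpha\lambda}{1 + \alpha\lambda}\right)^{y_i}, & y_i > 0, \end{cases} \tag{9}$$

where y_i is the observed count taking values y = 0, 1, 2, … and π is the probability of non-infection. The second branches of the piecewise functions in Eqs (8) and (9) are the zero-truncated Poisson and zero-truncated negative binomial distributions respectively.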
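The hurdle construction can be sketched in the same way: all zeros come from the binary part, and positive counts come from the base distribution rescaled to exclude zero. In this minimal example π is set to the observed share of zeros reported in this study (1265 of 1345), as an intercept-only hurdle would estimate it; λ and α are illustrative values only.

```python
import math

def nb_pmf(y, lam, alpha):
    """Negative binomial pmf with mean lam and dispersion alpha."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
             + r * math.log(1.0 / (1.0 + alpha * lam))
             + y * math.log(alpha * lam / (1.0 + alpha * lam)))
    return math.exp(log_p)

def hurdle_pmf(y, pi, base_pmf):
    """Zero with probability pi; positive counts from the zero-truncated base."""
    if y == 0:
        return pi
    return (1.0 - pi) * base_pmf(y) / (1.0 - base_pmf(0))

pi = 1265 / 1345        # observed proportion of zeros in the study
lam, alpha = 5.0, 0.5   # illustrative values, not fitted estimates

def hnb_pmf(y):
    return hurdle_pmf(y, pi, lambda v: nb_pmf(v, lam, alpha))

expected_zeros = 1345 * hnb_pmf(0)
total = sum(hnb_pmf(y) for y in range(400))
```

Because the zero probability is modelled directly, the expected number of zeros matches the observed number regardless of the count parameters, which is the behaviour reported for the hurdle models in this study.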

The parameters of each model under different socio-demographic predictors were estimated by maximum likelihood estimation (MLE). The binary component of the HNB model, which determines whether an observed count is zero (negative infection) or non-zero (positive infection), uses a logit link [10],

$$\operatorname{logit}(\pi) = \log\left(\frac{\pi}{1 - \pi}\right) = \gamma_0 + X\gamma, \tag{10}$$

and the regression model for the schistosomiasis positive counts as the dependent variable is

$$\log(\lambda) = \beta_0 + X\beta, \tag{11}$$

where X is the regression variable matrix, γ₀ and β₀ are intercepts, and γ and β are the vectors of parameters.

Models comparison

Several tests have been used for comparing models. An example is the Vuong test [27] for non-nested models, such as ZINB and HNB, which has been used in several applications. Alternatives include the Akaike and Bayesian information criteria and an approach that embeds the alternative models in an artificial compound model [26, 28]. We used the Akaike and Bayesian information criteria, in preference to the other tests, to compare the performance of the models against each other. The Akaike information criterion (AIC) is likely to perform well when heterogeneity is small. However, when heterogeneity is large, which may result in overfitting, the Bayesian information criterion (BIC) will often perform better because of the stronger penalty provided [29]. By adding a stronger penalty term for the number of parameters in the model, the BIC addresses this issue. The AIC and BIC were used to test the goodness-of-fit of the models [30]. The mathematical formulation of the AIC and BIC, with their basic properties, is presented in Jiawei’s paper [31]. A lower AIC or BIC value indicates a better model fit, meaning that the data are more likely to have been derived from the distribution in question. Both criteria are based on the MLE method. The formula for the AIC is

$$\mathrm{AIC} = 2k - 2\ln(\hat{L}),$$

where $\ln(\hat{L})$ represents the log-likelihood of the data under the model, and k is the number of model parameters, that is, the number of variables plus the intercept in the model [32]. The BIC is formally defined as

$$\mathrm{BIC} = k\ln(n) - 2\ln(\hat{L}),$$

where $\hat{L}$ is the maximized value of the likelihood function, n is the sample size and k is the number of parameters estimated by the model.
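The two criteria are simple functions of the maximized log-likelihood, which the following sketch makes explicit. As a back-of-envelope check, it also solves BIC − AIC = k(ln n − 2) for k using the rounded HNB values reported in the Results; the implied parameter counts and log-likelihood are derived here from rounded figures and are not quantities reported by the authors.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return k * math.log(n) - 2 * loglik

def implied_k(aic_value, bic_value, n):
    """Solve BIC - AIC = k * (ln(n) - 2) for k."""
    return (bic_value - aic_value) / (math.log(n) - 2)

n = 1345
k_ten = implied_k(1207, 1358, n)   # HNB, ten predictors (rounded values from the text)
k_five = implied_k(1203, 1302, n)  # HNB, five predictors

# Log-likelihood implied by AIC = 1207 with k = 29 (an inference, not a reported value).
loglik_ten = (2 * 29 - 1207) / 2
```

Under these assumptions the ten-predictor HNB model has roughly 29 estimated parameters and the five-predictor model roughly 19, and the larger BIC penalty per parameter (ln 1345 ≈ 7.2 versus 2 for the AIC) explains why the BIC gap between the two predictor sets is much wider than the AIC gap.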

The models compared are the Poisson, NB, ZIP, ZINB, HP and HNB. If the difference in AIC or BIC values between competing models is less than 2.5, the models cannot be distinguished, whereas a difference exceeding 10 indicates substantial evidence in favour of the model with the lowest criterion [9]. If the difference between two models lies in the range 2.5–6, the preferred model is the one with the smallest value, provided the sample size n > 256. Likewise, if the difference lies in the range 6–9, the preferred model is the one with the lowest value, provided n > 64 [30].
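The decision rule above can be written as a small function. This is a sketch under stated assumptions: the text does not say how the range boundaries or differences between 9 and 10 are handled, so the boundary conventions and the returned labels below are choices made for illustration.

```python
def compare_ic(ic_a, ic_b, n):
    """Classify the difference between two AIC (or BIC) values.

    Boundary handling (e.g. whether exactly 6 belongs to the lower or
    upper range) is an assumption; the source gives only the ranges.
    """
    diff = abs(ic_a - ic_b)
    if diff < 2.5:
        return "indistinguishable"
    if diff <= 6:
        return "prefer lower" if n > 256 else "inconclusive"
    if diff <= 9:
        return "prefer lower" if n > 64 else "inconclusive"
    if diff > 10:
        return "substantial evidence for lower"
    return "not covered by the stated rule"  # 9 < diff <= 10

# ZINB vs HNB AIC with ten predictors, values as reported in the Results.
verdict = compare_ic(1209, 1207, 1345)
```

With the reported ten-predictor AIC values the rule returns "indistinguishable", in line with the conclusion drawn for the ZINB and HNB models.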

The statsmodels, pandas and matplotlib libraries in Python version 3.11.8 and the pscl (Political Science Computational Laboratory) package in the statistical software R version 4.3.1 were used for model fitting [33] and plotting figures.

Measures of association

Modelling was done using the most significant variables from the cross-tabulation with the chi-square test. The output of each model includes estimated p-values representing the association of a predictor with the dependent variable (schistosomiasis counts) in the presence of the other predictors. We therefore also tabulated infection status, a binary outcome (positive or negative), against each of the chosen predictors individually, to investigate each association separately.
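A cross-tabulation of this kind can be tested as follows. The sketch computes the Pearson chi-square statistic for a 2×2 table without continuity correction and obtains the p-value for one degree of freedom from the complementary error function. The counts are hypothetical and do not come from Table 1.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    table = [[a, b], [c, d]], e.g. rows = area, columns = infection status.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts: [positive, negative] for two groups of 200 children.
stat, p_value = chi_square_2x2([[30, 170], [10, 190]])
```

Predictors with more than two levels need the general chi-square distribution with (rows − 1)(columns − 1) degrees of freedom rather than this one-degree-of-freedom shortcut.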

Ethics statement

Not applicable. This is a secondary analysis of previously collected data. All data was de-identified prior to receipt from the original study investigators.

Results

Descriptive statistics

A large percentage (94.1%) of the count data were zeros (non-infected individuals), with the remainder being positive counts from infected individuals. The median age of participants was 11 years. The mean schistosomiasis egg count, excluding the zeros, was 33.4 eggs per gram (epg), with a standard deviation of 61.8 and an over-dispersion parameter α = 2.1. The distribution of the count data is shown in Fig 1A.

Fig 1. Schistosomiasis egg counts.

A: Histogram showing the distribution of egg counts for schistosomiasis, with sample size 1345. B: Histogram showing the distribution of egg counts for schistosomiasis on a log axis. The divided regions, light (1–99 epg), moderate (100–399 epg) and high (≥400 epg), correspond to the infection intensity count categories.

https://doi.org/10.1371/journal.pone.0304681.g001

Grouping by infection intensity

Light, moderate, and high infection intensity categories are defined as counts of 1–99, 100–399 and ≥400 epg respectively [34]. In the histogram of these counts given in Fig 1B, the distribution is positively skewed and there are a few gaps between counts beyond the 100 epg mark on the horizontal axis (the upper end of the light intensity range). For this reason, we evaluated the model performances on both the low intensity counts together with the zeros (n = 1336, mean = 0.8 epg, variance = 33.3) and the “all intensity” count data (i.e. the whole sample). No observation fell in the high intensity range, so excluding the moderate counts (n = 9, mean = 183.6 epg, variance = 5275) left a low intensity sample whose mean and variance were smaller than the mean (2 epg, including the zeros) and variance (286.7) of the full count data, but which was still over-dispersed (variance > mean), with overdispersion parameter α = 74.0. We fitted the models to the sample without moderate and high intensity values (≥100 epg), using the same predictors, and compared the two levels of intensity. The results, presented in Table 2, show a marked difference in both information criteria between the two levels of intensity: the values for the low intensity sample were lower than those for all intensities by approximately 200, implying a better fit to the low intensity values than to all values.
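The grouping by intensity can be expressed as a small helper that applies the thresholds quoted above and then selects the low intensity subsample (zeros plus light counts). The egg counts in the example are hypothetical.

```python
def intensity_category(epg):
    """Map an egg count (eggs per gram) to an infection intensity category."""
    if epg < 0:
        raise ValueError("egg counts cannot be negative")
    if epg == 0:
        return "negative"
    if epg < 100:
        return "light"
    if epg < 400:
        return "moderate"
    return "high"

# Hypothetical egg counts for illustration.
counts = [0, 0, 0, 12, 99, 100, 250, 0, 7]

# Low intensity subsample: zeros and light counts only.
low_intensity = [c for c in counts
                 if intensity_category(c) in ("negative", "light")]
```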

Table 2. Model’s AIC and BIC fit summary statistics.

https://doi.org/10.1371/journal.pone.0304681.t002

From Table 2, the difference between the ZINB and HNB models in both AIC and BIC for the five-predictor scenario is ≤3; hence their performance is indistinguishable (AIC = 1206, BIC = 1305 for ZINB; AIC = 1203, BIC = 1302 for HNB). A difference ≤3 between ZINB and HNB was also observed using the ten predictors (all intensity), which implies that the ZINB model did not outperform the HNB model under this scenario (AIC = 1209, BIC = 1360 for ZINB; AIC = 1207, BIC = 1358 for HNB). Overall, the NB and its associated models (NB, ZINB, HNB) performed better than the Poisson-based models (Poisson, ZIP, HP), as shown in Fig 2, where there is a large difference between the AIC values of the NB-based models and those of the Poisson-based models.

Fig 2. Comparison of the AIC values of models used.

Comparison of the AIC values between Poisson and NB-based regression models (the standard, ZI and hurdle). This graph displays the AIC values for each model under different scenarios: A shows all intensity (i.e. the full sample, n = 1345) fitted with 10 predictors; B shows the full sample fitted with 5 predictors; C shows the low intensity sample (n = 1336) fitted with 10 predictors; D shows the low intensity sample fitted with 5 predictors. For each sub-figure, the AIC value for each negative binomial-based model is much lower than that of its corresponding Poisson-based model.

https://doi.org/10.1371/journal.pone.0304681.g002

Using different set of predictors

Different sets of variables were used to assess the performance of the models. The first set comprised five predictor variables: age group, sex, educational level, area and parent’s occupation. The second set comprised ten predictor variables: the first five plus the sources of water, namely pipe-borne water, treated water from water tankers, untreated water from water tankers, river/stream and well/borehole. Table 2 shows the performance of the models under each set. The AIC and BIC values for five predictors are greater than those for ten predictors for each model and scenario, except for the HNB model. For the HNB model, using all intensity, the difference between the five-predictor AIC value (1203) and the ten-predictor AIC value (1207) is 4, so the AIC does not clearly separate the two sets of predictors. The difference in their BIC values is 56, favouring the five predictors (which have the lower BIC value, 1302 versus 1358). Moreover, the p-value of the likelihood ratio test between the five- and ten-predictor HNB models is not <0.05, so the ten predictors did not give a significant improvement in fit over the five predictors.
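A likelihood ratio test of nested models, such as the five- versus ten-predictor comparison above, can be sketched with the standard library alone. The chi-square tail probability is built from the recurrence Q(a + 1, z) = Q(a, z) + z^a e^(−z) / Γ(a + 1) for the regularized upper incomplete gamma function. The log-likelihoods and degrees of freedom in the example are illustrative assumptions, not values reported by the authors.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # Q(1/2, z)
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q

def likelihood_ratio_test(loglik_small, loglik_large, df):
    """LR statistic and p-value for a smaller model nested in a larger one."""
    stat = 2.0 * (loglik_large - loglik_small)
    return stat, chi2_sf(stat, df)

# Illustrative log-likelihoods; df = 10 assumes five extra coefficients in
# each of the two model components (zero part and count part).
stat, p_value = likelihood_ratio_test(-582.5, -574.5, df=10)
```

With these illustrative inputs the statistic is 16 on 10 degrees of freedom and the p-value is about 0.10, so the larger model would not be judged a significant improvement at the 0.05 level.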

Interpreting model fitting

To evaluate the fit of regression models, analysts commonly examine the deviations of observations from predicted values. One way to do this is to plot the observed frequencies against, or overlaid on, the predicted (expected) values. The rootogram, as proposed by Kleiber and Zeileis [35], is a relatively new way to assess count models: it presents the square roots of the predicted values as a continuous curve overlaid on the bars of the observed counts. Fig 3 shows the rootogram for the frequencies of the first 50 egg counts. Fig 4 shows the differences between observed and expected counts, with bars extending above and below the zero line to highlight overestimated and underestimated values respectively, providing information about the model residuals rather than the fitted values shown in Fig 3.
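The quantities behind such plots are straightforward to tabulate: for each count value, the observed frequency, the expected frequency under the fitted model, and the difference of their square roots. The sketch below does this for an intercept-only Poisson fit to hypothetical data; it illustrates the idea only and is not the plotting code used for Figs 3 and 4.

```python
import math
from collections import Counter

def poisson_pmf(y, lam):
    """Poisson probability mass function."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def rootogram_table(counts, pmf, max_count):
    """Rows of (count, observed, expected, sqrt(observed) - sqrt(expected))."""
    n = len(counts)
    observed = Counter(counts)
    rows = []
    for y in range(max_count + 1):
        obs = observed.get(y, 0)
        exp = n * pmf(y)
        rows.append((y, obs, exp, math.sqrt(obs) - math.sqrt(exp)))
    return rows

# Hypothetical zero-heavy counts; lam is the sample mean (the Poisson MLE).
counts = [0] * 90 + [1] * 5 + [2] * 3 + [5] * 2
lam = sum(counts) / len(counts)
rows = rootogram_table(counts, lambda y: poisson_pmf(y, lam), 5)
```

In this toy example the Poisson fit expects about 81 zeros against 90 observed and about 17 ones against 5 observed, the same qualitative pattern of too few zeros and too many small positive counts that the standard models show in this study.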

Fig 3. Each model’s predictions overlaid on the observed bar counts.

The hurdle Poisson and hurdle negative binomial both captured 1265 zeros, which is equal to the observed zero count. The zero-inflated Poisson (ZIP) captured 1265 zeros and the zero-inflated negative binomial (ZINB) 1264, but the Poisson and NB models underestimated the zeros (by 513 and 412 zeros respectively) and overestimated the positive counts. The ZIP and HP expected values were lower than the observed values for the first 10 counts, whilst the ZINB and HNB expected values were almost equal to the observed values, differing only by a small margin.

https://doi.org/10.1371/journal.pone.0304681.g003

Fig 4. Alternate plot showing the low and high values of each model’s prediction.

Here, the but not . The standard Poisson and negative binomial regression models did not account for all the zeros observed, falling short of the number of observed zeros by 513 and 412 respectively (low prediction). In contrast, the zero-inflated and hurdle models accounted for all zeros. We also observe an overestimation of the positive counts for the standard Poisson, and both underestimation and overestimation for the other models.

https://doi.org/10.1371/journal.pone.0304681.g004

Figs 3 and 4 highlight that the Poisson model underestimated the zeros (752 zeros captured) and overestimated the positive counts, with a difference of 211 for positive count 1, 90 for count 2, 54 for count 3 and so on; the negative binomial model likewise predicted fewer zeros than observed (853 captured; underestimating) and overestimated the positive counts (positive count 1 overestimated by 183; 2 by 72; 3 by 43, …). Due to overdispersion and zero-inflation, the predictions of the standard Poisson and negative binomial models were too high for low positive values. The zero-inflated and hurdle models’ zero predictions were equal to the number of zeros observed (1265). From Fig 3, apart from the ZINB and HNB models, the ZIP and HP predictions were a bit high for low values as the counts move away from zero but gave a better approximation to the observed frequencies as the counts get larger. S1 Fig presents the bars of the observed frequencies plotted against the predicted ones for the first 15 egg counts, on a log axis.

ZINB and HNB regression model

The parameter estimates for the zero-inflated negative binomial and hurdle negative binomial models, fitted to the full sample with ten predictors, are shown in Table 3. From the results shown in Table 3, it can be seen that, in the zero (negative) component, the variables area, pipe-borne water, untreated water from tankers and well/borehole for the ZINB model, and age groups <10 and 10–15, area, pre-school (educational level) and untreated water from tankers for the HNB model, have a significant effect, as their p-values (Pr(>|Z|)) are <0.05. Also, three variables (area, pipe-borne water and water from well/borehole) were significantly associated with the positive counts, with p-values 0.017, 0.015 and 0.009 respectively, for the ZINB model. Four variables (area, pipe-borne water, untreated water from tankers and well/borehole) were significantly associated with the positive counts, with p-values 0.009, 0.021, 0.005 and 0.002 respectively, for the HNB model. Their respective standard errors are also reported.

Table 3. Zero hurdle model coefficients (binomial with logit link) and count model coefficients (truncated negbin with log link) with 95% CI.

https://doi.org/10.1371/journal.pone.0304681.t003

Modelling and interpreting main effects

Table 4 shows the odds ratios (ORs) obtained from the logistic portion of the HNB model. Some confidence intervals for the ORs were wide. From Table 4, people who live in urban areas have a lower risk of schistosomiasis infection than people who live in rural areas. Children aged 10–15 (OR = 1.88, 95% CI: (0.56, 6.32)) have a higher estimated risk of infection than children older than 15. Also, children who use untreated water delivered by water tankers (OR = 2.33, 95% CI: (1.00, 4.97)) have a greater risk of harboring schistosomiasis eggs than those who use water from other sources.
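Odds ratios of this kind are obtained by exponentiating the logit coefficients, and a Wald-type 95% confidence interval by exponentiating the coefficient plus or minus 1.96 standard errors. The sketch below shows the arithmetic; the coefficient and standard error are hypothetical, and whether the intervals in Table 4 were computed in exactly this way is an assumption.

```python
import math

def odds_ratio_ci(coef, se, z=1.959964):
    """Odds ratio and Wald confidence interval from a logit coefficient."""
    return (math.exp(coef),
            math.exp(coef - z * se),
            math.exp(coef + z * se))

# Hypothetical coefficient and standard error (not taken from Table 3).
or_estimate, ci_low, ci_high = odds_ratio_ci(0.85, 0.40)

# An interval that contains 1 means the association is not significant
# at the 5% level.
significant = not (ci_low <= 1.0 <= ci_high)
```

Because the interval is symmetric on the log scale, it is asymmetric around the odds ratio itself, which is one reason intervals for large ORs can look wide.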

Table 4. Zero hurdle model coefficients (binomial with logit link) and count model coefficients (truncated negbin with log link) with 95% CI.

https://doi.org/10.1371/journal.pone.0304681.t004

Discussion

Schistosomiasis, an ailment caused by blood flukes of the Schistosoma genus, is a waterborne helminthic disease [36]. It predominantly affects underprivileged rural communities residing in tropical and subtropical areas, where access to safe drinking water and proper sanitation is limited. Infection occurs when individuals come into contact with infested natural freshwater bodies while engaging in activities such as fishing, agriculture, laundry, or swimming [37].

For the analysis of schistosomiasis egg count data, six distinct models based on either the Poisson or negative binomial distribution were considered. Various scenarios were used to fit the models; however, the discussion focuses on the results when ten predictors and all the sample data were used. The count data were overdispersed and zero-inflated. The standard Poisson and NB models performed poorly when fitted to the count data, since both recorded higher information criterion values in Table 2 than the zero-inflated (ZI) and hurdle models. Among the ZI and hurdle models, the AIC and BIC fit statistics, together with the model predictions shown in Fig 3, indicated that the ZINB (AIC = 1209, BIC = 1360) and HNB (AIC = 1207, BIC = 1358) models best fitted the data. Table 5 presents the number of zeros captured by each model when five and ten predictors were used with all the sample data. With five predictors, both the standard Poisson (717 zeros captured) and NB (801 zeros captured) failed to capture the number of observed zeros (1265), with a difference of 548 and 464 respectively. With the additional predictors, the Poisson (752 zeros captured) and NB (853 zeros captured) still did not capture the number of observed zeros, with a difference of 513 and 412 respectively. On the other hand, the ZI and hurdle models effectively accounted for all 1265 observed zeros when five predictors were employed. With the additional predictors, all ZI and hurdle models successfully captured all zeros, except for the ZINB model, which captured 1264 zeros, a difference of 1. This implies that there is no need to introduce additional parameters to capture the number of observed zeros in the ZI and hurdle models. These findings emphasize the advantages of using ZI and hurdle models over the standard Poisson and NB models for analyzing zero-inflated count data. Moreover, the likelihood ratio test comparing the use of five and ten predictors for the ZINB model (p-value = 0.0711) and HNB model (p-value = 0.0969) does not show significance.

Table 5. Model’s zero capturing.

https://doi.org/10.1371/journal.pone.0304681.t005

A review study [36] showed that the southern regions of Ghana have a mean schistosomiasis prevalence of 25.8%, with a range of 3.3% to 83.9%. In this study, the infection prevalence among participants was 5.9%. We assessed the infection status and intensity in relation to factors such as age, sex, educational level, area of residence, parent’s occupation and water sources. We found that the factors linked to schistosomiasis infection include age (10–15 years), area of residence, pre-school education level, and the use of untreated water distributed by tankers. Furthermore, the factors associated with the intensity of schistosomiasis infection (positive counts) include place of residence and water sources such as untreated water from tankers, piped water, and well/borehole water. This confirms that a high risk of infection is associated with specific areas and water sources [38–40]. Some studies [41, 42] have shown that infection among children is associated with parent’s occupation, but the occupation of parents did not show any significant influence on schistosomiasis egg counts in this study. Control strategies are needed to monitor the water sources, educate adults in rural areas on schistosomiasis infection, and guide parents in preventing their preschool children from accessing contaminated water. The results also imply that safe drinking water and well-treated water must be made available in endemic areas. An integrated approach to control that includes sanitation improvement, health education, and less human-water interaction is required for success [43].

Conclusion

The zero-inflated negative binomial (ZINB) and hurdle negative binomial (HNB) models, which are frequently employed to model zero-inflated count data with under-dispersion and over-dispersion, provided the most satisfactory fit among all the models used. These models were used to identify key factors affecting schistosomiasis egg counts in children. Factors such as age (10–15 years), residence in rural areas, pre-school educational level, and reliance on water sources such as well/borehole water, untreated water from tankers, and piped water were found to be associated with the numbers of positive egg counts. Based on the information-theoretic criteria used, the hurdle negative binomial model marginally outperformed the zero-inflated negative binomial model when the set of ten predictors was used. Moreover, all zero-inflated models accommodated the excess zeros and are therefore recommended for modelling count data with excess zeros. Extending the modelling approaches to geospatial analysis, to predict areas of high transmission and so inform control strategies, is recommended.

Supporting information

S1 Fig. A bar relationship between observed and model’s expected frequencies (in log axis).

https://doi.org/10.1371/journal.pone.0304681.s001

(TIF)

S1 Appendix. Data and code availability.

The data, R and python codes for model comparison and analysis can be found at https://github.com/kojonketia/Using-hurdle-models-to-analyze-schistosomiasis-count-data.

https://doi.org/10.1371/journal.pone.0304681.s002

(DOCX)

References

1. Sturrock RF. The Schistosomes and their intermediate hosts. In: Tropical medicine. 2001. p. 7–83. Available from: https://doi.org/10.1142/9781848161511_0002
2. Ahmed SH. Schistosomiasis (Bilharzia): Background, Pathophysiology, Etiology. eMedicine. 2023 Mar 23; Available from: https://emedicine.medscape.com/article/228392-overview?form=fpf
3. Caffrey CR. Schistosomiasis and its treatment. Future Med Chem. 2015;7(6):675–676. pmid:25996057
- View Article
- PubMed/NCBI
- Google Scholar
4. Bosompem KM, Bentum IA, Otchere J, Anyan WK, Brown CA, Osada Y, et al. Infant schistosomiasis in Ghana: a survey in an irrigation community. Trop Med Int Health. 2004 Aug;9(8):917–22. pmid:15303998
- View Article
- PubMed/NCBI
- Google Scholar
5. Sumbele IUN, Tabi DB, Teh RN, Njunda AL. Urogenital schistosomiasis burden in school-aged children in Tiko, Cameroon: a cross-sectional study on prevalence, intensity, knowledge and risk factors. Trop Med Health. 2021 Sep 16;49(1):75. pmid:34530935
- View Article
- PubMed/NCBI
- Google Scholar
6. Adenowo AF, Oyinloye BE, Ogunyinka BI, Kappo AP. Impact of human schistosomiasis in sub-Saharan Africa. Braz J Infect Dis. 2015 Mar-Apr;19(2):196–205. Epub 2015 Jan 27. pmid:25636189
- View Article
- PubMed/NCBI
- Google Scholar
7. Grimes JE, Croll D, Harrison WE, Utzinger J, Freeman MC, Templeton MR. The roles of water, sanitation and hygiene in reducing schistosomiasis: a review. Parasit Vectors. 2015 Mar 13;8:156. pmid:25884172
- View Article
- PubMed/NCBI
- Google Scholar
8. Chipeta MG, Ngwira B, Kazembe LN. Analysis of Schistosomiasis haematobium infection prevalence and intensity in Chikhwawa, Malawi: an application of a two part model. PLoS Negl Trop Dis. 2013;7(3):e2131. Epub 2013 Mar 21. pmid:23556017
- View Article
- PubMed/NCBI
- Google Scholar
9. Khan A, Ullah S, Nitz J. Statistical modelling of falls count data with excess zeros. Inj Prev. 2011 Aug;17(4):266–70. Epub 2011 Jun 8. pmid:21653652
- View Article
- PubMed/NCBI
- Google Scholar
10. Bhaktha N. Properties of Hurdle Negative Binomial Models for Zero-Inflated and Overdispersed Count data. Doctoral dissertation, Ohio State University. 2018. Available: http://rave.ohiolink.edu/etdc/view?acc_num=osu1543573678017356.
11. Cameron AC, Trivedi PK. Model Specification and Estimation. In: Regression Analysis of Count Data. Cambridge: Cambridge University Press; 2013. p. 21–68. (Econometric Society Monographs).
12. Mohammed. Zero-Inflated Models. Otago University. Available: https://www.otago.ac.nz/ripe/otago301201.pdf.
13. Heilbron DC. Zero‐Altered and other Regression Models for Count Data with Added Zeros. Biometrical Journal [Internet]. 1994 Jan 1;36(5):531–47. Available from: https://doi.org/10.1002/bimj.4710360505.
- View Article
- Google Scholar
14. Perumean-Chaney SE, Morgan C, McDowall D, Aban I. Zero-inflated and overdispersed: what’s one to do? Statistical Computation and Simulation/Journal of Statistical Computation and Simulation [Internet]. 2013 Sep 1;83(9):1671–83. Available from: https://doi.org/10.1080/00949655.2012.668550
- View Article
- Google Scholar
15. Lambert D. Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing. Technometrics. 1992 Feb 1;34(1):1. Available from: https://doi.org/10.2307/1269547.
- View Article
- Google Scholar
16. Hu MC, Pavlicova M, Nunes EV. Zero-inflated and hurdle models of count data with extra zeros: examples from an HIV-risk reduction intervention trial. Am J Drug Alcohol Abuse. 2011 Sep;37(5):367–75. pmid:21854279
- View Article
- PubMed/NCBI
- Google Scholar
17. Feng CX. A comparison of zero-inflated and hurdle models for modeling zero-inflated count data. J Stat Distrib Appl. 2021;8(1):8. Epub 2021 Jun 24. pmid:34760432
- View Article
- PubMed/NCBI
- Google Scholar
18. Obeng BB, Aryeetey YA, de Dood CJ, Amoah AS, Larbi IA, Deelder AM, et al. Application of a circulating-cathodic-antigen (CCA) strip test and real-time PCR, in comparison with microscopy, for the detection of Schistosoma haematobium in urine samples from Ghana. Ann Trop Med Parasitol. 2008 Oct;102(7):625–33. pmid:18817603
- View Article
- PubMed/NCBI
- Google Scholar
19. Aryeetey YA, Essien-Baidoo S, Larbi IA, Ahmed K, Amoah AS, Obeng BB, et al. Molecular diagnosis of Schistosoma infections in urine samples of school children in Ghana. Am J Trop Med Hyg. 2013 Jun;88(6):1028–31. Epub 2013 Mar 25. pmid:23530072
- View Article
- PubMed/NCBI
- Google Scholar
20. Dassah S, Asiamah GK, Harun V, Appiah-Kubi K, Oduro A, Asoala V, et al. Urogenital schistosomiasis transmission, malaria and anemia among school-age children in Northern Ghana. Heliyon. 2022 Sep 2;8(9):e10440. pmid:36119865
- View Article
- PubMed/NCBI
- Google Scholar
21. Angora EK, Boissier J, Menan H, Rey O, Tuo K, Touré AO, et al. Prevalence and Risk Factors for Schistosomiasis among Schoolchildren in two Settings of Côte d’Ivoire. Trop Med Infect Dis. 2019 Jul 23;4(3):110. pmid:31340504
- View Article
- PubMed/NCBI
- Google Scholar
22. Nelwan M. Risk factors of schistosomiasis. Social Science Research Network [Internet]. 2020 Jan 1; Available from: https://doi.org/10.2139/ssrn.3722691.
- View Article
- Google Scholar
23. Agresti A. Categorical data analysis. Wiley series in probability and statistics. 2002. Available from: https://doi.org/10.1002/0471249688
- View Article
- Google Scholar
24. Liaqat M, Kamal S, Fischer F, Zia N. Zero-inflated and hurdle models with an application to the number of involved axillary lymph nodes in primary breast cancer. Journal of King Saud University—Science. 2022;34(4):101932.
- View Article
- Google Scholar
25. Baetschmann G, Winkelmann R. A dynamic hurdle model for zero-inflated count data: with an application to health care utilization. Social Science Research Network. 2014;(151).
- View Article
- Google Scholar
26. Ridout M, Demétrio CGB, Hinde J. Models for count data with many zeros. International Biometric Conference. Cape Town. 13. 1–13; 1998.
27. Vuong QH. Likelihood ratio tests for model selection and Non-Nested hypotheses. Econometrica [Internet]. 1989 Mar 1;57(2):307. Available from: https://doi.org/10.2307/1912557
- View Article
- Google Scholar
28. Miaou SP. The relationship between truck accidents and geometric design of road sections: Poisson versus negative binomial regressions. Accident Analysis and Prevention. 1994 Aug 1;26(4):471–82. Available from: https://doi.org/10.1016/0001-4575(94)90038-8 pmid:7916855
- View Article
- PubMed/NCBI
- Google Scholar
29. Brewer MJ, Butler A, Cooksley SL. The relative performance of AIC, AICC and BIC in the presence of unobserved heterogeneity. Methods in Ecology and Evolution [Internet]. 2016 Jun 1;7(6):679–92. Available from: https://doi.org/10.1111/2041-210x.12541
- View Article
- Google Scholar
30. Hilbe JM. Modeling count data. 1st ed., Cambridge University Press, 2014.
31. Zhang J, Yang Y, Ding J. Information criteria for model selection. Wiley Interdisciplinary Reviews Computational Statistics. 2023 Feb 20;15(5). Available from: https://doi.org/10.1002/wics.1607.
- View Article
- Google Scholar
32. Aswi A, Astuti SA, Sudarmin S. Evaluating the performance of Zero-Inflated and Hurdle Poisson models for modeling overdispersion in count data. Inferensi: Jurnal Statistika. 2022;5(1):17.
- View Article
- Google Scholar
33. Zeileis A, Kleiber C, Jackman S. Regression Models for Count Data in R. J. Stat. Soft. [Internet]. 2008 Jul. 29 [cited 2024 Mar. 26];27(8):1–25. Available from: https://www.jstatsoft.org/index.php/jss/article/view/v027i08.
- View Article
- Google Scholar
34. Aemero M, Berhe N, Erko B. Status of Schistosoma mansoni prevalence and intensity of infection in geographically apart endemic localities of Ethiopia: a comparison. Ethiop J Health Sci. 2014 Jul;24(3):189–94. pmid:25183924
- View Article
- PubMed/NCBI
- Google Scholar
35. Kleiber C, Zeileis A. Visualizing count data regressions using rootograms. The American Statistician. 2016;70(3):296–303.
- View Article
- Google Scholar
36. Boateng EM, Dvorak J, Ayi I, Chanova M. A literature review of schistosomiasis in Ghana: a reference for bridging the research and control gap. Trans R Soc Trop Med Hyg. 2023 Jun 2;117(6):407–417. pmid:36688317
- View Article
- PubMed/NCBI
- Google Scholar
37. World Health Organization: WHO. WHO launches new guideline for the control and elimination of human schistosomiasis. WHO [Internet]. 2022 Feb 22; Available from: https://www.who.int/news/item/22-02-2022-who-launches-new-guideline-for-the-control-and-elimination-of-human-schistosomiasis.
38. World Health Organization: WHO. Schistosomiasis. WHO [Internet]. 2023. Available from: https://www.who.int/news-room/fact-sheets/detail/schistosomiasis.
39. Reitzug F, Ledien J, Chami GF. Associations of water contact frequency, duration, and activities with schistosome infection risk: A systematic review and meta-analysis. PLoS Negl Trop Dis. 2023;17(6):e0011377. Published 2023 Jun 14. pmid:37315020
- View Article
- PubMed/NCBI
- Google Scholar
40. Hajissa K, Muhajir AEMA, Eshag HA, et al. Prevalence of schistosomiasis and associated risk factors among school children in Um-Asher Area, Khartoum, Sudan. BMC Res Notes. 2018;11(1):779. Published 2018 Oct 31. pmid:30382901
- View Article
- PubMed/NCBI
- Google Scholar
41. Abubakar BM, Abubakar A, Moi IM, et al. Urinary schistosomiasis and associated risk factors among primary school students in the Zaki Local Government area, Bauchi State, Nigeria. Dr Sulaiman Al Habib Medical Journal. 2022;4(4):196–204.
- View Article
- Google Scholar
42. Jin Y, Cha S, Kim Y, et al. Association Between the Prevalence of Schistosomiasis in Elementary School Students and Their Parental Occupation in Sudan. Korean J Parasitol. 2022;60(1):51–56. pmid:35247955
- View Article
- PubMed/NCBI
- Google Scholar
43. Sokolow S. The History of Schistosomiasis in Ghana. Stanford University. Available: https://schisto.stanford.edu/pdf/Ghana.pdf.
