The authors have declared that no competing interests exist.
Conceived and designed the experiments: RE VC NF OB. Analyzed the data: OB RE. Wrote the paper: OB RE JI NM VC. Collected the data: AD NM SBB. Revised the manuscript: RE NF VC NM JI AD OB.
Vector control is a major step in the process of malaria control and elimination. It requires vector counts and appropriate statistical analyses of these counts. However, vector counts are often overdispersed. A nonparametric mixture of Poisson model (NPMP) is proposed to allow for overdispersion and better describe vector distribution. Mosquito collections using Human Landing Catches (HLC), as well as collection of environmental and climatic data, were carried out from January to December 2009 in 28 villages in Southern Benin. An NPMP regression model with “village” as a random effect was used to test statistical correlations between malaria vector density and environmental and climatic factors. Furthermore, the villages were ranked using the latent classes derived from the NPMP model. Based on this classification of the villages, the impacts of four vector control strategies implemented in the villages were compared. Vector counts were highly variable and overdispersed, with a large proportion of zeros (75%). The NPMP model predicted the observed values well and showed that: i) proximity to a freshwater body, market gardening, and high levels of rain were associated with high vector density; ii) water conveyance, cattle breeding, and vegetation index were associated with low vector density. The 28 villages could then be ranked according to the mean vector number as estimated by the random part of the model after adjustment for all covariates. The NPMP model made it possible to describe the distribution of the vector across the study area. The villages were ranked according to the mean vector density after taking into account the most important covariates. This study demonstrates the necessity and possibility of adapting methods of vector counting and sampling to each setting.
Malaria is still a major public health issue in Sub-Saharan Africa. In 2010, this region bore 91% of the global death burden of the disease, estimated at 655,000 deaths.
The most common indicator for evaluating vector control interventions such as long-lasting insecticidal nets (LLIN) and indoor residual spraying (IRS) is malaria transmission, estimated through the Entomological Inoculation Rate (EIR). EIR is the product of the Human Biting Rate (HBR; number of bites of malaria vectors per human per unit time) and the prevalence of
In 28 villages of Southern Benin, a recent cluster randomized controlled trial (RCT) comparing the efficacy of combined LLIN and carbamate IRS or carbamate-treated plastic sheeting (CTPS) with a background of LLIN coverage did not show benefits of the combination for reducing HBR and EIR.
The oldest and most popular statistical distribution used to describe count data is the Poisson distribution, which assumes equidispersion of the counts (variance equal to the mean). However, in real datasets, counts are often overdispersed.
To deal with such overdispersed data with excess zeros, Johnson and Kotz
Besides these well-known models, other finite mixture distribution models have been proposed (e.g., McLachlan and Peel
In the present work, we assessed the ability of the Poisson, NB, ZIP, ZINB and NPMP models to fit the distribution of counts of malaria vectors measured in 28 villages in southern Benin where a clinical trial was implemented to evaluate the efficacy of vector control interventions for malaria prevention.
The data analyzed in the present study stem from mosquito collections carried out every 6 weeks between January and December 2009 (i.e. 8 surveys) in 28 villages of the sanitary region of Ouidah-Kpomassè-Tori (OKT) in South Benin. Of the 58 villages screened at baseline, 28 were enrolled. The other villages were excluded because they did not fulfill the inclusion criteria, i.e. distance between two villages >2 km, population size between 250 and 500 inhabitants with non-isolated habitations, and absence of any local health care centre.
Entomological surveys were performed using the HLC technique on two successive nights (22:00 to 06:00) at four sites (both indoor and outdoor) per village. Collectors were rotated hourly among collection sites and/or positions (indoor/outdoor). Malaria vectors collected on humans were identified using morphological keys.
These villages were divided into four groups (seven villages per group) where four different vector control measures were implemented (see Corbel et al.
The IRD (Institut de Recherche pour le Développement) Ethics Committee and the National Research Ethics Committee of Benin approved the study (CNPERS, reference number IRB00006860). The study was also registered with Current Controlled Trials, number ISRCTN07404145. All necessary permits were obtained for the described field studies. No mosquito collection was done without the approval of the village chief and of the owner and occupants of the collection house. Mosquito collectors gave their written informed consent and were treated free of charge for presumed malaria illness throughout the study.
The following data were collected: the average distance (in km) from each village to the nearest freshwater body (Toho lake), the presence of market gardening within 2 km of each village, the presence of cattle farms inside the village, the presence of water conveyance in the village, and the population density. The layout (or structure) of each village was described by the distribution of its clusters of houses, these clusters being separated by vegetated strips. Two modalities were then considered: single-cluster vs. multi-cluster villages. Daily rainfall data from 8 weather stations were spatially interpolated to compute the cumulated rainfall (in mm) and the number of rainy days in each village during the 15 days preceding each survey. The Normalized Difference Vegetation Index (NDVI) was derived from a “Satellite pour l'Observation de la Terre (SPOT5)” satellite image acquired on 28 December 2003. The mean NDVI was computed in a buffer area of 50 m diameter around each mosquito collection site (house).
The mean-variance relationship of the number of collected malaria vectors was analyzed graphically to explore data dispersion. A linear relationship of slope 1 (variances equal to means) indicated a Poisson distribution without overdispersion, whereas a linear relationship with slope >1 or a quadratic relationship indicated overdispersion. We also assessed the “excess of zeros” through a graphical representation of the distribution of vector counts.
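The graphical check described above can be mirrored numerically. The following is a minimal sketch, assuming hypothetical nightly counts (not the study data) and helper names introduced here for illustration only: it computes the mean, the sample variance and their ratio for each group, then the least-squares slope of variance on mean forced through the origin.

```python
from statistics import mean, variance


def dispersion_summary(groups):
    """Return (mean, variance, variance/mean) for each group of counts."""
    summary = {}
    for name, counts in groups.items():
        m = mean(counts)
        v = variance(counts)  # sample variance (n - 1 denominator)
        summary[name] = (m, v, v / m if m > 0 else float("nan"))
    return summary


def slope_through_origin(pairs):
    """Least-squares slope of variance on mean, forced through the origin."""
    numerator = sum(m * v for m, v in pairs)
    denominator = sum(m * m for m, _ in pairs)
    return numerator / denominator


# Hypothetical nightly counts for three villages (illustration only).
counts_by_village = {
    "village_a": [0, 0, 0, 1, 5, 0, 0, 2],
    "village_b": [0, 0, 0, 0, 0, 1, 0, 0],
    "village_c": [3, 0, 12, 0, 0, 7, 1, 0],
}

summary = dispersion_summary(counts_by_village)
slope = slope_through_origin([(m, v) for m, v, _ in summary.values()])

for name, (m, v, ratio) in summary.items():
    print(f"{name}: mean={m:.3f} variance={v:.3f} variance/mean={ratio:.2f}")
print(f"slope of variance on mean: {slope:.2f}")
```

A ratio or slope close to 1 is compatible with a Poisson distribution; values clearly above 1 point to overdispersion.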
The fits of the Poisson, ZIP, NB, ZINB and NPMP distributions to the number of collected malaria vectors were compared using maximum likelihood (ML) estimation. Poisson, ZIP, NB and ZINB models were fitted using the function
Given the hierarchical structure of the data collection system, another NPMP model was considered to allow for various components of the variance of the counts. In this model, the counts of malaria vectors were assessed according to environmental and climatic covariates with the “village” as a random effect. It is thus a “conditional” model (conditional on “village”). This model includes the following covariates: average distance to Toho lake (in km); water conveyance (0 = absence, 1 = presence); market gardening (0 = absence, 1 = presence); cattle farms inside the village (0 = absence, 1 = presence); the layout of the village (0 = multi-cluster, 1 = single-cluster); population density (inhabitants per 100 m²); the mean cumulated rainfall over the 8 surveys (in mm) and the deviation from this mean at each survey; the mean cumulated number of rainy days over the 8 surveys and the deviation from this mean at each survey; the averaged NDVI over the 4 collection houses per village and the deviation from this average for each house; and, finally, the specific collection site (0 = inside of the house, 1 = outside of the house).
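The split of a time-varying covariate into a village-level mean and a survey-level deviation can be sketched as follows. This is an illustration under stated assumptions: the rainfall values are hypothetical, and the function names `split_mean_and_deviation` and `center` are introduced here and are not taken from the original analysis.

```python
from statistics import mean


def split_mean_and_deviation(values_by_village):
    """Split each village's survey values into (village mean, per-survey deviations)."""
    parts = {}
    for village, values in values_by_village.items():
        village_mean = mean(values)
        parts[village] = (village_mean, [v - village_mean for v in values])
    return parts


def center(values):
    """Center a list of values on its overall mean."""
    grand_mean = mean(values)
    return [v - grand_mean for v in values]


# Hypothetical cumulated rainfall (mm) over 8 surveys (illustration only).
rain_mm = {
    "village_a": [0.0, 12.5, 40.0, 85.0, 60.5, 20.0, 5.0, 1.0],
    "village_b": [2.0, 18.0, 55.5, 90.0, 72.0, 31.5, 9.0, 0.0],
    "village_c": [0.0, 8.0, 30.0, 66.5, 48.0, 15.5, 3.0, 0.0],
}

parts = split_mean_and_deviation(rain_mm)
villages = list(parts)
# Village-level means, centered across villages before entering a model.
centered_means = dict(zip(villages, center([parts[v][0] for v in villages])))

for v in villages:
    village_mean, deviations = parts[v]
    print(f"{v}: mean={village_mean:.2f} centered={centered_means[v]:+.2f} "
          f"first deviation={deviations[0]:+.2f}")
```

With this coding, the coefficient of the village mean describes differences between villages, while the coefficient of the deviation describes changes from one survey to the next within a village.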
According to the current recommendation for the use of hierarchical models, each covariate was centered on its mean before introduction into the model.
In the NPMP conditional model, the number of malaria vectors
Function
For each village, the a posteriori probability of belonging to each class after adjustment for all the covariates is estimated by the NPMP conditional model. Here, “a posteriori probability” means the conditional probability for a village to belong to a given class, given the data. For a village
Hence, each village is assigned to one of the classes based on the maximum of the a posteriori probabilities (MAP). This provides a classification of the villages according to the average number of malaria vectors collected at a given site over a given night after adjustment for all the covariates.
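The MAP assignment can be illustrated with a simplified sketch. It assumes a Poisson mixture without covariates, with class means and class proportions treated as known; the numerical values, the counts and the function names below are hypothetical and serve only to show Bayes' rule applied to one village's counts.

```python
import math


def poisson_logpmf(y, lam):
    """Log-probability of count y under a Poisson distribution with mean lam > 0."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def class_posteriors(counts, class_means, class_props):
    """A posteriori probability of each latent class given one village's counts."""
    log_joint = [
        math.log(prop) + sum(poisson_logpmf(y, lam) for y in counts)
        for lam, prop in zip(class_means, class_props)
    ]
    top = max(log_joint)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


def map_class(posteriors):
    """1-based index of the class with the largest a posteriori probability."""
    return max(range(len(posteriors)), key=posteriors.__getitem__) + 1


# Hypothetical class means (vectors per site-night) and class proportions.
class_means = [0.05, 0.15, 0.30, 0.70]
class_props = [0.05, 0.20, 0.45, 0.30]

quiet_village = [0] * 64           # no vector caught in 64 site-nights
busy_village = [2, 1, 0, 3] * 16   # about 1.5 vectors per site-night

for name, counts in (("quiet", quiet_village), ("busy", busy_village)):
    post = class_posteriors(counts, class_means, class_props)
    print(name, "MAP class:", map_class(post), "probability:", round(max(post), 3))
```

In the conditional model, the class-specific mean of each count would additionally depend on the covariates; the sketch leaves that adjustment out.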
To assess the impact of the TLLIN, ULLIN, ULLIN+CTPS and TLLIN+IRS vector control strategies, the village grouping for implementation of these strategies and the classification resulting from the NPMP conditional model were compared using a Kruskal-Wallis test.
The relevance of an NPMP model, also called a Poisson latent class model, be it marginal or conditional, depends jointly on its ability to provide a distribution close to that of the observed counts and on its ability to assign each count to one of the classes. Essentially, two criteria contributed to the choice of the number of classes: the closeness of the predicted values to the observed ones, which is the deviance expressed in the form of a Bayesian Information Criterion
A total of 2,994 malaria vectors were collected during 3,584 human-nights of mosquito collection. This corresponded to an average HBR of 0.835 bites per human per night. Among these vectors, 1,872 belonged to the
Village, survey, and village-survey mean numbers of malaria vectors collected per night on humans were plotted against their corresponding variances in
Panel D shows a bar diagram of the distribution of mosquito counts at each collection site (the scale of the X-axis was limited to 14). On panels A, B and C, the dotted lines represent a linear link between the means and the variance (
The frequency plot of the collected malaria vectors is shown in
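| Distribution | Mean (SE) | Proportion (SE) | Dispersion parameter | −2logL |
|---|---|---|---|---|
| Standard Poisson | 0.835 (0.015) | 1 (-) | | 13492.470 |
| Zero-inflated Poisson (ZIP) | | | | 9229.370 |
| ZIP, component 1 | 0 (-) | 0.736 (0.008) | | |
| ZIP, component 2 | 3.169 (0.062) | 0.264 (0.008) | | |
| Poisson mixture model with 4 latent classes (NPMP) | | | | 7591.700 |
| NPMP, class 1 | 0 (7×10^{−6}) | 0.630 (-) | | |
| NPMP, class 2 | 0.923 (0.029) | 0.296 (-) | | |
| NPMP, class 3 | 6.555 (0.161) | 0.070 (-) | | |
| NPMP, class 4 | 24.480 (1.281) | 0.004 (-) | | |
| Negative Binomial (NB) | 0.835 (0.038) | 1 (-) | 0.156 (0.007) | 7581.856 |
| Zero-inflated negative binomial (ZINB) | | | | 7581.856 |
| ZINB, component 1 | 0 (-) | 3.6×10^{−6} (1.7×10^{−5}) | | |
| ZINB, component 2 | 0.835 (0.038) | 0.999 (1.7×10^{−5}) | 0.156 (0.007) | |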
−2logL: −2 times the log-likelihood
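The fitted values in the table above can be cross-checked with a few lines of arithmetic. The sketch below uses the reported component means and proportions, together with the reported 2,994 vectors over 3,584 human-nights, to compute the mean and the probability of a zero count implied by each model. The function names are introduced here for illustration, and the calculation is not part of the original analysis.

```python
import math


def mixture_mean(means, props):
    """Overall mean of a finite Poisson mixture."""
    return sum(m * p for m, p in zip(means, props))


def mixture_prob_zero(means, props):
    """Probability of a zero count under a finite Poisson mixture."""
    return sum(p * math.exp(-m) for m, p in zip(means, props))


# Component means and proportions as reported in the table.
models = {
    "Poisson": ([0.835], [1.0]),
    "ZIP": ([0.0, 3.169], [0.736, 0.264]),
    "NPMP": ([0.0, 0.923, 6.555, 24.480], [0.630, 0.296, 0.070, 0.004]),
}

observed_mean = 2994 / 3584  # vectors per human-night

print(f"observed mean: {observed_mean:.3f}")
for name, (means, props) in models.items():
    print(f"{name}: mean={mixture_mean(means, props):.3f} "
          f"P(0)={mixture_prob_zero(means, props):.3f}")
```

All three models reproduce the overall mean closely, but the single Poisson implies only about 43% of zero counts, whereas the ZIP and NPMP models imply about 75%, in line with the observed proportion of zeros.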
There was a significant decrease in the model deviance when a ZIP model was used instead of the standard Poisson model and also when the NPMP was used instead of the ZIP model (
| Level and covariate | Relative Risk (95% CI) |
|---|---|
| Distance to a freshwater body (per additional km) | 0.885 (0.871–0.899) |
| Presence of water conveyance (Yes vs. No) | 0.411 (0.348–0.485) |
| Presence of market gardening (Yes vs. No) | 1.146 (1.016–1.292) |
| Presence of cattle (Yes vs. No) | 0.817 (0.700–0.954) |
| Layout of the village (single vs. multi-cluster) | 0.466 (0.377–0.574) |
| Population density (per additional inhabitant/100 m²) | 1.335 (1.079–1.651) |
| Mean rain quantity over all surveys (per additional mm) | 1.325 (1.292–1.359) |
| Mean number of rainy days over all surveys (per additional day) | 2.148 (1.675–2.754) |
| Mean NDVI (per additional grade) | 0.849 (0.827–0.872) |
| Deviation from the mean NDVI of the village (per additional grade) | 0.990 (0.978–1.003) |
| Collection site (outside vs. inside) | 1.182 (1.100–1.270) |
| Deviation | 0.993 (0.989–0.997) |
| Deviation | 0.902 (0.827–0.984) |
Difference between the mean value over all surveys and the value at a given survey
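| Village | Latent class | Mean number of mosquitoes | Proportion of villages | MAP |
|---|---|---|---|---|
| Hekandji | 1 | 0.050 | 0.036 | 0.992 |
| Aidjedo | 2 | 0.137 | 0.218 | 0.997 |
| Assogbenou | 2 | 0.137 | 0.218 | 1 |
| Ayidohoue | 2 | 0.137 | 0.218 | 0.990 |
| Dokanmey | 2 | 0.137 | 0.218 | 0.998 |
| Hounkponouhoue | 2 | 0.137 | 0.218 | 1 |
| Abenihoue | 2 | 0.137 | 0.218 | 1 |
| Adjame | 3 | 0.324 | 0.466 | 1 |
| Amoulehoue | 3 | 0.324 | 0.466 | 1 |
| Adjahassa | 3 | 0.324 | 0.466 | 0.924 |
| Kindjitokpa | 3 | 0.324 | 0.466 | 1 |
| Vidjinnagnimon | 3 | 0.324 | 0.466 | 1 |
| Guezohoue | 3 | 0.324 | 0.466 | 1 |
| Hla | 3 | 0.324 | 0.466 | 1 |
| Agokon | 3 | 0.324 | 0.466 | 0.968 |
| Dekponhoue | 3 | 0.324 | 0.466 | 0.998 |
| Lokohoue | 3 | 0.324 | 0.466 | 1 |
| Todo | 3 | 0.324 | 0.466 | 1 |
| Wanho | 3 | 0.324 | 0.466 | 1 |
| Zoume | 3 | 0.324 | 0.466 | 0.994 |
| Agouako | 4 | 0.713 | 0.280 | 0.775 |
| Hinmadou | 4 | 0.713 | 0.280 | 0.925 |
| Manguevie | 4 | 0.713 | 0.280 | 0.925 |
| Satre | 4 | 0.713 | 0.280 | 0.925 |
| Soko | 4 | 0.713 | 0.280 | 0.925 |
| Tanto | 4 | 0.713 | 0.280 | 0.925 |
| Tokoli | 4 | 0.713 | 0.280 | 0.925 |
| Agadon | 4 | 0.713 | 0.280 | 0.925 |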
A Kruskal-Wallis test did not show a significant association between the village classification obtained from the model and the village grouping for vector control strategies (Chi2 = 2.029, p-value = 0.566). Thus, in terms of HBR, no significant difference in impact between the vector control strategies (TLLIN, ULLIN, ULLIN+CTPS and TLLIN+IRS) was shown.
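The reported p-value can be checked from the test statistic, and the test itself can be sketched in a few lines. The example below assumes that the statistic is referred to a chi-square distribution with 3 degrees of freedom (four intervention groups minus one), for which the upper tail has a closed form. The allocation of latent classes to the four arms is hypothetical, since the actual allocation is not listed here, and the function names are introduced for illustration.

```python
import math


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic using mid-ranks and the usual tie correction."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)
    rank_of, tie_term, i = {}, 0.0, 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        ties = j - i
        # Mid-rank: average of the 1-based positions i+1 .. j.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        tie_term += ties**3 - ties
        i = j
    rank_part = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_part - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total**3 - n_total)
    return h / correction


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square distribution with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Reported statistic from the text.
reported_p = chi2_sf_df3(2.029)
print(f"p-value for Chi2 = 2.029 with 3 df: {reported_p:.3f}")

# Hypothetical latent classes of the seven villages in each of the four arms.
arms = [
    [1, 2, 3, 3, 3, 4, 4],
    [2, 2, 3, 3, 3, 4, 4],
    [2, 3, 3, 3, 3, 4, 4],
    [2, 2, 3, 3, 3, 4, 4],
]
h_stat = kruskal_wallis_h(arms)
print(f"H = {h_stat:.3f}, p = {chi2_sf_df3(h_stat):.3f}")
```

The tie correction matters here because the latent class takes only four distinct values across the 28 villages.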
Knowledge of malaria vector density in a given area is often needed for implementing and evaluating vector control interventions. This requires vector counts at several sites of the area and statistical analyses of these counts.
McCullagh and Nelder
Moreover, the numbers of vectors collected during the 8 surveys are assumed to be uncorrelated, although one may speculate about a correlation structure over time. Nevertheless, the correlation between mosquito counts from successive surveys is deemed to be very low because the time span between two successive surveys is 6 weeks whereas the lifespan of the vectors is only 3 to 4 weeks. Studying the correlation between counts from the two nights of the same survey may reveal interesting results.
In southern Benin, both spatial and temporal heterogeneities in vector densities were mentioned by Djènontin et al.
One unexpected finding of the present study was that the NDVI was negatively correlated with the density of malaria vectors. This finding contrasts with several studies that used satellite imagery at a lower resolution
In this work, villages were ranked into four classes of increasing mean malaria vector density but we were not able to find any relationship between this grouping structure and the vector control intervention implemented in the village. This confirms the finding of Corbel et al.
In conclusion, we found that the NPMP model was useful to assess the relationships between vector density and village or environmental characteristics. It might therefore be an efficient tool to compute risk maps of host-vector contact. Moreover, the NPMP model provided a classification of the villages after taking into account some covariates. Such a classification could be used at a pre-study step to improve the study design of mosquito collection and adapt the sampling effort according to the village characteristics, especially in regions with high spatial and temporal heterogeneities of mosquito density, such as the OKT region. Furthermore, the NPMP model could help in the study design of RCTs when stratified sampling is needed. The same model may be adapted and used in other settings for the study of the distribution of vectors of other diseases.
We thank the populations and authorities of the OKT district for their kind support and collaboration. We also thank Pr. Jean-François Etard for his helpful contribution to the conception and design of the present study.