Genomic Footprints of Selective Sweeps from Metabolic Resistance to Pyrethroids in African Malaria Vectors Are Driven by Scale up of Insecticide-Based Vector Control

Insecticide resistance in mosquito populations threatens recent successes in malaria prevention. Elucidating patterns of genetic structure in malaria vectors to predict the speed and direction of the spread of resistance is essential to get ahead of the ‘resistance curve’ and to avert a public health catastrophe. Here, applying a combination of microsatellite analysis, whole genome sequencing and targeted sequencing of a resistance locus, we elucidated the continent-wide population structure of a major African malaria vector, Anopheles funestus. We identified a major selective sweep in a genomic region controlling cytochrome P450-based metabolic resistance conferring high resistance to pyrethroids. This selective sweep occurred since 2002, likely as a direct consequence of scaled up vector control as revealed by whole genome and fine-scale sequencing of pre- and post-intervention populations. Fine-scaled analysis of the pyrethroid resistance locus revealed that a resistance-associated allele of the cytochrome P450 monooxygenase CYP6P9a has swept through southern Africa to near fixation, in contrast to high polymorphism levels before interventions, conferring high levels of pyrethroid resistance linked to control failure. Population structure analysis revealed a barrier to gene flow between southern Africa and other areas, which may prevent or slow the spread of the southern mechanism of pyrethroid resistance to other regions. By identifying a genetic signature of pyrethroid-based interventions, we have demonstrated the intense selective pressure that control interventions exert on mosquito populations. If this level of selection and spread of resistance continues unabated, our ability to control malaria with current interventions will be compromised.


Introduction
Scaling up of malaria prevention and treatment has averted over 660 million cases of malaria since 2000 [1].The vast majority of this reduction has come from mosquito control with pyrethroid insecticide-based interventions, primarily the use of long lasting insecticide-treated bednets (LLINs) and, to a lesser extent, indoor residual spraying (IRS).Resistance to insecticides in major malaria vectors such as Anopheles funestus threatens the continued success of these interventions.Unless resistance is managed, the massive reduction of malaria transmission from scaling up these interventions could be reversed [2].A key prerequisite for resistance management is to understand the evolution of insecticide resistance to predict the speed and direction of spread of resistance and provide vital information to implement successful control strategies [3].There are multiple mechanisms of insecticide resistance, these include changes to insecticide target molecules that render the insecticide unable to bind, behavioural changes leading to the avoidance of insecticide contact, thickening of the insect's cuticle to prevent the insecticide reaching its target and detoxification of the insecticide before it reaches its target (metabolic resistance).Of these, metabolic resistance has the greatest operational significance [3,4], yet it remains unclear how mosquito populations exhibiting these mechanisms respond to insecticide-based interventions including LLINs.Selective sweeps associated with target-site resistance have been assessed in mosquito species [5,6] but no such assessment has been made for metabolic resistance.In the major malaria vector An.funestus, pyrethroid resistance is mainly conferred by metabolic resistance associated with a major quantitative trait locus (QTL) at which two duplicated cytochrome P450 monooxygenases (CYP6P9a and CYP6P9b) are the main resistance genes [7] [8].The predominance of metabolic resistance in An. funestus makes this species very suitable to assess metabolic resistance-based evolutionary responses of mosquitoes to the massive scale up of pyrethroid-based vector control interventions across Africa.
In the context of increasing reports of insecticide resistance in malaria vectors such as An.funestus across Africa [9][10][11][12][13][14][15][16], it is important to determine the relative contributions to this of gene flow and of the autochthonous appearance of insecticide resistance.Previous studies suggest significant genetic structure among An.funestus populations across Africa [17].Whether such differences in genetic structure explain the contrasting insecticide resistance patterns seen in this species [9][10][11][12][13][14][15][16] and could help predict the speed and direction of spread of resistance remains to be determined.
Here, using microsatellite analysis, whole genome sequencing and fine-scale sequencing at a resistance locus, we elucidated the Africa-wide population structure of An. funestus and detected a strong selective sweep occurring at a major cytochrome P450-based pyrethroid resistance locus.Moreover, we demonstrated that this selective sweep is driven by the scale-up of insecticide-based malaria control in Africa, highlighting the risk that if this level of selection and spread of resistance continues unabated, our ability to control malaria with current interventions could be compromised.

1-African-wide microsatellite diversity shows evidence of positive selection on An. funestus chromosome 2R
Analysis of genetic diversity at 11 microsatellites from across the An.funestus genome sampled in six populations from West (Ghana and Benin), Central (Cameroon), East (Uganda) and southern Africa (Malawi and Mozambique) revealed reduced diversity in two microsatellites at the telomeric end of chromosome arm 2R (Fig 1A).These microsatellites; AFUB6, located in a genomic region spanning a pyrethroid resistance QTL (rp1) and FunR, located in the same QTL in the 5' un-translated region of the pyrethroid resistance gene CYP6P9a (Fig 1B ), showed few alleles and low heterozygosity (S1 Table ).
We hypothesised that this low diversity was due to a selective sweep in this QTL region driven by pyrethroid-based vector control interventions.To further localise the putative sweep, we genotyped 5 additional microsatellites upstream and downstream of FunR and AFUB6 on chromosome 2R.This revealed that the reduced diversity was restricted to the two markers within rp1 (S2 Table ).Consistent with our hypothesis, the lowest gene diversity at FunR was seen in the two southern African populations from Malawi and Mozambique (Fig 1A ), which also show the highest levels of pyrethroid resistance [4,18].
Continent-wide population structure of Anopheles funestus.To investigate the population structure of An. funestus throughout Africa, we removed the two microsatellites putatively under selection to analyse the remaining 9 neutral markers.Genetic divergence (F st ) among countries ranged from 2.9% for Ghana-Uganda to 10.8% for Uganda-Mozambique (S2 Table ).Relative divergence among samples from different countries was in broad agreement with their geographical locations, with a major exception: Uganda clustered with Ghana and Cameroon (Fig 1C ; F st 2.9-4.1%)while Benin formed an outlier from the other samples (Fig 1C ; F st 4.9-9.6%).The southern African populations from Malawi and Mozambique were both highly divergent from the other samples (Fig 1C ; F st 6.5-10.8%)though, interestingly, also from each other (6.3%).
The impact of the loci putatively under selection on chromosome 2R was investigated by repeating the analyses on two datasets, one comprising 8 microsatellites on chromosome 2R and one comprising 8 microsatellites on the other chromosomes (S1 Fig) This apparent discontinuity between southern and the rest of Africa is consistent with the contrasting resistance patterns and mechanisms seen among An.funestus populations [17] suggesting possible barriers to gene flow between regions.This is supported for example by the complete absence throughout southern Africa of the DDT 119F resistance allele of the glutathione-S transferase gene GSTe2, which is predominant in West/Central Africa [15,19].
Bayesian analysis of population structure was also undertaken on the 9 putatively neutral loci.The most likely number of clusters estimated from the data was 2, and represents the major divergence between southern Africa and elsewhere (Fig 1D).When analyses were repeated using 8 2R microsatellites and 8 non-2R microsatellites, the most likely number of clusters were 3 and 2, respectively.Analysis of the cluster assignment probabilities (to 3 clusters) for all three datasets showed that Malawi and Mozambique were consistently assigned to their own 'southern' cluster, even with the 2R marker set (S3 Table ).By contrast, the other populations were more variable, and affected by markers on 2R.For instance, using non-2R markers Ghana and Benin were assigned to the same cluster (44% and 41%, respectively), but using 2R markers they were clearly assigned to separate clusters (Ghana cluster 1: 76% and Benin cluster 2: 65%).
Overall, the analyses are consistent in indicating that gene flow is restricted between southern and the rest of Africa.They also suggest that selection on chromosome 2R may be independent among different geographical regions but appears to be common to both southern African An. funestus populations.Africa-wide population structure analysis.A) Gene diversity of 11 microsatellites from throughout the genome, showing a loss of diversity at the telomeric end of chromosome 2R.B) Gene diversity of 8 microsatellites on chromosome 2R (including AFUB6, FunR and FunO) shows that the loss of diversity is restricted to AFUB6 and FunR, the microsatellites located near the pyrethroid resistance QTL rp1.C) Neighbor-Joining tree based on pairwise F st among population samples, estimated using 9 neutrally evolving microsatellites from throughout the genome (excluding AFUB6 and FunR on chromosome 2R, which may be evolving under positive selection).D) Barplot of assignment probabilities of individual genotypes to two ancestral clusters (the most likely number estimated from the data) estimated using Bayesian population structure analysis under an admixture model.Each bar is an individual genotype for 9 neutrally evolving microsatellites from throughout the genome (excluding AFUB6 and FunR).In panel A, points from left to right represent the following microsatellites: FunQ (on chromosome X); AFUB6, FunR, FunO (on 2R); AFUB11, FunL, AFUB10 (on 2L); AFND7, AFND19 (on 3R); FunF and AFUB12 (on 3L).In panel B, points from left to right represent AFUB3, AFND40, AFUB6, FunR, AFND6, AFND30, AFND32 and FunO (all on 2R).), which also produced geographical clustering based on an ML-tree (Fig 2B).The extent of this low polymorphism in Malawi is characterised by the complete absence of polymorphic sites at +61kb from CYP6P9a (BAC95), only 1 polymorphic site at +86kb, and 2 polymorphic sites at +36kb from CYP6P9a.
Analysis of the population from Cameroon shows that the relative diversity remains high across the rp1 genomic fragment although a slight reduction is observed at CYP6P9a compared to other loci.Individual analysis of these loci shows that at -34kb from CYP6P9a (BAC 0) high haplotype diversity is noted.No predominant haplotype is observed with the most common haplotype, "CAM1", having a frequency of 13.1% (5/38) (S2E Fig) .A high proportion of haplotypes were singletons (16 out of 23) even in Malawi (7) and Mozambique (4).Additionally, there was a high number of mutational steps observed between haplotypes from each (>12) or between (>20) country (S2E  ).
Maximum likelihood phylogenetic tree and haplotype network for CYP6P9a.Analysis of the ML tree of CYP6P9a further provided evidences of selection acting on this gene in southern Africa in contrast to other regions.CYP6P9a from Malawi and Mozambique formed a defined clade from the other four populations, dominated by a single haplotype (MAL/ MOZ23-H25), which is the allele recently shown to exhibit the highest catalytic efficiency in metabolising pyrethroids (S3A Fig) [20].Elsewhere, Benin and Ghana did not form a common clade despite their geographical proximity.Benin formed a cluster with Cameroon while Microsatellite genetic diversity pre-and post-intervention.When the microsatellite diversity of pre-and post-intervention samples was compared, the only major variation occurs on the 2R chromosome, with higher genetic diversity in pre-intervention samples (H = 0.45) (Fig 3A).As already seen for the Africa-wide analysis, reduced diversity was detected on the 2R chromosomes around the rp1 notably for the FunR, AFUB6 and AFND6, markers located close to the pyrethroid resistance gene.Per-locus gene diversity was compared among preand post-intervention samples.In both Malawi and Mozambique, the FunR locus showed a striking reduction in gene diversity between collection time-points, from 0.45 in 2002 to 0-0.15 in 2009-2011.The fact that this rp1 region is the only area of the genome with a significant change of diversity level between pre and post-intervention samples further supports the recent occurrence of a selective sweep associated with implementation of vector control interventions.
Signature of selective sweep around 120kb genomic region of rp1 in southern Africa.Fine-scale sequence analysis of polymorphism in the 120kb genomic region spanning the rp1 locus revealed a loss of genetic diversity in the post-intervention samples occurring most strongly around the CYP6P9a resistance gene (S3B Fig; S4 Table).In contrast, samples collected from both Cameroon and pre-intervention in southern Africa showed no reduction of diversity across the entire rp1 region (S4A and S4B  ).The predominant post-intervention haplotype comprises 95% (68/ 72) of the sequences, highlighting the risk of a resistance allele becoming rapidly fixed over an 8 year period, from an original frequency of only 7.5%.Additionally, pre-intervention In contrast, pre-intervention samples have 68 combined polymorphic sites, higher diversity, and no signature of selection.The ML tree revealed that pre-intervention samples are highly diverse in contrast to post-intervention samples, which are highly homogenous (Fig 3E).The contrast was also evident from the haplotype network, with a predominance of singleton haplotypes pre-intervention (82.5%) compared to only 7.5% post-intervention and the presence of a predominant haplotype in postintervention samples (67.5%) corresponding to the resistance haplotype previously shown to be driving resistance in southern Africa (S6A Fig) [8].When an Africa-wide comparison is performed for CYP6P9a polymorphisms, both pre-intervention samples are divergent and diverse while post-intervention samples form a cluster with low divergence (S6B Fig) .Further analysis of the polymorphisms of the CYP6P9a coding region detected key amino acid changes shown to provide the highest catalytic efficiency when metabolising pyrethroids (S5B Fig) [20].

4-Whole genome sequencing validates the selective sweep at the rp1 locus as the major genomic difference between pre-and postintervention mosquitoes
Pooled template sequencing was carried out on two pools of mosquito genomic DNA: one from mosquitoes collected in 2014 and one from mosquitoes collected in 2002 in Malawi (S7 Table ).Sequences were aligned to the FUMOZ reference genome (S8 Table) and stringent filtering performed to remove SNPs at the extremes of coverage depth (S9 Table ).Analysis of a total of 979,808 variant sites genotyped in both samples detected 3,078 variant sites (on 368 genomic scaffolds) with significantly different allele frequencies between 2002 and 2014 collections.A significant correlation was observed between the number of significant sites plotted against total scaffold length (p<<0.01 for both Pearson's and Spearman's tests) (Fig 4A).However, a number of outlier scaffolds appear to be enriched for significant sites.The most extreme of these is scaffold KB669169, with more than 400 significant variants.This scaffold contains the rp1 locus and most of the significant sites are clustered around this region (Fig 4B ) with a striking loss of diversity between 2002 and 2014 evident across the locus (Fig 4C and 4D).This valley of reduced variability around rp1 is the typical signature of selective sweeps as previously observed in other pool-seq studies [21,22].In no other scaffold was the signature so striking (S7 Fig), which validated the results of our microsatellite and targeted sequencing analyses.The rp1 region of scaffold KB669169 contains a number of sequencing gaps.To ensure this did not affect our conclusions, sequence data were additionally aligned to the sequenced 120kb BAC clone containing rp1 [7].This confirmed the results observed in the whole genome analysis (Fig 4E ; S8 Fig) .The 2014 post-intervention sample exhibits a pattern similar to FUMOZ (the pyrethroid resistant laboratory colony) at rp1 with reduced diversity spanning the cytochrome P450 cluster including the duplicated CYP6P9a and CYP6P9b genes, whereas the 2002 pre-intervention sample is highly diverse across the locus.

Discussion
In order to help maintain the continued success of current insecticide-based malaria control interventions, this study has established the Africa-wide population structure and the full genomic signature of pyrethroid-based interventions in a major malaria-transmitting mosquito providing key evidence of the evolutionary response of mosquito populations to the massive scale up of insecticide-based interventions in Africa.

1-Patterns of genetic structure across An. funestus populations support the presence of barriers of gene flow
This study revealed that southern African populations of An. funestus are more genetically differentiated to other populations as they always form a unique cluster compared to other African regions based on both Bayesian analyses and Fst estimates.The population from Uganda appears to be intermediate between southern and West/Central Africa.This result is similar to patterns of genetic structure previously reported for this species [17].Patterns of genetic structure observed in this study support the contrast in resistance patterns between populations of An. funestus and suggest the presence of barriers of gene flow between populations of this species.The causes of these barriers remain unknown although it could be associated with the absence of An. funestus around the Equatorial belt, or the presence of the Rift Valley which affects the population genetic structure of An. gambiae [23].Such hypothesis will need to be validated by assessing more populations, e.g. from both sides of the Rift Valley.The patterns of gene flow described here for An.funestus give an indication on the risk and speed of spread of insecticide resistance alleles between these populations.This is further supported by the observation that the 119F resistance allele of the GSTe2 gene conferring DDT resistance probably arose in West or Central Africa, where it is common, but remains absent from southern Africa, despite selection pressure from DDT use [19].

2-Pyrethroid resistance is associated with a signature of positive directional selection
This study identified a major selective sweep on the 2R chromosomal arm, its location coinciding with that of the main pyrethroid-resistance QTL explaining 85% of genetic variance to resistance, and containing key cytochrome P450 genes conferring pyrethroid resistance [7,24].Overall, multiple analyses provide evidence for a selective sweep on the 2R chromosome driven by metabolic cytochrome P450-based pyrethroid resistance; (i) Reduced diversity of microsatellites flanking pyrethroid resistance genes on 2R.This study identified a valley of reduced diversity at the AFUB6 and FunR loci on the 2R chromosome, a typical signature of selective sweep similar to that described in malaria parasites in Southeast Asia in response to pyrimethamine treatment [25].The reduced heterozygosity of these microsatellite markers and their position around the rp1 pyrethroid resistance QTL point to a selective sweep associated with pyrethroid resistance.This is further supported by the lower diversity observed in the more resistant Southern populations of Malawi and Mozambique [4,18,26].
(ii) Reduced polymorphism of the 120kb rp1 genomic region.Selection in the southern populations is confirmed by the continuous reduced genetic diversity for Malawi and Mozambique and also the reduced haplotype diversity in both countries.Selection is strongest around -9kb from CYP6P9a (BAC-25).This is due to its proximity to the two resistance genes, CYP6P9a and CYP6P9b.However, several other tests that can detect positive selection (MK, HKA, dN/dS and the Ka/Ks ratios) did not show any sign of positive selection.This can be explained by the fact that in a situation of near fixation of a selective sweep as seen in this study for the southern African populations, directional selection is rather indicated by reduced genetic variation [27].Nevertheless, when estimating Ka/Ks ratios, a 2.64 and 2.5 fold reduction of the Ka/Ks estimates was observed in post intervention samples from Malawi and Mozambique respectively compared to the pre-intervention.The selection on the rp1 region appears to be more extensive in Malawi than in Mozambique in line with previous reports [8].In Malawi, the region under selection spans from -9kb from CYP6P9a and beyond +86kb, which reasonably could be above a region of 100kb.In Mozambique the region under selection spans from -9kb to +61kb, which is around 70kb.
(iii) Directional selection acting on CYP6P9a gene.The strong directional selection observed on CYP6P9a is only seen in southern Africa and correlates with the extensive pyrethroid resistance observed in this region with a high over-expression of this gene [4,8,18,26].The reduced CYP6P9a haplotype diversity with limited mutational steps between haplotypes in southern Africa is indicative of selection while in contrast the west and central African countries maintained high diversity.This lack of selection in other African regions could suggest that rp1 may not be the main resistance region in other populations of An. funestus such as Uganda and Benin where pyrethroid resistance has been reported but with a less extreme over-expression of CYP6P9a [11,13,19].
The predominant haplotype in Malawi and Mozambique is a CYP6P9a allele recently shown to have greater efficiency in metabolising pyrethroids compared to susceptible alleles [8,20].Vector control interventions such as LLINs and IRS are largely implemented in these countries [4,28].This signature of directional selection on CYP6P9a is similar to the selection observed in the glutathione S-transferase GSTe2 gene, another detoxification gene conferring resistance to DDT in An. funestus populations West/Central Africa [19].Resistant samples from Benin exhibited a nearly fixed GSTe2 haplotype in contrast to DDT susceptible populations where high genetic diversity is observed [19].Evidence of a selective sweep around insecticide resistance genes has been reported in other species, such as CYP6G1 in D. melanogaster [29] for which a single haplotype, containing a partial Accord transposable element in the 5' UTR, confers DDT resistance [30].
Patterns of genetic differentiation based on CYP6P9a aligned with F ST estimates obtained with microsatellites, where Malawi and Mozambique were significantly genetically differentiated from other African populations.The overall pattern of differentiation obtained here with CYP6P9a is also similar to that obtained with GSTe2 gene [19] further supporting the presence of barriers to gene flow between southern Africa and other regions.Furthermore, the predominant CYP6P9a haplotype in southern Africa was completely absent from the other countries in line with the low level of gene flow observed between southern and West/Central Africa.This contrasting distribution of the CYP6P9a resistance haplotype is similar to the distribution of resistance mutations previously reported in An. funestus such as the L119F GSTe2 mutation, which is completely absent in Southern Africa [19].However, one cannot exclude that the CYP6P9a resistant haplotype will spread to other regions with time as recently observed for the A296S RDL mutation in An. funestus.Indeed the 296S resistant allele initially located only in West to East Africa and completely absent from southern Africa in 2010 [16] was recently detected in southern Africa, although at low frequency [15].Similar geographical spread of resistance alleles over time has been observed for other resistance mutations such as knockdown resistance (kdr) mutations in mosquito species such as An.gambiae [31] or Ae.aegypti [32,33].
(iv) Temporal analysis of southern African populations revealed that the scale-up of insecticide-based interventions is driving selection for insecticide resistance.In theory, causes other than insecticide use in public health could be selecting for insecticide resistance (e.g.agricultural pesticides or general pollution).However, while these may play a role, the scale-up of insecticide-based interventions across southern Africa was the only major influence that changed in such a short period of time (7-8 years).Based on the genetic diversity, haplotype networks, and the predicted maximum likelihood phylogenetic tree and the whole genome sequencing approach, the selective sweep observed around the rp1 in general and particularly on CYP6P9a only happened after implementation of control interventions.Furthermore, the selection to near fixation of an allele highly efficient to metabolize pyrethroids [20] strengthens the hypothesis that this selective sweep was driven by these pyrethroid-based interventions.Such selection of a favourable allele is similar to previous examples of positive selection observed for target site mutations (ex.kdr) conferring resistance in many mosquito vectors [34][35][36][37][38] and selection linked to the Ace-1 gene in An. gambiae [39].In contrast to these examples, this is the first evidence of a signature of pyrethroid-based vector control such as LLINs on metabolic resistance mechanisms.The near fixation of a resistance haplotype over an 8 year period highlights the need to implement suitable resistance management early enough to prevent control failure.Selection due to pyrethroid interventions is especially alarming because currently pyrethroids are the only insecticide approved for use in bed nets.

Conclusion
Altogether, this study provides conclusive evidence of the extensive selective sweep acting on cytochrome P450-based metabolic resistance to insecticides in mosquitoes.We conclude that positive selection on the region spanning the rp1 pyrethroid resistance locus in Anopheles funestus has occurred in southern Africa between 2002 and 2009 in response to the increased use of pyrethroid-treated bednets.No major change in agricultural use of pyrethroids has occurred in southern Africa over this period, but LLIN use and, to a lesser extent pyrethroidbased IRS, has been scaled up massively in this period.This highlights the risk of relying on a single insecticide class for vector control and emphasizes the need for novel insecticides and vector control tools to tackle the spread of resistant vector populations.

Microsatellite genotyping
Microsatellite markers were chosen to span the entire genome [17,43] PCR thermocycler conditions were: 5min at 95˚C followed by 35 cycles of denaturing at 94˚C for 30s, annealing at 58˚C for 30s and extension at 72˚C for 30s, finishing with an extension step at 72˚C for 10min.Fragment sizing was carried out using a Beckman Coulter CEQ8000.Fragment sizes were visualized and recorded using the fragment analysis software on the Beckman Coulter CEQ8000.Micro-Checker version 2.2.3 [44] was used to check for null allele and scoring errors.

Population genetic analysis of microsatellites
Microsatellite data analysis was mainly carried out using Genepop version 4.0.10 [45].Tests for deviation from Hardy Weinberg Equilibrium (HWE) were carried out for each locus using Genepop option 1.3.The null hypothesis was HWE and the alternative hypothesis a deficit of heterozygotes, and Bonferroni correction for multiple testing used to adjust the 0.05 and 0.01 critical values.The inbreeding coefficient F IS , linkage disequilibrium (LD), log likelihood ratio statistics (G-test) and tables created using Markov chain algorithm of Raymond & Rousset [45] were all preformed.Gene diversity at each microsatellite locus, estimated by 1-Q (inter) where Q (inter) is the homozygosity among individuals, and among populations.Genetic differentiation (F ST ) were estimated using Genepop [45].Pairwise F st values were used to generate neighbour joining trees using the software MEGA 5.2 [46].

Bayesian analysis of population structure
Bayesian analysis of population structure was implemented using STRUCTURE version 2.3.4 [47].Individually based admixture models were used to estimate the ancestral allele source observed in each individual, where the ancestral source population is unknown.A total of 285 individuals from six African countries, genotyped at 16 loci, were analysed for cluster number K = 1-12 (10 replicate runs for each) using a set of 9 putatively neutral loci and comparing 8 2R loci to the 8 non-2R markers.A burn-in period of 50,000 generations and Markov Chain Monte Carlo (MCMC) simulations of 100,000 iterations were used.The admixture model was used as it allows individuals to have mixed ancestry where a fraction (qk) of the genome of an individual comes from an ancestral cluster (where tkqk = 1) [47].Structure Harvester [48] was used to infer the most likely number of clusters (K) using Evanno's method [49].CLUMPP [50] was used to collate the data from all 10 runs for each given K value, for plotting.

Analysis of polymorphism in the pyrethroid resistance rp1 QTL and the CYP6P9a gene
Five DNA fragments evenly spaced to span the 120kb BAC, originally isolated using a laboratory colony, FUMOZ [7], upstream and downstream of CYP6P9a were sequenced in order to assess the presence of a selective sweep around this key resistance gene across Africa (Cameroon, Malawi and Mozambique).Primers used to amplify the fragments are listed in S1 Table .A subset of 10 mosquitoes used for the microsatellite analysis were randomly selected from each collection site for analysis.For BAC fragments, 770-1000bp were sequenced using: 3 μl of 10X KAPA Taq buffer A (KAPA Biosystems), 0.24 μl of 5 U/μl KAPA Taq, 0.24 μl of 25 μM dNTP mix, 1.5 μl of 25μM MgCl 2 , 1.02 μl each of antisense and sense primers, 2 μl of gDNA (10ng), and 22.48 μl of dH 2 O.The 30 μl solution underwent a denaturing step at 95˚C for 5min, followed by 35 cycles of 94˚C for 30s, 57˚C for 30s and 72˚C for 1min and 30s, followed by a final extension step of 72˚C for 10min.
The CYP6P9a gene was amplified from the same gDNA samples used for the rp1 BAC analysis.CYP6P9a was amplified, covering the 5' UTR, the gene's two exons and one intron with a total sequence of approximately 2kb using previously published primers and parameters [8].PCR products were purified using the QIAquick PCR Purification Kit and directly sequenced using Sanger sequencing.Sequences were first analysed for quality then manually assessed for polymorphisms using BioEdit [51].Sequences were aligned using ClustalW [52].
Heterozygous sites in sequence data were phased using the PHASER algorithm implemented in DnaSP version 5.10 [53].For all sequences, DnaSP was used to calculate the number of segregating sites (S), the number of haplotypes (h), the nucleotide diversity (π) and the haplotype diversity (Hd).Two tests of neutrality, Tajima's D and Fu and Li's D Ã , were also carried out.For sequences encoding proteins, 1524bp of the CYP6P9a gene, the numbers of synonymous and nonsynonymous polymorphisms and nonsynonymous (K A ) and synonymous (K S ) polymorphisms per site were calculated.The K ST statistic in dnaSP 5.1 [54] was used to estimate the levels of pair-wise genetic differentiation between populations.The statistical significance of the K ST Ã estimates was assessed by permutation of subpopulations identities and recalculating K ST Ã 10,000 times as implemented in dnaSP5.1.Hudson, Kreitman and Aguade's (HKA) and McDonald and Kreitman's (MK) tests of neutrality were performed using the An.gambiae orthologue of CYP6P9a, CYP6P3 (AGAP002865-PA) as an out-group.For the MK test on the BAC25 fragment, which spans the CYP6AA2 gene in An. funestus, the An.gambiae CYP6AA2 gene (AGAP002862-PA) was used as an out-group.
Maximum likelihood phylogenetic trees where generated for the BAC25 sequences (Tamura's Model) and for CYP6P9a (Kimura's 2-parameter model) using MEGA 5.2, with 500 bootstrap replicates [46].Haplotype networks were determined, based on a 95% connection limit with gaps treated as a fifth state, using TCS [55].Individual haplotypes were labelled by colour and shape (circle denoting haplotypes unique to only one sequence, squares denote haplotypes containing multiple sequences).

Whole genome sequencing-based scan of selective sweep in pooled mosquitoes pre-(2002) and post-intervention (2014)
A whole genome scan was performed comparatively between pre-and post-intervention mosquitoes in order to detect all selective sweep signatures associated with the scale-up of insecticide-based interventions.Pooled template whole genome sequencing libraries were generated as follows.Genomic DNA was purified from individual female mosquitoes collected from Chikwawa in Malawi in 2014 and 2002 using the DNeasy Blood and Tissue Kit (QIAgen, Hilden, Germany), following the manufacturer's instructions and including an RNase treatment step to remove RNA.The gDNA from each mosquito was quantified using the Quant-iT Pico-Green dsDNA Assay Kit (Thermo-Fisher, Waltham, USA) on a FLUOstar Omega Microplate Reader (BMG Labtech, Aylesbury, UK).Equal quantities of gDNA from 40 individuals for each collection were pooled in equal amounts and the pools used to generate Illumina TruSeq Nano DNA fragment libraries (Illumina, San Diego, USA) with an insert size of 350bp.Libraries were sequenced (2x125bp paired-end sequencing) on different lanes of an Illumina HiSeq2500 (Illumina) at the Center for Genomics Research (University of Liverpool, UK), each multiplexed with three other libraries (not used in this study), to produce approximately 50,000,000 read pairs per library.Raw sequence reads were trimmed of adapter sequence and low quality bases, using cutadapt [56] and sickle [57], and filtered to remove trimmed reads shorter than 10bp.Trimmed reads were aligned to the Anopheles funestus (FUMOZ) reference genome sequence (version Afun1.3)downloaded from VectorBase [58], using bowtie2 [59].Aligned reads were filtered to remove duplicate reads and those not properly paired (mapped in forward and reverse orientation within 500bp of each other) or with mapping quality <10.Retained reads were used to detect single nucleotide polymorphisms (SNP).
SNP calling was carried out using SNVer [60].SNPs were filtered to remove those called in regions of in the top or bottom 25% of read coverage depth, to remove artefacts caused by misaligned paralogous sequences.Variant sites with coverage data within the allowed range for both libraries were analysed to identify those with significantly different allele frequencies in each pool.A chi-squared test for a significant difference in allele frequency was applied to each variant site.P-values were corrected for multiple testing using the method of Benjamini and Hochberg [61] and sites with an adjusted p-value less than 0.05 were considered significant.In addition, a rolling mean non-reference allele frequency was calculated and plotted for sets of 101 adjacent sites incremented by one site per step across each scaffold.
The most striking effect of analysing 2R alone is on Ghana, which becomes highly divergent from all other populations including Benin, the other West African population (S1A Fig; Fst 14.4-21.9%).When the 8 microsatellites on the other chromosomes are analysed, the pattern of differentiation is more aligned with geographical distance between all samples, with Ghana samples less differentiated from Benin (F st = 4.3%) than with the 2R markers (F st = 21.9%)(Fig 1C; S1B Fig).The major divergence between southern Africa and other populations is maintained (S1 Fig).

Fig 1 .
Fig 1.Africa-wide population structure analysis.A) Gene diversity of 11 microsatellites from throughout the genome, showing a loss of diversity at the telomeric end of chromosome 2R.B) Gene diversity of 8 microsatellites on chromosome 2R (including AFUB6, FunR and FunO) shows that the loss of diversity is restricted to AFUB6 and FunR, the microsatellites located near the pyrethroid resistance QTL rp1.C) Neighbor-Joining tree based on pairwise F st among population samples, estimated using 9 neutrally evolving microsatellites from throughout the genome (excluding AFUB6 and FunR on chromosome 2R, which may be evolving under positive selection).D) Barplot of assignment probabilities of individual genotypes to two ancestral clusters (the most likely number estimated from the data) estimated using Bayesian population structure analysis under an admixture model.Each bar is an individual genotype for 9 neutrally evolving microsatellites from throughout the genome (excluding AFUB6 and FunR).In panel A, points from left to right represent the following microsatellites: FunQ (on chromosome X); AFUB6, FunR, FunO (on 2R); AFUB11, FunL, AFUB10 (on 2L); AFND7, AFND19 (on 3R); FunF and AFUB12 (on 3L).In panel B, points from left to right represent AFUB3, AFND40, AFUB6, FunR, AFND6, AFND30, AFND32 and FunO (all on 2R).

doi: 10 .
1371/journal.pgen.1006539.g0012-Fine scale genomic sequence analysis confirms a selective sweep associated with pyrethroid resistance Patterns of polymorphism across the 120kb rp1 genomic region.To assess the signature of a selective sweep associated with pyrethroid resistance, we carried out fine-scale polymorphism analysis across a 120kb genomic region spanning the rp1 QTL in two highly pyrethroid resistant populations (Malawi and Mozambique) and a moderately resistant population (Cameroon).This revealed markedly reduced polymorphism in the vicinity of CYP6P9a (Fig 2A; S2A and S2B Fig), a key pyrethroid resistance gene, in Malawi and Mozambique but not in Cameroon.The greatest loss of diversity was closest to CYP6P9a at BAC25 (-9kb) (Fig 2A; S4 Table Fig).A drastic change is observed closer to the CYP6P9a at -9kb (BAC25) where a strong signature of positive directional selection is observed in both Malawi and Mozambique.This reduced haplotype diversity is shown by the presence of a highly predominant haplotype MAL/MOZ1 only found in Malawi and Mozambique with a frequency of 92.8% (S3F Fig).Maximum likelihood phylogenetic trees and haplotype network of rp1.Construction of Maximum likelihood (ML) phylogenetic trees of the five fragments further highlighted the reduced diversity close to CYP6P9a in the resistant populations of Malawi and Mozambique.The ML tree of the -34kb (BAC0) shows a high diversity of haplotypes for each of the three countries (S2C Fig), while the profile is very different at -9kb from CYP6P9a where all sequences from Mozambique and Malawi belong to a major haplotype while the Cameroon sample retained its high diversity (S2D Fig).Analysis of the TCS haplotype network confirms profiles obtained with the maximum likelihood phylogenetic trees for the 5 loci from -34kb to +86kb from CYP6P9a (S2E and S2F Fig).In general, Malawi and Mozambique formed their own clusters, with reduced genetic diversity shown by the fact that haplotypes from both countries had significantly fewer mutational steps between them (1-3 steps), while the more susceptible samples formed their own clusters with higher mutational steps (4-16 steps) for almost all of the BACs except for +86 (BAC 120).Genetic diversity of the CYP6P9a pyrethroid resistance gene across Africa.To further confirm whether the selective sweep was driven by pyrethroid resistance across Africa, the full 1,965 bp of the CYP6P9a resistance gene was sequenced for a total of 59 mosquitoes from six countries (Benin, Cameroon, Ghana, Malawi, Mozambique and Uganda).Analysis of the genetic diversity of CYP6P9a revealed signatures of strong directional selection in Malawi and Mozambique but not in other regions (S5 Table; S3A Fig).In Malawi, only 5 polymorphic sites are observed (S5 Table; S3A Fig) and an AA indel in the 5' UTR was found mainly in Southern Africa and not in the other countries.Nucleotide diversity (π) was lowest in Malawi (π = 0.0008) and Mozambique (0.0011), compared to Uganda (0.0015), Ghana (0.0020), Benin (0.0022) and Cameroon (0.0029) (S5 Table

Fig 2 .
Fig 2. Africa-wide analysis of genetic diversity across the rp1 pyrethroid resistance locus.A) Genetic diversity (pi) across the 120kb rp1 in Cameroon (orange), Malawi (blue), and Mozambique (green).B) Phylogeny of the 'BAC25' fragment (located 25kb along the BAC sequence and 9kb upstream of the CYP6P9a gene), which shows the most extreme difference in diversity between Cameroon (orange) and southern populations Malawi (blue) and Mozambique (green), which form a single clade.C) Phylogeny of the CYP6P9a gene sampled from throughout Africa, showing clear geographical divergence between southern Africa (Malawi and Mozambique; 100% bootstrap support), East Africa (Uganda; 87% bootstrap support) and West/Central Africa (Ghana, Benin, Cameroon; 78% bootstrap support).D) Haplotype network for Non-synonymous nucleotide variants in CYP6P9a.Light blue = haplotype shared between Malawi and Mozambique (MAL/MOZ); pink = haplotype shared between Benin, Cameroon and Ghana (BN/CAM/GH); teal = Benin (BN); red = Ghana (GH); orange = Cameroon (CAM); green = Mozambique (MOZ); blue = Malawi (MAL); purple = Uganda (UG).The size of the shapes is proportional to the frequency of the haplotype and numbers on each branch show the mutational steps separating haplotypes.doi:10.1371/journal.pgen.1006539.g002 Fig).Additional mosquitoes collected from across Malawi were sequenced and showed the selective sweep is consistent throughout southern Africa post-intervention (Fig 3B).A haplotype network of the closest fragment to CYP6P9a, BAC25 (-9kb), revealed the striking difference caused by control interventions.22 unique haplotypes (with up to 23 mutational steps) were observed in pre-intervention samples, but only 2 in post-intervention samples (Fig 3C), which showed evidence of significant positive directional selection using both Tajima's D (-2.10, -2.97) and Fu and Li's D Ã (-2.94, -3.45) (S4 Table

Fig 3 .
Fig 3.The selective sweep at rp1 in southern Africa coincides with the scale-up in pyrethroid use in malaria control.A) Gene diversity at microsatellites on chromosome 2R in mosquitoes collected before widespread pyrethroid-based intervention in Malawi, 2002 (purple) and Mozambique, 2002 (red), and 'post-intervention' in Malawi, 2010 (green) and Mozambique, 2009 (yellow).B) Fine-scale nucleotide sequence analysis of the 120kb rp1 locus in Cameroon (orange) or pre-intervention samples (Malawi (CKW2002) = brown and Mozambique (MOZ2009) = pink) compared to Post intervention samples from Chikwawa (green), Salima (light blue), Nkhotakota (blue) Malawi, Chokwe (purple) Mozambique, and Kaoma (dark blue) Zambia.C) TCS haplotype network at 'BAC25'.Haplotypes including sequences from more than one location are denoted by an H and include the frequency, haplotypes from one location Malawi (blue) and Mozambique (yellow) are only present in pre-intervention samples.D) Change in average pairwise diversity (Pi) in the CYP6P9a gene pre-vs.post-intervention in Malawi and Mozambique.E) Phylogeny of CYP6P9a sequences including both Malawi (Ckw) and Mozambique (Moz) shows that post-intervention samples (red) are almost completely homogenous while pre-intervention samples (black) are diverse.doi:10.1371/journal.pgen.1006539.g003

Fig 4 .
Fig 4. Genome-wide analysis of selection on the An.funestus genome.A) Sites with a significantly different allele frequency between 2002 and 2014 per genomic scaffold plotted against scaffold length identifies scaffold KB669169 as an outlier in which significantly skewed sites are overrepresented.B) Plot of P-values of the difference in allele frequency for sites on scaffold KB669169.Grey dotted lines indicate the 120kb region represented by a sequenced BAC clone containing the rp1 locus.C) Mean frequency of non-reference alleles (for 101 adjacent variant sites) on scaffold KB669169.D) Allele frequencies as in C, showing a magnified region spanning the rp1 locus.E) Mean frequency of non-reference alleles (for 51 adjacent variant sites) on the BAC clone containing the rp1 locus.Positions of genes in the P450 cluster are indicated and CYP6P9a and CYP6P9b are labelled.doi:10.1371/journal.pgen.1006539.g004 (i) Reduced diversity of microsatellites flanking pyrethroid resistance genes on 2R in Malawi and Mozambique.(ii) Reduced genetic diversity of genomic sequences flanking pyrethroid resistance genes on 2R in Malawi and Mozambique.(iii) Reduced genetic diversity of the CYP6P9a gene in Malawi and Mozambique.(iv) No signature of a selective sweep prior to the widespread use of LLINs and IRS in Malawi and Mozambique.
MOZ-Mozambique and UG-Uganda).(C) Haplotype network of CYP6P9a for individual countries for coding region.The size of the polygon reflects the frequency of the haplotype and colour represents the countries (BN (Benin)-Blue, CAM (Cameroon)-Yellow, GH (Ghana)-Red, MAL (Malawi)-Grey, MOZ (Mozambique)-Green and UG (Uganda)-Purple).Segregating mutation is represented by each node and rectangular boxes represent major haplotype.(D) Neighbour joining tree based on genetic distances from K ST estimates of pairwise population comparison.(TIFF) S4 Fig. Analysis of the rp1 BAC clones showing higher selection near CYP6P9a.Analysis of two BAC sequences from southern Africa BAC0 on the 5'UTR and BAC25 near the gene CYP6P9a.(A-B), Haplotypes from post-intervention samples (red) and pre-intervention from Malawi (blue), and Mozambique (green) including the frequency in brackets and the SNP location on the top x-axis.(C-D), ML-tree of pre-intervention (black) and post-intervention (red) show more divergence at BAC0 while BAC25 shows distinct grouping between pre and post.(E-F), Haplotype network where the size correlated to frequency.The red denotes samples collected post-intervention, blue were collected pre-intervention and green contains sequences from both time points.(TIFF) S5 Fig. Loss of diversity post-intervention in the gene CYP6P9a.(A) All SNPs in pre-intervention samples (blue and green) and post-intervention (red) compared to a susceptible lab strain, Fang (black) including frequency in brackets and the position on the top x-axis.(B) Haplotypes based on amino acid changes pre (blue and green) versus post-intervention (red) shows a major loss of diversity.SNPs highlighted in yellow denote changes implicated in increased catalytic function [20].(TIFF) S6 Fig. Impact of control interventions on genetic diversity of CYP6P9a.(A) The TCS haplotype network of pre-versus post-intervention samples based on the coding region including haplotypes with more than one sequence (red) and singular haplotypes from Malawi (orange) and Mozambique (blue).(B) Africa-wide Neighbor-Joining tree of CYP6P9a shows geographical clustering where southern Africa (blue and orange) is divergent.Within the southern Africa cluster there is a lack of diversity in Mozambique (MOZ 2009) and Malawi (MWI 2010).(TIFF) S7 Fig. Genome-wide analysis of selection on the An.funestus genome showing other scaffolds with sites with a significantly different allele frequency between 2002 and 2014.These show no striking valley of reduced variability in contrast to KB669169 spanning rp1.(A) Plot of P-values of the difference in allele frequency for sites on respective scaffold.(B) Mean frequency of non-reference alleles (for 101 sites) on respective scaffold.The black line is for MWI-2002 and the red line for MWI-2014.(TIFF) S8 Fig. Genomic location of the selective sweep.(A) Gene annotation of the rp1 region view from the Vectobase screenshot.The region from approximately 1.32 to 1.48 Mb corresponds to the sequence BAC containing rp1.Blue bars represent scaffolded contigs and spaces are unsequenced assembly gaps, which are common in this region.Red boxes represent annotated genes.(B) Contrasting polymorphism patterns between pre-and post-intervention samples: Data aligned to BAC (IGV screenshot).Full-length (120kb) BAC sequence.The top track