Genetic Networking of the Bemisia tabaci Cryptic Species Complex Reveals Pattern of Biological Invasions

Background A challenge within the context of cryptic species is the delimitation of individual species within the complex. Statistical parsimony network analytics offers the opportunity to explore limits in situations where there are insufficient species-specific morphological characters to separate taxa. The results also enable us to explore the spread in taxa that have invaded globally. Methodology/Principal Findings Using a 657 bp portion of mitochondrial cytochrome oxidase 1 from 352 unique haplotypes belonging to the Bemisia tabaci cryptic species complex, the analysis revealed 28 networks plus 7 unconnected individual haplotypes. Of the networks, 24 corresponded to the putative species identified using the rule set devised by Dinsdale et al. (2010). Only two species proposed in Dinsdale et al. (2010) departed substantially from the structure suggested by the analysis. The analysis of the two invasive members of the complex, Mediterranean (MED) and Middle East – Asia Minor 1 (MEAM1), showed that in both cases only a small number of haplotypes represent the majority that have spread beyond the home range; one MEAM1 and three MED haplotypes account for >80% of the GenBank records. Israel is a possible source of the globally invasive MEAM1 whereas MED has two possible sources. The first is the eastern Mediterranean which has invaded only the USA, primarily Florida and to a lesser extent California. The second are western Mediterranean haplotypes that have spread to the USA, Asia and South America. The structure for MED supports two home range distributions, a Sub-Saharan range and a Mediterranean range. The MEAM1 network supports the Middle East - Asia Minor region. Conclusion/Significance The network analyses show a high level of congruence with the species identified in a previous phylogenetic analysis. The analysis of the two globally invasive members of the complex support the view that global invasion often involve very small portions of the available genetic diversity.


Introduction
An enduring challenge within the context of cryptic species is the delimitation of individual species within the complex [1][2][3]. In this context, the absence of morphological characters with which to identify individual species has lead to the reliance on molecular tools to identify and delimit species [4][5][6][7]. In exploring the relationships between cryptic taxa, phylogenetic methods that assume bifurcating trees e.g. neighbor joining, maximum likelihood, maximum parsimony and minimum evolution are often used to explore these relationships however, gene evolution cannot always be represented in this manner. Rather, genealogies of closely related taxa may be multifurcated; descendant genes coexist with persistent ancestors producing reticulate relationships. To address this, networking approaches have been developed to estimate genealogies; these include pyramids technique [8], statistical geometry [9], split decomposition [10], the median networks [11,12], median-joining networks approaches [13][14], molecular-variance parsimony [15], netting [16], likelihood network [17], reticulogram [18] and reticulate phylogeny [19].
More recently, Hart and Sunday (2007) [20] and Chen et al. (2010) [21] observed that the application of statistical parsimony network analytics to the question of species delimitation, identified a strong association between breaks in network connectivity and species level separation. Using mitochondrial cytochrome oxidase 1 (mtCOI) and the software TCS v1.21 [22] they calculated the maximum number of mutational steps needed to make a parsimonious connection between two haplotypes with the probability of 95%. These haplotypes were then linked using cladogram estimation analyses to form a network [23]. Where the number of steps needed to form a link exceeded the likelihood of 5%, those haplotypes remained unconnected. This method is particularly appropriate for population level analysis as it does not involve many of the assumptions of phylogenetic reconstruction methods e.g. it does not assume that the ancestral sequence is missing and does not require bifurcating relationships [24]. The use of statistical parsimony emphasizes what is shared among haplotypes that differ minimally rather than the differences among the haplotypes and provides an empirical assessment of deviations from parsimony. Using this approach Hart and Sunday (2007) [20] and Chen et al. (2010) [21] were able to predict species limits within species complexes that lacked sufficient species-specific morphological characters to separate individual taxa.
Since the early 1990s there has been the ongoing debate as to taxonomic identity of the whitefly, Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae). In their recent review, De   [25] explored the history of the debate as to whether B. tabaci was a complex species composed of biotypes or a complex composed of multiple species. They concluded that separation of different members of the B. tabaci complex using morphological or biological traits was not possible and that all reliable separations were being made on the basis of molecular data, primarily mtCOI. In particular, they noted that the Bayesian phylogenetic analyses of mtCOI haplotypes by Boykin et al. (2007) [26] and Dinsdale et al. (2010) [27] provided compelling evidence to support the case for a cryptic species complex. In particular, Dinsdale et al. (2010) [27] through the analysis of lineage divergence based on the Kimura-2 parameter (K2P) model of molecular evolution, that there was a distinct break in the pairwise frequency distribution at 3.5% sequence divergence and argued that this was evidence for species level separation. Goldstein & DeSalle (2011) [28] urge caution though in delimitation based on a single gene and argue that delimitation using single gene data is most compelling when used in concert with other data. The integration of other data was a key part of the review [25] which considered the evidence for species level separation in B. tabaci. Here, they observed that the separation identified in Dinsdale et al (2010) [27] was well supported by all the available mating compatibility data which showed that all crosses between putative species demonstrated either complete mating isolation or partial isolation resulting in reduced fitness in the progeny [29][30][31][32][33][34]. These studies suggested that B. tabaci was composed of at least 24 separate species while Hu et al. (2011) [35], using the approach outlined in Dinsdale et al. (2010) [27], increased the number to 28. The terminology used to describe B. tabaci species is that of Dinsdale et al. (2010) [27]. We refer to them as 'putative species' as this recognises the need for formal descriptions and additional corroborating evidence. The link between biotype terminology and the putative species terminology is detailed in Dinsdale et al (2010) [27] and De   [25].
We have applied statistical parsimony network analysis using TCS to explore whether the approach supports the one used by Dinsdale et al. (2010) [27]. In addition, mtCOI has been widely used to explore the possible origins of invasive species [e.g. [36][37][38] and the analysis of network relationships provides a good opportunity to explore the geographic relationship of haplotypes and from this infer patterns of spread from their respective home ranges. In the context of the B. tabaci species complex, this is of considerable interest in regards to two members of the complex, Middle East-Asia Minor 1 (MEAM1, commonly known as the B biotype) and Mediterranean (MED, commonly known as the Q biotype) which have become global invaders through the trade in ornamentals [39,40].

Mitochondrial cytochrome oxidase 1 haplotype network analysis
Statistical parsimony analysis of the 352 unique haplotypes revealed the presence of 35 distinct networks (Figs 1, 2). Of the 352 haplotypes all, but three, were able to be assigned to one of the 24 putative species. The three remaining haplotypes, EU192051 (China) and, AF340212 and AF340213 (Argentina), diverged by .3.5% and so were considered to be new putative species based on the rule set in Dinsdale et al. (2010) [27]. There was a high degree of agreement between the species level delimitation based on the network analysis and that identified by the 3.5% break in sequence divergence in Dinsdale et al. (2010) [27]. Of the 24 putative species proposed by Dinsdale et al. (2010) [27], 18 corresponded directly to a single network (Table 1, Figs 1, 2). Of the remaining six putative species three, AsiaI ( Fig. 1), AsiaII_6 (Fig. 2) and MED (Fig. 3) each consisted of a network plus one unconnected haplotype and MEAM1 (Fig. 4) split into one network plus two unconnected haplotypes (Table 1). For AsiaI the unconnected haplotype was from Pakistan (A1Pak2, AJ510066), for AsiaII_6 from China (A2Ch11, AJ784261), for MED from Uganda (MUg7, AY903565) and for Middle East-Asia Minor 1 the two unconnected haplotypes were both from Pakistan (Pak8, GU977267; Pak9 GU977268). Of the two remaining putative Dinsdale species, AsiaII_7 ( Fig. 1) split into two separate networks and New World (Fig. 2) split into two networks, plus two unconnected haplotypes (Table 1). In the case of AsiaII_7, one network contained haplotypes from India and China while the other had haplotypes from China only (Fig. 1). For the New World, one network contained haplotypes from Belize, El Salvador, Guatemala, Honduras, Mexico, Puerto Rico and the south western USA while the second had haplotypes from Colombia; haplotypes from Panama (NewP1, DQ130060) and Sudan (NewSD, EU760727) were unconnected (Fig. 2). Table 2 provides the range of Kimura's 2-parameter (K2P) divergences for each unconnected haplotype or network that was unconnected to the network representing each of the six Dinsdale putative species. In each case the minimum K2P divergence ranged from 1.9% to 3.3% while the maximum ranged from 3.0% to 6.6% and averaged 2.5% to 4.7%. The mean genetic distances (K2P) within putative species for each of networks that represented the different Dinsdale putative species ranged from 0.3% to 1.8% (Table 1) while distances between networks were .2.0% with a maximum of 24.4% between network 8 and network 15 (Table 3).

Haplotype diversity and distribution for globally invasive B. tabaci
Of the 660 MEAM1 sequences, 601 (0.911) came from the invaded range and 59 (0.089) from the home range. Furthermore, of the 80 unique haplotypes, 40 came from the home range, 44 from the invaded range, four from both, 36 from the home range only and 40 from the invaded range only. Of the 374 MED sequences, 153 (0.409) came from the invaded range and 221 (0.591) from the home range. Of the 86 unique haplotypes, 76 came from the home range, 12 from the invaded range, four from both, 72 from the home range only and eight from the invaded range only.
For MED, the search of GenBank identified 86 unique haplotypes. Of these 23 (MED-Hap1 to MED-Hap23) were represented by at least two sequences while the remaining 63 were represented by a single sequence (Table S1, Table 4, Fig. 3) and 1, MUg7 (Uganda) was shown not to be connected to the network. The network analysis suggested a complex structure for MED. There are two distinct elements to the network. The first element is composed primarily of countries from Sub-Saharan Africa (Fig. 3). This network consists of a hub (Ivory Coast, MIC2) with one branch consisting of primarily West African countries (Burkina Faso, Cameroon, Ghana, Ivory Coast) as well as Uganda and a branch consisting of both East and West African countries (Cameroon, Nigeria, Uganda, Zimbabwe). It is interesting to note that the primarily West African branch contains haplotypes that have retained the capacity to induce silverleafing which has been lost throughout the rest of the network (Fig. 3). A third branch which splits into a branch containing haplotypes from Burkina Faso and a branch with haplotypes from Croatia, Cyprus, Egypt, France, Israel, Syria, Turkey and USA (Fig 3). This Mediterranean branch acts as the conduit to the second element of the network. The network structure suggests a connection from Sub-Saharan Africa through Turkey/Syria/Israel (MTur) to the eastern Mediterranean (Fig. 3). The primarily eastern Mediterranean portion of the network then connects through Croatia to the node MCh1. This haplotype occurs in the western Mediterranean countries of France, Morocco, Portugal and Spain as well to the eastern Mediterranean island of Crete and countries well outside the region, China, Guatemala, Japan, South Korea, Taiwan and  Table 1. Network analysis based on statistical parsimony [23] showing the genealogical relationships of the COI haplotypes in cryptic species of Bemisia tabaci. The Dinsdale putative species [27] are indicated in parentheses. Networks encompassed by a box are those Dinsdale putative species that the analysis suggests have additional species level separation. The names codes in the ovals refer to the individual haplotypes (see Table S1). The small circles indicate the presence of missing intermediates while the connections are based on the set of plausible solutions with a 95% of parsimony probability. The number of sequences for each haplotype where n.1 is indicated next to the node; nodes without a number are n = 1. doi:10.1371/journal.pone.0025579.g001  Table 1. Network analysis based on statistical parsimony [23] showing the genealogical relationships of the COI haplotypes in cryptic species of Bemisia tabaci. The Dinsdale putative species [27] are indicated in parentheses. Networks encompassed by a box are those Dinsdale putative species that the analysis suggests have additional species level separation. The names codes in the ovals refer to the individual haplotypes (see Table S1). The small circles indicate the presence of missing intermediates while the connections are based on the set of USA (Fig. 3). MCh1, forms a hub that connects an array of closely connected haplotypes. These haplotypes mostly occur either in the Mediterranean (Algeria, Crete, Croatia, Egypt, France, Morocco, Spain, Syria, Tunisia) or occur in countries well outside the region (China, Czech Republic, Japan, Netherlands, South Korea, Taiwan, Uruguay, USA) ( Table 4, Fig. 3). The exceptions to these are haplotypes that come from Cameroon, Sudan and a cluster of connected haplotypes stemming from MSu1 which include the countries of (Burkina Faso, Cameroon, Ghana, Greece, Sudan, Tunisia, Uganda) ( Table 4, Fig. 3). The network plausible solutions with a 95% of parsimony probability. The number of sequences for each haplotype where n.1 is indicated next to the node; nodes without a number are n = 1. Networks  structure supports the idea that MED has a home range that extends from Sub-Saharan Africa through into the Mediterranean region.
The global pattern of spread beyond this assumed home range is more complex than for MEAM1. Of the 85 haplotypes considered, 14 have spread outside this region. In the case of the USA, three haplotypes have been recorded. MSy (MED-Hap 3, Table 4, Fig. 3) which also occurs in Cyprus, Israel, Syria and Turkey suggests an introduction from the eastern Mediterranean into the USA. However, the presence in the USA of MCh1 (Table 4, Fig. 3) which is associated to the assumed home range through France, Greece: Crete, Morocco, Portugal, Spain, suggests a second invasion from the western Mediterranean has also occurred. This haplotype also now occurs in China, Guatemala, Japan, South Korea and Taiwan. MCh2 (MED-Hap 2, Table 4, Fig. 3) which is also associated with the western Mediterranean (France, Morocco, Spain) and now also occurs in the USA further supports the idea that the USA has experienced two waves of invasion. The haplotype also occurs in China, Japan, South Korea, Taiwan and Uruguay. In the case of mainland China, the same three haplotypes that have invaded the USA also occur there along with one haplotype (MFra4) which also occurs in France as well as Uruguay suggesting both eastern and western Mediterranean origin invasions into China (  Fig. 3). The remaining haplotypes that occur outside the assumed home range are MCBH4 (MED-Hap 4, Table 4) Czech Republic, France, Morocco and the Netherlands; MMor1 (Table 4, Fig 3) France, Morocco and Japan both of which again suggest western Mediterranean origins and two haplotypes found only in Japan and Taiwan. What is of particular interest in regards to the patterns of global invasion is that with the exception of the eastern Mediterranean, MTur, all haplotypes that have been found outside the Mediterranean/Sub-Saharan Africa regions were from the western Mediterranean and belonged either to the haplotype MCh1 or were very closely related to this haplotype (Table 4, Fig. 3). This mirrors the pattern observed for MEAM1 in that a single haplotype has been hub from which all invading haplotypes stem (Figs 3, 4).
In the case of MEAM1, the search of GenBank identified 80 unique haplotypes, two of which were subsequently shown not to  [23] showing the genealogical relationships of the COI haplotypes representing the Mediterranean putative species. The two networks are numbered 30 and 31. Network 30 represents 85 of the 86 haplotypes assigned to MED using the approach outlined in [27]. The names codes in the ovals refer to the individual haplotypes (see Table S1). The small circles indicate the presence of missing intermediates while the connections are based on the set of plausible solutions with a 95% of parsimony probability. Each of the haplotypes (23 in total, see Table 4) for which there were at least two identical representatives in GenBank are indicated by an arrow linked to a box which details the countries where the haplotype was found; countries that are in bold and underlined are those which indicate an invasion. The ovals coloured in red indicate those haplotypes that together represent .80% of the GenBank records in the invaded ranges. The number of sequences for each haplotype where n.1 is indicated next to the node; nodes without a number are n = 1. doi:10.1371/journal.pone.0025579.g003 form part of the MEAM1 network. Of the remaining 80 haplotypes, 11 (MEAM1-Hap 1 to MEAM1-Hap 11) were represented by at least two sequences while the remaining 69 were represented by a single sequence (Table S1, Table 5, Fig. 4). The network analysis shows one branch composed of haplotypes from either the Middle East (Kuwait, Saudi Arabia, Syria, United Arab Emirates, Yemen) or Asia Minor (Iran, Iraq, Pakistan); none of the haplotypes occur outside the assumed home range of MEAM1 (Table 5, Fig. 4). This branch connects to the haplotype referred to as Jap1 (Table 5, Fig. 4). This haplotypes has spread widely beyond the assumed home range of MEAM1 to 37 countries across Africa, Europe, Asia, Oceania and the New World and accounts 79.8% of the mtCOI sequences in GenBank. Jap1 forms a central hub from which all the remaining 51 haplotypes radiate (Fig. 4). Of these, 11/51 come from either Israel (8), Jordan (2) or Pakistan (1) while a further 44/51 haplotypes come from countries beyond the assumed home range ( Table 5, Fig. 4). Of these, three, Arg (MEAM1-Hap5, Argentina, Australia, Cook Islands, French Polynesia, Mexico, Israel and USA, Table 5, Fig. 4), SK (MEAM1-Hap 8, Pakistan and South Korea, Table 5, Fig. 4) and Ch1 (MEAM1-Hap 4, Australia, China, French Polynesia, Israel, Taiwan and USA) have direct links to the assumed home range through Israel and together account for a further 5.6% of the sequences in GenBank. This indicates that from a global invasion perspective, four haplotypes account for approximately 85% of all invasion records. What is particularly interesting is that Israel does not feature in any of the countries in the branch composed of haplotypes only from the  [27]. The name codes in the ovals refer to the individual haplotypes (see Table S1). The small circles indicate the presence of missing intermediates while the connections are based on the set of plausible solutions with a 95% of parsimony probability. Each of the haplotypes (11 in total, see Table 5) for which there were at least two identical representatives in GenBank are indicated by an arrow linked to a box which details the countries where the haplotype was found; countries that are in bold and underlined are those which indicate an invasion. The oval coloured in red indicates the haplotype that represents .80% of the GenBank records in the invaded ranges. The number of sequences for each haplotype where n.1 is indicated next to the node; nodes without a number are n = 1. doi:10.1371/journal.pone.0025579.g004 home range; Israel first appears in Jap1. Another feature of the network is that of the 78 sequences, 27 are directly connected to Jap1 with a further 12 separated by a single node.
The results Wst analysis and AMOVA support there being a complete genetic break among the 35 independent networks ( Table 6) and so provides additional support for existence of distinct cryptic species. The AMOVA showed a very high percentage of variation among networks, 92.01% whereas the within network variation is quite low, 7.99%.

Discussion
The statistical parsimony analysis showed that of the 352 unique haplotypes, all but 15 belonged to the network that corresponded to one of the 24 putative species identified in Dinsdale et al. (2010) [27]. Of the 15 haplotypes, 3 belonged to new putative species identified using the rule set in Dinsdale et al. (2010) [27]; this was confirmed using statistical parsimony. The remaining 12 were considerably more divergent than the mean K2P divergences and occupied the upper end of the divergence range for the putative species from which they were split. Furthermore, of the six putative species, only two, AsiaII_7 and New World were shown to vary markedly from Dinsdale et al. (2010) [27]. This suggests that in most cases the K2P based approach devised by Dinsdale et al. (2010) and the TCS approach are in close agreement producing very similar assignment estimates which we know are well supported by available courtship and mating data [25,28,32,33]. The approach also identified the need for more work, especially in the case of the New World to further resolve relationships. Our results further support the view [21] that when attempting to estimate species diversity where morphological data is unhelpful, network analysis is a good supporting tool that either confirms molecular phylogenetic analysis or shows where more work is needed to resolve the relationships That said, while network analytics provided good agreement with a phylogenetic approach, the issue of where to draw the line between species is complex. Multiple methods [46][47][48][49][50][51][52][53][54][55] have been applied to the questions of species delimitation, but there is no consensus as to which approach provides the best estimate or whether a best estimate is possible [56,57]. Increasingly, there is a view that multiple genes are required to identify species level boundaries [58][59][60][61]. However, in the context of invasive species and biosecurity in general, species complexes such as that exemplified by B. tabaci which move around the world as a consequence of trade [39,40] require decisions by regulators to be made quickly without the recourse to generating new data. In this study, the dataset that covers the widest available range of B. tabaci diversity is mtCOI and both a phylogenetic and network based analysis of this data provided similar outcomes that agree with available mating compatibility data. Future research could further consider the analysis of this dataset with a range of species distinctiveness measures to further explore the challenge of species delimitation and the applicability of the approaches used to date. The addition of genomic data will no doubt add further to our understanding of species level separation in this complex.
The results are based on a partial single maternally inherited gene (mtCOI) and what data there is supports a high level of congruence between phylogenetic relationships derived from mitochondrial and genomic DNA [25,26]. Sole reliance on mtCOI as a means of considering the phylogeography of biological invasions is not unusual [62][63][64] although the addition of genomic DNA would add considerably to our understanding of the patterns of invasion. Furthermore, the B. tabaci complex shows very marked phylogeographic structure [26,27] so in both cases the invasive MEAM1 and MED are genetically distinct and discrete when compared with the indigenous haplotypes in the geographies that have been invaded. The patterns of diversity and distribution of both globally invasive members of the B. tabaci cryptic species complex follow similar patterns. In both cases only a small number of haplotypes represent the majority of haplotypes that have spread beyond the home ranges; one haplotype in the case of MEAM1 and three in the case of MED account for .80% of the GenBank records in the invaded ranges. Furthermore, in both cases, the one haplotype for MEAM1 and two of the three for MED form hubs to which most invader haplotypes connect directly. The analysis of the haplotype data strongly suggests a role for Israel as the source of MEAM1 that has invaded most parts of the world whereas the pattern of invasion for MED suggests two sources. The first is the eastern Mediterranean where a haplotype found only in this part of MED's home range has been detected  only the USA where it occurs primarily in Florida and to a lesser extent California. The second are haplotypes that occur in the western Mediterranean which have spread not only to the USA, but also to Asia and South America. It is also of some interest that in the case of MED 12 out of 86 haplotypes have spread beyond the assumed home range whereas for MEAM1 44 of the 80 haplotypes identified occurred outside the home range. Given that MEAM1 appears to have first begun to invade globally in the mid 1990s, around 10 years before MED began its global invasion, one might interpret this result in the context of the increasing body of knowledge that is accumulating in regards to global invasions which suggests that over time, invaded ranges accumulate an increasing proportion of the home range genetic diversity [62][63][64][65][66][67][68]. However, the above interpretations need to be considered with a degree of caution as the relative proportions of haplotypes that have spread beyond the home range may be an artifact of the varying levels of sampling effort across the different studies from which the sequences were sourced. In the case of MEAM1, the level of sampling across the home range would appear to be quite low as only 8.9% of sequences came from within the home range. This suggests that the invaded range has been subject to far more intensive sampling than the home range implying that the findings overestimate the proportion of diversity that has moved beyond the home ranges. In addition, the paucity of sampling within the home range of MEAM1 suggests the proposed origin of the global invasion needs to be considered with some caution. The situation for MED is somewhat different; the level of sampling across the home range is much higher as 59.1% of sequences came from within the home range. Furthermore, given that the numbers of sequences from the invaded and home ranges are similar, one might conclude that the levels of sampling effort across the two domains is equivalent and as only eight of the 86 MED haplotypes were detected outside the home range, the proportion of diversity that has so far invaded is quite low. The structure of the MED network supports the existence of two distributions within the home range. The first is a Sub-Saharan range and the second a Mediterranean range. The significance of this is unclear, but one possible explanation is that MED evolved in Sub-Saharan Africa and then spread to the eastern Mediterranean and from there to the western Mediterranean. There is also the suggestion that MED has moved back to Sub-Saharan Africa from the Mediterranean region, possibly via Sudan and that this shift is coincident with the global spread of invasive MED that has been documented over the past 10 years [40] so it is possible that this latter return to Sub-Saharan Africa represents a very recent invasion.
The MEAM1 network is a little simpler to interpret and supports the view that it originated in the Middle East -Asia Minor region. There is also a case to suggest the involvement of Israel in the global spread of MEAM1, but this needs to be tempered in the light of the low level of sampling across the home range relative to the invaded range. However, all the haplotypes that have spread globally and have a connection back to the home range, do so via Israel. Of the countries in the home range Israel has by far the most developed ornamental plant export industry which accounts for a third of Israel's agricultural exports and ornamental plants are the primary means by which MEAM1 has been spread globally [35].
Networks are bidirectional in nature and so cannot directly show a hierarchical relationship. However, in some cases the pattern of the network can be used to propose a directional relationship. In our case the structure of the MEAM1 network  Table 2 to relate network numbers to Dinsdale putative species [27]; network numbering follows the order of Table 2. doi:10.1371/journal.pone.0025579.t003 Table 3. Cont.
supports the argument for a Middle East -Asia Minor origin [25]. Further, the two quite divergent haplotypes, Pak8 and Pak9 which at the 95% parsimony cutoff were shown not to be connected to the network, do connect at 93% and in both cases to the distal end of the home range branch at Sy (Syria). The structure of the branch supports the idea that MEAM1 may have originated in the region encompassed by Iran, Iraq, Pakistan and Syria and then spread into the Arabian Peninsula and from there to the rest of the world through Israel. However, the paucity of sampling across the home range requires that this interpretation be viewed with caution and requires more detailed and extensive sampling to verify. Another feature of the MEAM1 network is the radiation of nodes about the Jap1 node. Here, when the branch containing haplotypes that originated only in the presumed home range are removed, of the 52 remaining nodes, 39 nodes were within one node of Jap1. Of these, 32 are known to occur only in locations outside the home range. There are several possible explanations. Firstly, this proliferation may be due to sequencing read errors. We cannot check for this as the trace files are largely unavailable. Secondly, they may simply reflect a difference in sampling effort with greater effort being apportioned to areas that have been invaded by MEAM1. This is quite plausible as the bulk of the literature pertaining to this pest come from countries where the pest has invaded.
The data and analyses presented here provide the first detailed consideration of the network relationship between different members of the B. tabaci cryptic species complex. In so doing it provides an insight into the patterns of global spread of two highly invasive members of the complex. Studies comparing mtCOI variation in home and invaded ranges have found that depending on the example, genetic diversity in the invaded range may be less, equivalent or exceed that observed in the home range [62][63][64]67]. Where diversity is lower in the invaded range bottlenecks and drift are considered the underlying drivers [69,70]. There is a general view that in the case of biological invasions, genetic diversity is reduced in the invaded ranges, but recent reviews [66,69,71] challenge this perception. In cases where levels of diversity are the same or equivalent, it is suggested that multiple introductions over time is a key contributor to invasion success and that the accumulation of diversity can act to facilitate the process of invasion [66,69]. On the basis of mtCOI haplotype diversity, the results, bearing in mind the limitations associated with using data based on unknown sampling effort, suggest that diversity in the invaded range is lower in the case of MED than in the home range and most likely lower for MEAM1 given the low level of sampling There was a total of 86 unique haplotypes. The number of identical haplotype accessions is indicated for each of the unique haplotypes as well as the number of haplotypes for which there was only one accession (singleton). The source country for each accession is indicated along with whether or not it occurred in the presumed home range. doi:10.1371/journal.pone.0025579.t004 in the home range. While, the addition of genomic DNA would add greater strength to the current observation, the observation that most of the diversity in the invaded ranges is represented by one haplotype in the case of MEAM1 and three for MED raises the question as to why this is the case. One possible explanation is that the invasion pathway has only had access to a very limited portion of the available genetic diversity for each invader. Alternatively, the invasive genotypes represented by the predominant mtCOI haplotypes may be associated with a set of traits that facilitate their ability to access the invasion pathway. Whatever the reason, if the accumulation of genetic diversity is a key factor in increasing invasion success [66,67], then measures that restrict the recruitment of additional genetic diversity should be maintained even after establishment and spread has occurred, so as to avoid increasing the genetic diversity of two already damaging invasive pests.

Mitochondrial cytochrome oxidase one sequences from GenBank
GenBank was searched for all B. tabaci mtCOI sequences. This yielded a dataset of 2325 sequences. Sequences were then aligned using ClustalX 1.83 [41]. Sequences were aligned against AB204577 and at the 59 end began with GAAAATTAGAG-GTATTT and terminated at the 39 end with TCCTTTC-TTCTTCTGCGGT; a 657 bp fragment. Sequences with gaps, stop codons or shorter than 610 bp were removed yielding a final dataset of 1436 sequences of which 352 were unique. The consensus sequences from Dinsdale et al. (2010) [27] were used to assign sequences to the different putative species. Within each putative species the first 38 and the last nine bases are highly conserved. A representative of each unique haplotype together with the number of each sequence representing each haplotype recovered from GenBank is detailed in Table S1.

Mitochondrial cytochrome oxidase 1 haplotype network analysis
We used the 198 unique B. tabaci mtCOI haplotypes from Dinsdale et al. (2010) [27] and added a further 154 unique haplotypes accessed from GenBank (Table S1); a total of 352 unique haplotypes. The sequences were analysed using statistical parsimony [23] with the program TCS v.1.21 [22]. The network connection limit was set at 95% following Hart and Sunday (2007) [20] and Chen et al. (2010) [21]. The resulting networks identify both the relationship between the different haplotypes as well as the significant number of substitutions [23] connecting haplotypes. Readers wishing to see the relationship between the species names used here and biotype terminology are directed to the review De   [25].
Haplotype diversity and distribution for globally invasive B. tabaci There are two globally invasive members of the B. tabaci complex MEAM1 and MED. In GenBank there were 660 MEAM1 sequences and 374 MED sequences and for each sequence the collection country location was then recorded. Duplicate haplotypes were identified using TCS [22]. A key point to bear in mind is that the sequences from GenBank are the results of numerous studies where the sampling effort used will be variable and the results should be viewed accordingly. The nucleotide  composition and K2P genetic distance [42] between networks were calculated in MEGA 4.0 [43].
To identify limits to gene flow among these putative cryptic species, pairwise Wst values were calculated [44] and an analysis of molecular variance (AMOVA) undertaken using Arlequin v.3.1 [45].

Supporting Information
Table S1 The 352 unique mitochondrial cytochrome oxidase 1 haplotypes used in the network analysis. The GenBank accession number for each haplotype is provided along with the country of origin. The haplotype identifier refers to haplotypes belonging to MEAM1 and MED where there were at least two identical sequences in GenBank; the haplotype identifiers are used in Figs 1, 2, 3. The network code indicates the abbreviation used in the networks presented in Figs 1, 2, 3. The number of sequences representing each haplotype is also provided. The haplotype identifier indicates a haplotype from either the Mediterranean or Middle East -Asia Minor 1 putative species which was represented by at least two accessions in GenBank. The network code is the identifier used for each accession in the network analysis. (DOCX)