A Multilocus Phylogeny of the World Sycoecinae Fig Wasps (Chalcidoidea: Pteromalidae)

The Sycoecinae is one of five chalcid subfamilies of fig wasps that are mostly dependent on Ficus inflorescences for reproduction. Here, we analysed two mitochondrial (COI, Cytb) and four nuclear genes (ITS2, EF-1α, RpL27a, mago nashi) from a worldwide sample of 56 sycoecine species. Various alignment and partitioning strategies were used to test the stability of major clades. All topologies estimated using maximum likelihood and Bayesian methods were similar and well resolved but did not support the existing classification. A high degree of morphological convergence was highlighted and several species appeared best described as species complexes. We therefore proposed a new classification for the subfamily. Our analyses revealed several cases of probable speciation on the same host trees (up to 8 closely related species on one single tree of F. sumatrana), which raises the question of how resource partitioning occurs to avoid competitive exclusion. Comparisons of our results with fig phylogenies showed that, despite sycoecines being internally ovipositing wasps host-switches are common incidents in their evolutionary history. Finally, by studying the evolutionary properties of the markers we used and profiling their phylogenetic informativeness, we predicted their utility for resolving phylogenetic relationships of Chalcidoidea at various taxonomic levels.


Introduction
Fig trees (Ficus, Moraceae) and their pollinating fig wasps (Agaonidae, Chalcidoidea, Hymenoptera) present a specialized case of an obligate pollination mutualism [1].Each Ficus species is reliant on agaonid fig wasps for pollination and, in return, the pollinating fig wasp depends on its host Ficus for reproduction and larval development [2,3].The fig -pollinator association is exploited by several other lineages of Chalcidoidea, a few Braconidae (Ichneumonoidea) and up to 30 fig wasp species can be associated with a single Ficus host [4,5].Among these wasp lineages are the Sycoecinae (Chalcidoidea), a group that does not belong to Agaonidae as previously stated [6], but is clearly a member of the Pteromalidae sensu stricto [7][8][9][10].
In contrast to most other wasp lineages associated with figs, all the Sycoecinae are internal ovipositors.Sycoecines, which do not show any pollination behaviour, cannot pollinate actively pollinated Ficus species [11].In passively pollinated Ficus species, large quantities of pollen are released by anther dehiscence so that all the emerging wasps become covered with pollen [12].Therefore, sycoecines (Diaziella) associated with passively pollinated fig trees (Conosycea stranglers) are efficient pollinators [13].
Convergent evolution is believed to account for the morphological similarity between the sycoecines and pollinating fig wasps both being exposed to identical selection pressures due to the constraints of internal oviposition [14].These morphological adaptations, such as smooth, elongated and dorso-ventrally flattened head and thorax, and the presence of tibial and mandibular modifications, enable both pollinators and sycoecines to crawl through tightly appressed bracts surrounding the ostiole to enter the fig cavity.
Presently, the Sycoecinae comprises six genera and 72 described species.The current taxonomy is based on morphological delimitation [15][16][17][18][19] but no molecular phylogenetic analysis has been attempted so far.Therefore our understanding of sycoecine evolutionary relationships is wanting.Diagnostic characters for the six sycoecine genera are provided in Table 1, for identification keys and detailed taxonomy, the reader is directed to the website www.figweb.org[20].Four sycoecine genera, Crossogaster Mayr (16 described species), Philocaenus Grandi (24), Sycoecus Waterston (10) and Seres Waterston (4), are restricted to the Afrotropical region and are associated with Ficus section Galoglychia (Table 1, detailed distribution available at www.figweb.org,[20].The genus Diaziella Grandi (14 described species) occurs in the Oriental region and is associated with Ficus section Urostigma, subsection Conosycea [21][22][23].Finally, the New Guinean genus Robertsia Bouček (4 described species) is associated with Ficus section Stilpnophyllum, subsection Malvanthera (Table 1) [6,24].Extrapolating from our sampling of several hundred species of Ficus the total diversity of the Sycoecinae could reach 190 species worldwide.The subfamily is better known from the Afrotropical region (where an estimated 63% of the species are described) than the Oriental and Indo-Australasian regions where only an estimated 10% of the species are described [20].Consequently, and despite that the Sycoecinae is one of the better-known fig wasp subfamilies, overall more than 65 % of the species await description.
How fig wasp communities have been structured over space and time is a fascinating question that remains to be answered [1].However, before addressing issues such as "how can we explain species coexistence in these closed communities?","do the partners have the same biogeographical history?" and "do we observe cophylogenetic patterns between partners?",we first need robust phylogenies validated by expert taxonomists for all the involved fig-wasp lineages.
In this paper we employ multiple genetic loci, extensive taxon sampling of both described and undescribed species and several different analytical approaches to 1) evaluate the monophyly of the sycoecine genera, 2) provide a first comprehensive phylogenetic hypothesis for the subfamily, 3) propose a new classification and 4) discuss the phylogenetic relationships in light of the host fig tree relationships.
On a more general note, we were interested in testing a combination of frequently used and recently developed markers to resolve the phylogenetic relationships of chalcidoid Both sexes with one labial palp segment and two maxillary palp segments; the female eighth urotergite spiracular peritremata distinctly expanded; the male inner apical mandibular tooth is subequal (but still longer) to much longer than the outer tooth [17] Diaziella Grandi, 1928 D. bicolor Grandi, 1928 14 Oriental Urostigma Urostigma

Conosycea
Four-segmented fore tarsi; a laminar projection present on the proximal fore tarsal segment; a pronounced hypopygium that may extend well beyond the end of the metasoma; an ovipositor usually long (about the length of the metasoma) sometimes shorter [22].

Caulocarpae
Females with two anelli and four funicle segments; gastral tergites with a crenulated posterior edge; ventral tentorial pits in close apposition; eighth urotergite spiracle not expanded.Males have the outer mandibular tooth longer than the inner, without any ventral teeth present [18,19].

Caulocarpae
Propodeal spiracles medially positioned with a plica extending from the internal edge of the spiracle to the posterior margin of the propodeum, which may sometimes be indistinct [15].

Cyathistipulae
Fore tibial spur modified into a plate of many fine teeth; first funicle segment with axial expansion; propleura excavated; pronotum with distinct lateral depressions [16].
Classification of the genus Ficus follows Berg and Corner [93].
doi: 10.1371/journal.pone.0079291.t001lineages at various taxonomic levels.Chalcidoidea have tremendous importance in both natural and managed ecosystems.Several species are for example used as biological control agents of agricultural and ornamental pests [25].However, our understanding of how many of these lineages are monophyletic and what are their phylogenetic relationships is clearly wanting [9,10].Although genes should be selected to match the time period of the phylogenetic problem at hand [26,27], the most commonly used markers remain 28S and 18S ribosomal genes, Cytochrome oxidase I (COI) and Cytochrome b (Cytb), whatever the taxonomic level studied (e.g.[9,[28][29][30][31][32][33]).While ribosomal genes often evolve too slowly to resolve relationships in rapidly diversifying lineages, mitochondrial genes become saturated over larger time scales and are poor estimators of a phylogeny at high taxonomic level [26,34].By studying the evolutionary properties of the markers we sequenced, profiling their phylogenetic informativeness (sensu Townsend [27]) under different alignment procedures, and comparing inferred topologies, we predict their utility for phylogenetic inference at shallower and deeper taxonomic levels.

Taxonomic sampling
Species identification was based on morphological characteristics according to the current literature and was conducted by SvN and JYR.Morphological observations were performed on 10 to 20 representatives of each species.A total of 81 sycoecine specimens representing all the known genera and 56 species, of which 50% are undescribed, were included in the molecular study (Table 2).Seventy percent of the species were represented by sequences from one specimen only.The type-species of the genera Seres (S. armipes) and Philocaenus (P.barbarus) were also included in our analyses.All necessary permits were obtained for the collection of specimens in nature reserves and national parks ( ).A high definition image library of vouchers was also constructed, using the EntoVision Premium Portable Imaging System, to allow future identification of specific taxa and traceability of our results (see Figure 1 for examples).

Marker choice, DNA extraction, PCR amplification and sequencing
We selected markers that retain phylogenetic signal within deeper nodes as well as within terminal clades.We combined two mitochondrial protein-coding genes (Cytochrome oxydase I, COI and Cytochrome b, Cytb) and four nuclear genes (the F2 copy of elongation factor-1a, EF-1α; the internal transcribed spacer 2, ITS2; the ribosomal protein RpL27a and the regulatory protein mago nashi).
mtDNA loci and ITS2 have proven useful in resolving insect molecular phylogenies at shallower taxonomic levels.However, they are rapidly evolving, which make them poor markers for deep divergences [26,34].EF-1α has been successfully used to resolve within-family relationships (e.g.[36][37][38][39]).While the exonic portions of RpL27a have proven informative for resolving deep-level relationships within the Hymenoptera [40], its intronic portions have been successfully used to reconstruct the phylogeography of parasitoids of oak galls (Pteromalidae, [41,42].To our knowledge, mago nashi which encodes for a transcription factor that plays essential roles in Drosophila axis formation [43] has never been included in studies of insect molecular phylogeny. Genomic fig wasp DNA was isolated using standard phenolchloroform extraction and Qiagen DNeasy or ZyGEM extraction kits following standard protocols.Primer sequences and amplification protocols followed Cruaud et al. [44] for Cytb and COI, Cruaud et al. [45] for EF-1α , Lopez-Vaamonde et al. [46] for ITS2 and Lohse et al. [41] for RpL27a and mago nashi.PCR products were purified using ExonucleaseI and Phosphatase, and sequenced directly using the BigDyeTerminator V3.1 kit (Applied Biosystems) and an ABI3730XL sequencer at Genoscope, Evry, France.Both strands for each overlapping fragment were assembled using the sequence editing software Geneious v5.5 [47].All the sequences were deposited in GenBank (Table S1).

Phylogenetic analyses
Alignments of COI, Cytb, EF-1α and mago nashi were straightforward due to a lack of length variation.ITS2 and RPL27a were aligned using various procedures, to assess the impact of alignment methods on phylogenetic inferences.ITS2 and RPL27a alignments were reconstructed with ClustalW [48] and MAFFT 6.864 [49].Default parameters were used in ClustalW and the L-INS-i option was chosen in MAFFT [49].Alignments were then cleaned from highly divergent blocks using the online version of Gblocks 0.91b [50] using the default settings and a less stringent selection of blocks.For the latter strategy (hereafter named « Gblocks relaxed »), the "minimum number of sequences for a flanking position" and the "minimum number of sequences for a conserved position" were set to half the number of sequences, the "minimum length of a block" was set to 5 and selection of positions with gaps present in less than half of the sequences was allowed.
Alignments of the protein coding genes were translated to amino acids using Mega 4.0.2[51] to detect frameshift mutations and premature stop codons, which may indicate the presence of pseudogenes.Alignments of individual gene regions obtained with each strategy were first analysed separately to produce single-gene trees and then concatenated, resulting in six combined datasets (ClustalW; ClustalW + Gblocks default parameters; ClustalW + Gblocks relaxed parameters; MAFFT; MAFFT + Gblocks default parameters; MAFFT + Gblocks relaxed parameters).All combined datasets (including charsets) have been deposited on TreeBase (http://purl.org/phylo/treebase/phylows/study/TB2:S14838).
Phylogenetic trees of the six combined datasets were estimated using maximum likelihood (ML) and Bayesian methods.All analyses were conducted on a 150 cores Linux Cluster at CBGP as well as on the CIPRES Science Gateway [52].Two alternative partitioning strategies were compared using Bayes factors (BF) [53]: Scheme 1: mitochondrial genes, EF-1α, ITS2, RpL27a and mago nashi versus Scheme 2: first + second codon positions of mitochondrial genes, third codon positions of mitochondrial genes, EF-1α, ITS2, RpL27a and mago nashi.Following Kass and Raftery [53], Pagel and Meade [54], and Schulte and de Queiroz [55], Bayes factors were calculated using the following formula: BF = 2*(ln L1 -ln L0) + (P1-P0) * ln (0.01) where ln Li and Pi are respectively, the harmonic mean of the ln likelihoods and the number of free parameters of model i.BF values from 2 to 6 were considered positive evidence, from 6 to 10 as strong evidence, and > 10 as very strong evidence favouring the alternative hypothesis over the null hypothesis.Best fitting model for each partition was identified using the Akaike information criterion [56] as implemented in MrAIC.pl1.4.3 [57].We performed ML analyses and associated bootstrapping using the MPIparallelized RAxML 7.2.8-ALPHA[58].GTRCAT approximation of models was used for ML boostrapping [58] (1000 replicates).Bootstrap percentage (BP) > 95% was considered as strong support and a BP < 70% as weak.
Bayesian analyses were conducted using a parallel version of MrBayes v. 3.2.1 [59].We assumed across-partition heterogeneity in model parameters by unlinking parameters across partitions.Parameter values for the model were initiated with default uniform priors and branch lengths were estimated using default exponential priors.To improve mixing of the cold chain and avoid it converging on local optima, we used Metropolis-coupled Markov chain Monte Carlo (MCMC), with each run including a cold chain and three incrementally heated chains.The heating parameter was set to 0.02 in order to allow swap frequencies from 20% to 70% [60].We ran two independent runs of 20 million generations for the MAFFT-6 partitions and the MAFFT-Gblocks relaxed-6 partitions datasets and sampled values every 2000 generations.We ran two independent runs of 10 million generations for all other datasets and sampled values every 1000 generations.For the initial determination of burn-in, we examined the plot of overall model likelihood against generation number to find the point where the likelihood started to fluctuate around a constant value.The points sampled prior to convergence of the chains were then discarded.Convergence was also evaluated using the effective sample size (ESS) values of each parameters as reported in Tracer v1.5 [61].The results were based on the pooled samples from the stationary phases of the two independent runs.Posterior probabilities (PP) > 0.95 were considered as strong support.

Test of alternative topologies
To compare topologies and assess whether certain alternative relationships among recovered clades could be statistically rejected, we performed AU [62] and SH [63] tests implemented in the program package CONSEL [64].The program makermt was used to generates K=10 sets of bootstrap replicates.Each set consisted of 100,000 replicates of the row sums (10 times the default number of replicates).Default scales parameters were used (r 1 =0.5, r 2 =0.6, r 3 =0.7,r 4 =0.8, r 5 =0.9, r 6 =1, r 7 =1.1, r 8 =1.2 r 9 =1.3, r 10 =1.4), meaning that in the k-th set of replicates, N k = r k N sites were randomly chosen with replacement to calculate the row sums (with N=total number of sites).RAxML was used to compute the persite log likelihoods for all trees.

Phylogenetic informativeness of markers
We initially compared the evolutionary properties of the markers using a Bayesian framework.For each partition of the complete combined datasets (without Gblocks cleaning), we studied base composition, substitution rates and rate variation among sites (α).We also compared rate variation among partitions, considering the parameter m (rate multiplier).
We profiled the phylogenetic informativeness (PI) of the markers, using the method described by Townsend [27] and implemented in the program PhyDesign [65]).This method uses per-site rate estimates to project the utility of a gene for resolving phylogeny across lineage history [27,66].For each partition of a combined dataset, the phylogenetic informativeness of all sites can be summed, which provides the net phylogenetic informativeness (i.e. the degree to which the partition is predicted to contribute to resolution of the phylogeny across history).The informativeness can also be divided by the number of sites, resulting in the phylogenetic informativeness per site.We adopted this later approach in the present study as it allows a comparison of the relative power of genes without the influence of gene length, which may be considered as more accurate [67,68].PI profiles indicate historical epochs during which a partition is most likely to provide phylogenetic signal but do not discount for the misleading effect of homoplasy caused by convergence to the same character state in divergent lineages [27,67,68].Therefore, one must be careful when interpreting PI profiles.When reading PI profiles from tips to root, PI of the marker increases until reaching its PImax.
Once the profile has crested, there are more and more sites that are evolving faster than the optimal rate, which can result in homoplasy ("phylogenetic noise", [68,69].Homoplasy plays a greater effect when the internodes are short.Therefore, when trying to resolve relationships, it is preferable to use character sets with PI profiles peaking deeper in time than the epoch of interest [68]. Bayesian and ML analyses of ClustalW and MAFFT alignments produced similar topologies (see results) and informativeness was calculated using the MAFFT datasets (6 partitions, without Gblocks cleaning, default-Gblocks cleaning, relaxed-Gblocks cleaning).Site rates as inferred by MrBayes (report siterates=yes command) were compiled into PhyDesign formatted files (see PhyDesign FAQ) and used as input rates.Those "vector files" are available upon request.The ultrametric tree was obtained from the MAFFT ML tree (6 partitions) using PATHd8 [70] by arbitrarily setting the root to 1.

Impact of alignment strategy
The final matrix contained 81 sycoecine specimens representing 56 ingroup species and six outgroups.Outgroup ITS2 sequences and Bruchophagus caucasicus, Haltichella rufipes and Megastigmus aculeatus RpL27a sequences were too divergent to be reliably aligned and were consequently excluded from the analyses.No stop codons or frame shifts were detected in the protein coding regions.
Numbers of aligned base pairs, variable sites and parsimony-informative sites for each gene region used in this study are summarized in Table 3. MAFFT and ClustalW alignments of ITS2 and RPL27a resulted in datasets with similar properties (Δ total sites MAFFT/ClustalW = -2 for ITS2, 28 for RPL27a; Δ parsimony informative sites MAFFT/ClustalW = -8 for ITS2, -4 for RPL27a).Removing highly divergent alignment blocks using Gblocks with default parameters dramatically reduced the number of parsimony-informative sites (about 95% loss for ITS2 and between 66% (ClustalW) and 70% (MAFFT) loss for RPL27a).This loss resulted in far less resolved ITS2 and RPL27a trees, when either the MAFFT + Gblocks default or ClustalW + Gblocks default datasets were analysed (Figures S17, S20 and see Table S3 for a comparison of tree resolutions).The Gblocks-relaxed parameters cleaning only slightly affected the resolution of the ITS2 tree and did not affect the resolution of the RPL27a tree (Table S3, Figures S18, S21).
For all partitions the best-fitting model chosen by MrAIC was GTR+I+Γ.We used a discrete gamma approximation [71] with four categories.A total of four analyses were performed per six combined datasets (2 partitioning schemes (5 versus 6 partitions) each analysed using ML and Bayesian methods, Table 4).For all bayesian analyses, after discarding 25% of the samples as burnin, ESS value of each parameter largely exceeded 200, which showed that convergence was reached.Twenty-four combined trees were obtained (Figures S1-S12) and deposited on TreeBase (http://purl.org/phylo/treebase/phylows/study/TB2:S14838). For all datasets, Bayes factor indicated that the most complex partitioning scheme (6 partitions) was preferred over the least complex one (5 partitions) (Table 4).
All trees had well-resolved backbones and three major clades (A, B, C) were identified (Figures 1-4, S1-S12).As  S2).Those tests indicated that the Gblocks-default parameters topologies were all significantly different from the others, though our visual comparison of trees revealed that conflicting nodes had poor supports.The only topological conflict between all 24 topologies with a BP > 70 was the position of Diaziella sp.nov.12 ex Ficus sp.(n° 2989_01) (Figures 3-4, dashed line).Its position was either unresolved (5 trees), as sister to the rest of clade A2 (6 trees, 80 < BP < 92; 0.52 < PP < 1.00) or nested within clade A2 (ClustalW+Gblocks relaxed-6 partitions tree, BP=75, PP=0.89).This uncertainty was probably due to the amount of missing data for this taxon (only 3 markers on 6, Table S1).

Evolutionary properties and phylogenetic informativeness (PI) of markers
Model parameter estimates for each partition of the Bayesian analyses of the MAFFT combined dataset -6 partitions (mean and 95% credibility intervals) are reported in Table 5. Parameter estimates from the analysis of the Clustal W combined dataset (6 partitions) were similar (available upon request).As might be expected, the mtDNA partition showed extreme base compositional bias (69.7% and 96.0% of A/T for mtDNA first + second codon positions and third codon positions respectively).Among the nuclear genes analysed, while RPL27a and mago nashi were A/T-biased (73.6% and 65.4 % respectively), EF-1α and ITS2 show more or less even base composition (50.9% and 55.2% of A/T, respectively).There was a higher rate of A-G and C-T transitions for all partitions (52.6% for ITS2, 64.1% for mtDNA first + second codon positions, 68.8% for RPL27a, 80,3% for EF-1α, 84.9% for mago nashi and 88.8% for mtDNA third codon positions).The mtDNA third codon positions showed a high rate of C-T transitions relative to any other transformations (57.0%).For the nuclear genes, the rates of transitions were close and so were the rates of transversions.The ITS2 rate matrix was the least skewed towards one type of change over another.The gamma shape parameter α was higher for the nuclear genes than for the mtDNA partitions, indicating that nuclear genes show less rate heterogeneity among sites than mitochondrial genes (Table 5).The rate multiplier parameters (m) indicated that rates of substitution were different among partitions.The mtDNA third codon positions evolved about twenty-three times faster than the fastest nuclear gene (ITS2).EF-1α was the most slowly evolving marker.
For the complete dataset (MAFFT without Gblocks cleaning -6 partitions), mtDNA (first+second and third codon positions) showed sharp peaks of informativeness at shallower nodes (Figure 5).Then, the profiles declined showing potential for homoplasy, which is not surprising when considering mtDNA evolutionary properties (Table 5).This prediction is confirmed by the observation of the real performance of the mtDNA marker (i.e. the single-gene tree) in resolving shallower relationships (Figure S13).Interestingly, PI profile suggested that ITS2 was informative across the whole phylogeny.Again, this prediction was confirmed by the observation of the ITS2 tree (Figure S16), on which main clades were recovered.Overall, shapes of the PI profiles correlated well with the corresponding single-gene tree topologies (Figures S13-S16,   S19) and, more precisely, with the distribution of branch lengths across the topologies.The mtDNA topology had longer terminal than internal branches.The deeper in the tree, the shorter the branches, though enough information remained to resolve, at least partially, deeper relationships.Terminal branches of the ITS2 tree were much shorter than those of the mtDNA tree, but longer than those of the RPL27a tree.The deeper branches of the ITS2 tree were longer than those of the RPL27a tree.Neither EF-1α nor mago nashi, the slowest evolving markers contained enough information to resolve relationships within species complexes and sometimes even between species (Figures S14, S15).Our qualitative comparison of single gene tree topologies would lead us to the following ranking in informativeness at shallower nodes : mtDNA > ITS2 > RPL27a > mago nashi ~ EF-1α, which corresponded to the relative position of the PI profile curves.At deeper nodes, this comparison would lead us to rank the markers as follows: ITS2 (longest branches) > RPL27a > mtDNA ~ mago nashi ~ EF-1α, which is again compatible with the PI profiles on Figure 5. mago nashi PI profile curve lays a bit above EF-1α PI profile curve probably because some longer branches do appear in the mago nashi topology (Figure S15).PI profiles were similar, when a relaxed Gblocks cleaning was performed (Figure S22), and, as expected, the informativeness of ITS2 and RPL27a sharply decreased, when default parameters were used to clean the alignment (Figure S23).A bit surprisingly, the default-Gblocks mtDNA PI score was the highest across the whole phylogeny, though RPL27a showed an equivalent resolution at intermediate nodes (Figures S13, S20).

Sycoecinae relationships
As we were interested in relationships at both lower and higher levels, we focussed on trees from the ML and Bayesian analyses of the complete datasets (ie without Gblocks cleaning).We deliberately kept what could be considered the "worst" tree (ie ClustalW + 5 partitions, Figure 2, S1) and the "best" tree (ie MAFFT + 6 partitions, Figures 1, S8) as no one was preferred by the SH/AU tests of alternative topologies.We also wanted to over-emphasize that the few topological differences only occurred between nodes that were either poorly supported in both trees or poorly supported in at least one of the two trees.In the following section, BP and PP values are mentioned as followed: BP ClustalW-tree/BP MAFFT-tree; PP ClustalW-tree/PP MAFFT-tree or are summarized by unique values when no differences between the trees were observed.
The genus Robertsia never clustered with other Sycoecinae species.However, a monophyletic Sycoecinae could not be rejected by both SH and AU tests (Table S2).
Finally, our analyses recovered several cases, where most closely related species were sampled from the same host tree (black boxes in Figures 1 & 2), which could indicate sympatric speciation (Genera Diaziella, Crossogaster, Sycoecus and Philocaenus) .

Impact of alignment and partitioning strategies, accuracy of phylogenetic informativeness profiles
As underlined by Rokas et al. [26], one should be cautious when making generalizations about the taxonomic level at which a given marker might be useful.Genera belonging to different insect families may span a range of evolutionary ages, and hence evolutionary properties and phylogenetic informativeness (PI) of markers should be used as guidelines only [69].Despite the potential problems while applying these properties across different families, we believe that our results may be useful for the designing of future phylogenetic studies between and within chalcidoid families, especially within the pteromalid complex [9].It as been shown that MAFFT outperformed ClustalW on several datasets (e.g.[49]), however, this was not the case for our dataset.We also highlighted that complex partitioning schemes, even favoured by Bayes Factors did not lead to different topologies.Then, between what could be considered the "worst" tree (ie ClustalW + 5 partitions, Figures 2, S1) and the "best" tree (ie MAFFT + 6 partitions, Figures 1, S8) only few topological differences were identified and all of them occurred between nodes that were either poorly supported in both trees or poorly supported in at least one of the two trees.We also showed that using Gblocks with default settings on our alignments dramatically decreased the number of parsimony informative sites (Table 3) and tree resolution.This indicates that one must be cautious when using Gblocks to clean alignments.Gene properties should be studied before Gblocks refinement and selection of blocks should be used only when focussing on relationships at high level, especially when an important number of markers are concatenated [72].
Overall, shapes of the phylogenetic informativeness (PI) profiles (Figure 5) correlated well with the distribution of branch lengths on the corresponding single-gene tree topologies (Figures S13-S21).It is noteworthy that ITS2 contained information spanning the whole tree and mtDNA did contain signal to contribute resolving shallower but also deeper nodes of the phylogeny.It was surprising that the mtDNA PI score was still the highest of any gene at the root of our phylogeny when we profiled the informativeness of each partition of the MAFFT-Gblocks default dataset (Figure S23).Klopstein et al. [67] suggested that PI could be biased towards fast rates (due to the whole set-up of the measure).This could explain the relative position of the curves on the MAFFT-Gblocks default PI profiles.Indeed, for this dataset, mean values of m (rate multiplier) scores for the two partitions: mtDNA-third codon (mtDNAnt3) and mtDNA-first + second codon (mtDNAnt1&2) positions were the highest of any partitions (Table 6).Nevertheless, the mtDNA PI score for the MAFFT dataset was not the highest of any gene at the root of our phylogeny (Figure 5), though the rate multiplier of the mtDNAnt3 partition was the highest of any other partitions (Table 5).Moreover, the mean values of m scores for the ITS2 and RPL27a partitions were higher than the mean value of m score of the mtDNAnt1&2 partition, though the mtDNAnt1&2 PI score was higher than ITS2 and RPL27a PI scores at shallower and intermediate nodes (Figure 5).

Monophyly of the sycoecine genera and new classification 2.1. Robertsia (clade A).
The genus Robertsia is morphologically very distinct from the other sycoecine genera (Table 1) and is the only Sycoecinae genus with apterous males [6,24].Our study shows that taxonomic affinities of Robertsia are ambiguous.However, no definitive conclusion can be drawn from our results.Further studies, including more representatives of the genus and New Guinean pteromalid genera are required.These studies could for example discard a possible long-branch attraction artifact between Robertsia and the outgroup species used in the present study or a negative impact of missing data (only ITS2, RPL27a and mago nashi could be obtained for this taxa).

Diaziella (clade A).
Diaziella is strongly supported as monophyletic across all analyses.This is also well defined morphologically [22].All Diaziella species are associated with Ficus species occurring in Oriental tropical forests and belonging to the subsection Conosycea (Table 1, Figures 1 & 2).Our analyses highlighted two species-groups within the genus (A1 and A2).The species associated with the basal most Conosycea fig trees Ficus glaberrima, F. curtipes and F. lawesii [44,73] clustered in a strongly supported clade (clade A1).These species are characterized by a metallic tinge on the female thorax.Within the clade A2, two small radiations of eight and three species are hosted respectively by F. sumatrana and F. sundaica.Females of these species are characterized by a non-metallic thorax (black or yellow).

Crossogaster (clade B).
Crossogaster is recovered as monophyletic with strong support in all the analyses, a result also supported by morphology (Table 1).Crossogaster species are associated with four of the six Galoglychia subsections namely Platyphyllae, Chlamydodorae, Crassicostae and Caulocarpae that occur in afrotropical forests and savannas (Figures 1 & 2, Table 1).
Crossogaster splits into two well supported clades (B1 and B2).Based on morphological interpretation, van Noort (1994a) split Crossogaster into two monophyletic species-groups and excluded three unassigned basal species.Both the C. odorans species-group (characterised by the presence of elongate multiporous plate sensilla (MPS) on five antennal funicle segments), and the C. triformis species-group (characterised by the presence of placoid MPS on the usually four antennal funicle segments) are included in our clade B2, but each group is not supported suggesting that MPS appearance and number of funicular segments is homoplasious.Our clade B1 corresponds with two of the species exhibiting plesiomorphic characters, C. michaloudi and C. lachaisei (Crossogaster sp.nov 5 is closely related to C. lachaisei).The third species exhibiting plesiomorphic characters, C. inusitata, is a basal representative included in clade B2.However, more representative taxa need to be included to redefine the Crossogaster species-groups.
Interestingly, Crossogaster odorans appeared to cluster into several subclades, which is not surprising since Crossogaster odorans was described from specimens associated with F. burkei.Van Noort (1994a) acknowledged that C. odorans probably comprised a species complex and discussed the associated morphological variation.

A new genus basal to Seres + Sycoecus (clade C1
).Since this clade is strongly supported with a long-branch length and is characterized by strong morphological synapomorphies, it should represent a new genus that will be described elsewhere (van Noort & Rasplus, in prep).Species belonging to clade C1 exhibit a unique plate of teeth on the fore tibia of females and a metallic green head in males.Seres wardi was first identified as a distinct member of Seres based on morphology of both sexes [15].Together with another distinct member (Seres longicalcar), S. wardi was placed in the genus Seres based on the perceived synapomorphic positioning of the propodeal spiracles [15], which now appears to have been derived independently on several occasions rendering Seres as currently recognised to be a polyphyletic assemblage.Species belonging to this clade are only associated with fig trees of the section Galoglychia subsection Caulocarpae, all of them occurring in afrotropical rainforests (Figures 1 & 2).

Sycoecus (clade C2
). Sycoecus is recovered as a strongly supported monophyletic clade across all the analyses, confirming the morphological delimitation of this genus that exhibits striking apomorphies (Table 1 We give a diagnosis of Seres in its new delimitation, and complete list of Seres species in Appendix S1.In summary, the genus is characterized morphologically by 1) an eighth urotergite spiracle not enlarged 2) the presence of two labial and three maxillary palp segments, 3) ventral tentorial pits in close apposition, 4) forewing marginal vein thickened 5) the outer tooth of male mandibles longer than the inner mandible.
Molecularly, the genus Seres in its new sense comprises five distinct species groups (C4.1-C4.5)that largely but not always exactly correspond to the species groups previously defined on morphological characters.A detailed study about Seres species groups is in preparation and will be published elsewhere.

Evolutionary implications of the relationships among the Sycoecinae
As mentioned above, all trees have well-resolved backbones, which indicates stability of higher-level relationships to different alignment strategies (Figures 1-4, S1-S12).Based on this robust phylogenetic hypothesis of relationships among the Sycoecinae, we give preliminary results regarding the evolutionary history of the group.
3.1.High degree of morphological convergence.The genera Crossogaster and Seres (sensu nov.) are morphologically similar (head and body shapes) and in a morphological phylogenetic assessment were recovered as sister taxa due to perceived synapomorphies [74] that we here show are likely to represent homoplasious convergence.Many of the species in both genera have a single comb of backward pointing teeth on the mandible, and a single comb of teeth on the fore tibia.Crossogaster and Seres species are often found on the same host species, particularly within the subsections Platyphyllae and Chlamydodorae.Contrary to the conclusions based on the morphological analysis of the Sycoecinae, our molecular phylogenies show that Seres and Crossogaster are not each other's closest relative.Therefore, the morphological similarity of the two lineages is probably the product of convergent evolution.Indeed, both lineages have probably evolved similar adaptations under identical selection pressures due to the constraints of internal oviposition in the same host.This has already been demonstrated for the sycoecines and their associated pollinating fig wasps, for which head shape (calculated as the ratio of head width to head length) and fresh fig diameter of host fig trees were correlated [14].As mentioned above, morphological convergences are also common between the species of both genera confounding the ability to tease apart phylogenetic relationships based on morphology alone.

Several cases of probable speciation on a single host fig species.
Interestingly, our analyses show that speciation on a single host Ficus species has probably occurred recurrently in different sycoecine genera.Ficus sumatrana and F. sundaica host eight and three related Diaziella species respectively (Figures 1 & 2).This is the first time so many congeneric species of non-pollinating fig wasps (8) have been collected on the same tree and recorded from the same host fig species.All these species were morphologically clearly differentiated based on the head shape, antennae structure, wing patterns and mandible shape (pers.obs.SvN & JYR).It is noteworthy that all these species were collected from fig syconia that contained very few pollinators.Ficus sumatrana is passively pollinated and Diaziella species enter the fig covered with pollen.Given the successful reproduction in fig syconia with low pollinator incidence, it suggests that at least some of the Diaziella species may play a role in the pollination of their host, as previously reported about other sycoecine species [13].
Although less spectacular than the small radiation of Diaziella species on F. sumatrana, our results highlight several cases of apparent speciation on a single host fig species in the genera Crossogaster (twice), Sycoecus (once) and Seres (once) (Figures 1 & 2. Black boxes).Again, in all cases, the species were clearly differentiable based on morphology (pers.obs.SvN & JYR).It is noteworthy that outside our dataset, all known Robertsia species (4) have only ever been reared from Ficus xylosycia (Malvanthera) [24].
As for pollinators [1,5], sycoecine generation time is by far shorter than fig generation time, which could explain why sycoecine speciation by duplication is so common.A possible scenario would involve divergence between sycoecine species isolated on different populations of their host fig tree (e.g. by forest fragmentation).When secondary contact occurs, while fig trees can still interbreed, reproductive incompatibilities have evolved between sycoecine species precluding mating of species.As a consequence, sycoecine species co-occur on the same host tree [5].Another scenario would be that co-occuring sycoecine species result from ecological speciation [75].For example, the two species collected in the same F. chirindensis figs in Kenya (2968_01 and 2968_02) are clearly differentiable based on morphology (antennae structure, mandible shape) and coloration: one species is pale (2968_01, sp nov.2) and the other is dark (2968_02, sp.nov.3).As pale wasp species are attracted to light at night (see 76-79 for the Agaonidae and [21] for the Sycoecinae), we could hypothesize that the first species exploit pools of F. chirindensis figs attractive at night whereas the dark species exploit pools of figs attractive at day.It has been shown that co-occuring agaonid species may have different longevity and ability to resist desiccation [80].This could be also hypothesized for sycoecines.Wasps with longer life span being more likely to disperse over longer distances [81].Explaining the co-occurrence of eight Diaziella species on F. sumatrana is more puzzling, especially since the species do not exhibit morphological differences that could be linked to the exploitation of different niches (e.g.ovipositor length [82] or color).As mentioned above, one possibility would be that all the species have evolved in allopatry, in different forest fragments.Another explanation could be that F. sumatrana is a complex of sympatric species that share representatives of the different Diaziella species, all the species being sampled by chance on one F. sumatrana tree.Further studies are needed to understand which mechanisms have been involved.

3.3.
Comparison with available Ficus host phylogenies.Although current taxonomy of the Sycoecinae supports a degree of host specificity between the wasps and their Ficus hosts, our study reveals many examples of breakdown in host-specificity.For instance, we frequently observe the association of more than one sycoecine (often belonging to several genera) with one Ficus species (e.g.Crossogaster and Seres species Figs. 1 & 2).Furthermore, a single sycoecine species can be associated with more than one Ficus species (e.g. S. medius, S. liodontus).Such complex associations imply that events such as host-switches and speciation on host are common incidents in the evolutionary history of these independent lineages.
Ostiolar morphology prevents entry into the fig cavity for wasps that are not specifically adapted [14,83,84].Externaly ovipositing fig wasps do not have to conform to the morphological adaptations required to enter the fig cavity through a host-specific ostiole.Therefore, internally ovipositing wasps are thought to be highly host specific and less likely to experience host shifts than the externally ovipositing wasps [1,5,46,85].However, these ideas are still contentious.Recent studies have shown that some externally ovipositing fig wasps may be highly host specific [5,[86][87][88] and could have cospeciated with their host figs [86,87] but see 89,90.Overall, we lack extensive data on non-pollinating fig wasps host specificity and further cophylogenetic studies on representative samplings of both figs and wasps are required.Our results strongly suggest that strict cospeciation has not shaped the evolutionary history of both sycoecines and their host Ficus (Figure 6).It appears that the constraints of internal oviposition may not be enough to prevent host-switching events.However, further studies are needed to uncover the fine evolutionary history of the partners.These should especially focus attention on species that are associated with numerous fig hosts, the majority of which fall within Ficus subsection Chlamydodorae.

Conclusion
Prior to this study, elucidation of the taxonomic relationships within the Sycoecinae through molecular phylogenetic analyses had not been attempted.The combined analysis of mitochondrial and nuclear gene regions resulted in a robust phylogenetic hypothesis of relationships among the Sycoecinae, validated by our morphological observations.This study contributes to the global effort to better understand how world fig wasp communities have been structured through space and time.With Agaonidae (Ficus pollinators, [44,91]), Sycophaginae [45,88] and Sycoryctinae [92], Sycoecinae is the fourth fig wasp group for which a worldwide phylogeny is now available.

Figure 1 .
Figure 1.Phylogram of relationships among the Sycoecinae obtained from the analysis of the MAFFT alignment (combined dataset, without Gblocks cleaning, 6 partitions: mtDNAcodon1&2, mtDNAcodon3, EF-1α, ITS2, RpL27a, mago nashi).Uppercase letters refer to clades discussed in the text.The new classification is indicated by colored bars on the right (yellow = oriental species, blue = afrotropical species).Nodes with likelihood bootstrap (BP) values < 70 have been collapsed.BP (> 70) and Bayesian posterior probabilities (> 0.90) are indicated at nodes.Illustrations of female habitus for the main clades are provided on the right.Host fig tree subsections are indicated between parentheses.Black boxes at nodes show cases of probable speciation on a single host Ficus species.

Figure 2 .
Figure 2. Phylogram of relationships among the Sycoecinae obtained from the analysis of the ClustalW alignment (combined dataset, without Gblocks cleaning, 5 partitions: mtDNA, EF-1α, ITS2, RpL27a, mago nashi).Uppercase letters refer to clades discussed in the text.The new classification is indicated by colored bars on the right (yellow = oriental species, blue = afrotropical species).Nodes with likelihood bootstrap values < 70 have been collapsed.BP (> 70) and Bayesian posterior probabilities (> 0.90) are indicated at nodes.Illustrations of female habitus for the main clades are provided on the right.Host fig tree subsections are indicated between parentheses.Black boxes at nodes show cases of probable speciation on a single host Ficus species.doi: 10.1371/journal.pone.0079291.g002

Figure 3 .
Figure 3. Cladograms of relationships among the Sycoecinae obtained from the ClustalW alignment of the combined dataset under three different alignment strategies and two partitioning schemes.The corresponding ML and Bayesian trees are given in Figures 2, S1-S6.Nodes with BP support < 70% have been collapsed and BP supports for main clades are indicated at nodes.Uppercase letters refer to clades discussed in the text (see also Figure 2).The dashed line indicates the only taxon with a conflicting position among trees (see text).doi: 10.1371/journal.pone.0079291.g003

Figure 4 .
Figure 4. Cladograms of relationships among the Sycoecinae obtained from the MAFFT alignment of the combined dataset under three different alignment strategies and two partitioning schemes.The corresponding ML and Bayesian trees are given in Figures 1, S7-S12.Nodes with BP support < 70% have been collapsed and BP supports for main clades are indicated at nodes.Uppercase letters refer to clades discussed in the text (see also Figure 1).The dashed line indicates the only taxon with a conflicting position among trees (see text).doi: 10.1371/journal.pone.0079291.g004

Figure S13 .
Figure S13.Tree from the ML analysis of the mitochondrial partition.Likelihood bootstrap values are indicated at nodes (1000 replicates).(PDF) Figure S14.Tree from the ML analysis of the EF-1α gene region.Likelihood bootstrap values are indicated at nodes (1000 replicates).(PDF) Figure S15.Tree from the ML analysis of the mago nashi gene region.Likelihood bootstrap values are indicated at nodes (1000 replicates).(PDF) Figure S16.Tree from the ML analysis of the ITS2 gene region (ClustalW alignment).Likelihood bootstrap values are indicated at nodes (1000 replicates).(PDF) Figure S17.Tree from the ML analysis of the ITS2 gene region (ClustalW alignment + Gblocks default parameters).Likelihood bootstrap values are indicated at nodes (1000 replicates).(PDF) Figure S18.Tree from the ML analysis of the ITS2 gene region (ClustalW alignment + Gblocks relaxed parameters).Likelihood bootstrap values are indicated at nodes (1000 replicates).(PDF) Figure S19.Tree from the ML analysis of the RpL27a gene region (ClustalW alignment).Likelihood bootstrap values are indicated at nodes (1000 replicates).(PDF)

Table 1 .
Type species, diversity, distribution, host fig tree groups and diagnostic characters for the six previously defined genera of Sycoecinae.

Table 2 )
Permits for the collection of specimen in Sulawesi were obtained from the Research Center for Biology LIPI (No : 3180/SU.3/KS/2007);Specimen collection in Taiwan was funded by the ANR project BioFigs/National Science Council, Taiwan, R. O. C. code: 98WFA0100291).Wasps were collected by sampling figs containing either adults or developing wasp larvae.The figs, containing wasps no more than a few days short of their emergence, were placed in handmade wasp-rearing chambers.Once emerged, adult wasps were killed and preserved in 95% ethanol.While the relationships between the chalcidoid subfamilies remain controversial, closer and more distant relatives were included as outgroups [9,10,35].Six species belonging to the genera .Each time destructive extraction was used, vouchers were selected among specimens sampled from the same fig tree and the same fig after careful identification.Vouchers are deposited at CBGP, Montferrier-sur-Lez, France and Iziko South African Museum, Cape Town (SAMC

Table 2 .
List of Sycoecinae and outgroup species included in this study: voucher numbers and depository, taxonomic information, host Ficus species and locality data.

Table 3 .
Numbers and percentages of aligned base pairs, variable sites and parsimony-informative sites for the gene regions used in this study.

Table 4 .
Arithmetic and harmonic means (lnL) for trees obtained in Bayesian analyses based on alternative alignment and partitioning strategies.
*. excluding branch length and topology parameters.Given that all parameters are unlinked among partitions, the number of free parameters of the composite model is the sum of the free parameters of its submodels.For all partitions the best-fitting model chosen by MrAIC was GTR+I+Γ (10 free parameters).**BF 1/0 = 2*(ln L1 -ln L0) + (P1-P0) * ln (0.01) where ln Li and Pi are respectively, the harmonic mean of the ln likelihoods and the number of free parameters of model i.BF values from 2 to 6 were considered positive evidence, from 6 to 10 as strong evidence, and > 10 as very strong evidence favouring the alternative hypothesis over the null hypothesis.doi: 10.1371/journal.pone.0079291.t004

Table 6 .
Rate multiplier values (m) for each partitions included in the Bayesian analyses of the MAFFT + Gblocks default parameters and MAFFT + Gblocks relaxed parameters combined datasets (6 partitions).

6. A new genus associated with subsection Crassicostae ? (lineage C3).
).All species of Sycoecus are associated with fig trees from the subsection Cyathistipulae that grow in the evergreen forests of Central, West and East Africa (Figures 1 & 2, Table1).2.Unfortunately, we were able to include only the one known Philocaneus species associated with the subsection Crassicostae, a subsection with a species distribution centred in the poorly sampled Congo basin.The remaining seven species in this subsection have not yet had their associated fig wasp faunal assemblages sampled or if sampled (F.louisii and F. elasticoides) potential Philocaenus species were not reared.Our results show that the phylogenetic position of Philocaenus sp.nov.ex.F.usambarensis is ambiguous (Figures1 & 2, C3).Although it is recovered as sister to all other Philocaenus species, this relationship is not supported.Morphologically this species is also exceptional.The mandibles are very different and the ovipositor is exceptionally long for Philocaenus being longer than half of the metasomal length.There are a total of eight Ficus species in subsection Crassicostae.Given the host relationships of Philocaenus, this subsection may potentially host further related Philocaenus species.If inclusion of these species in the phylogeny results in a monophyletic clade, erection of a new genus may be warranted.However, for the time being, we prefer to keep this species as part of the genus Seres (in its extended delimitation, see here under).2.