New gSSR and EST-SSR markers reveal high genetic diversity in the invasive plant Ambrosia artemisiifolia L. and can be transferred to other invasive Ambrosia species

Ambrosia artemisiifolia L. (common ragweed) is an annual, invasive and highly troublesome plant species originating from North America that has become widespread across Europe. New sets of genomic and expressed sequence tag (EST)-based simple sequence repeat (SSR) markers were developed in this species using three approaches. After validation, 13 genomic SSRs and 13 EST-SSRs were retained and used to characterize the genetic diversity and population genetic structure of Ambrosia artemisiifolia populations from the native (North America) and invasive (Europe) ranges of the species. Analysis of the mating system based on maternal families did not reveal any departure from complete allogamy, and excess homozygosity was mostly due to the presence of null alleles. High genetic diversity and patterns of genetic structure in Europe suggest two main introduction events followed by secondary colonization events. Cross-species transferability of the newly developed markers to other invasive species of the Ambrosia genus was assessed. Sixty-five percent and 75% of markers were transferable from A. artemisiifolia to Ambrosia psilostachya and Ambrosia tenuifolia, respectively. Forty percent were transferable to Ambrosia trifida, this latter species apparently being more distantly related to A. artemisiifolia than the former two.


Introduction
Inferring recent demographic history and contemporary evolutionary processes are major goals in the field of population genetics. Climate change, human disturbances of natural habitats and human-aided dispersal can cause dramatic shifts in the distributions of natural species, and biological invasions are increasingly prevalent worldwide. Analyzing the genetic diversity and population genetic structure of native and introduced populations of an invasive species makes it possible to reconstruct pathways of invasion and to identify founding events and/or admixture events among invasive populations. All these processes affect the demographic success and future expansion of the invasive species and determine its potential for adaptation to new environmental conditions. Understanding them is invaluable for devising appropriate management strategies [1,2]. The molecular markers currently used for population genetics studies are essentially of two kinds: genome-wide Single Nucleotide Polymorphisms (SNPs) identified by next-generation sequencing-based techniques such as Restriction site Associated DNA sequencing (RAD-seq) or genotyping-by-sequencing [3], and Simple Sequence Repeat (SSR) markers (microsatellites). Some limitations of SSR markers are their low density throughout the genome, complex mutational patterns and the possible presence of homoplasy and null alleles [4]. However, SSR markers are easy to score, highly polymorphic and thus highly informative, and the theory and practice of SSR marker analysis and their associated biases are well known [5], making them still the markers of choice for ecological and evolutionary studies. In comparison to SNPs, SSRs are especially well suited for analyzing processes occurring at small temporal or spatial scales and have proven highly relevant for revealing recent expansion and recent admixture or for analyzing parentage and kinship [5][6][7]. Next-generation sequencing technologies now allow the rapid development of large sets of SSRs [8].
In addition, the ever-increasing availability of transcriptome sequences (Expressed Sequence Tags, EST) in public databases enables fast and cost-effective development of genic SSR markers (EST-SSRs). EST-SSRs are expected to be less polymorphic than genomic SSRs (gSSRs) but also to display fewer null alleles and be more transferable among related species [9,10]. As their polymorphism may be influenced by selective processes, EST-SSRs may reveal somewhat different genetic patterns than gSSRs [11].
The genus Ambrosia in the Asteraceae family includes at least 51 species collectively known as "ragweeds" and mainly distributed in North America [12]. Four species (A. artemisiifolia L., A. trifida L., A. psilostachya D.C. and A. tenuifolia Spreng.) occur in Europe but are native to America [13,14]. A. artemisiifolia is an annual herb mostly known as a successful invasive and a highly allergenic plant causing severe rhinitis and asthma [15,16]. It was introduced into Europe in the 19th century through the import of contaminated grain and forage [17]. A. artemisiifolia has colonized different types of habitats such as railways, riversides and wastelands, as well as cultivated fields where it is now a noxious weed competing with several summer crops [17]. To investigate the population genetic structure in A. artemisiifolia, reliable and polymorphic molecular markers are needed. To date, only a few gSSR markers have been developed from French A. artemisiifolia populations [18,19]. These few gSSRs were used to assess the population genetic structure and patterns of colonization across continental and regional scales in Europe [20][21][22][23][24][25], North America [26] and China [27]. In addition, most of the gSSR markers available showed PCR amplification failures and an excess of homozygous genotypes [20][21][22][23][24][25][26]. Excess homozygosity can be caused by the presence of null alleles resulting from mutations at primer binding sites that preclude PCR amplification. Alternatively, excess homozygosity has sometimes been interpreted as evidence for partial selfing in a mostly outcrossing species. This issue was highly debated in several SSR-based population genetics studies conducted on A. artemisiifolia [20][21][22][23].
The present study had three purposes: (a) develop new nuclear SSR markers for A. artemisiifolia following three different approaches (whole-genome enrichment followed by 454 sequencing, whole-genome Illumina sequencing, and use of existing EST databases), (b) investigate the genetic diversity, population structure and mating system of A. artemisiifolia using populations sampled in North America and Europe, and (c) assess marker transferability to A. trifida, A. psilostachya and A. tenuifolia.

Plant material
A total of 321 A. artemisiifolia individuals were sampled from 11 populations spanning the invasive range in Europe and 5 populations in North America (Table 1, Fig 1). Twenty individuals were sampled from two populations of A. trifida, 22 individuals from one population of A. psilostachya and 21 individuals from one population of A. tenuifolia. A 0.2-cm² leaf section was collected from each individual and DNA was extracted as described in [28]. All species studied are alien invasive, non-protected species. Sampling locations were not within protected areas, so no specific permission was required. Ambrosia artemisiifolia is described as a diploid species (2n = 36, [13,29,30]). As the presence of triploid plants has sometimes been questioned [26], we counted nuclear chromosomes as described in [31] in 10 plants randomly chosen from one French population. Results were in agreement with diploidy, with 2n = 36 (S1 Fig). Ambrosia trifida is a diploid species with a different basic chromosome number (2n = 24) [29,30], while A. psilostachya and A. tenuifolia have the same basic chromosome number as A. artemisiifolia but variable ploidy levels [13,14,29,32].

Development of new nuclear SSR markers for A. artemisiifolia
Obtaining sequence data. For the SSR-enriched gDNA library approach, total gDNA from 8 A. artemisiifolia individuals was isolated using the DNeasy Plant Mini Kit (QIAGEN, Valencia, California, USA) and processed by GenoScreen (Lille, France). An SSR-enriched DNA library was obtained as described in [33]. Briefly, total DNA was mechanically fragmented and enriched for AG, AC, AAC, AAG, AGG, ACG, ACAT and ATCT repeat motifs. For the approach based on public EST data, the two existing sets of A. artemisiifolia transcriptome 454 sequence data were downloaded from the GenBank Sequence Read Archive [34]. They correspond to one individual sampled in the USA (accession SRX096892) and one sampled in Hungary (accession SRX098769). Both datasets were merged before analysis.
For each of the three sequence datasets, stringent sequence quality control and filtering were performed using the ShortRead package from Bioconductor [35]. Briefly, read ends were first trimmed by quality scores. Only sequences longer than 300 bp (454 reads) or 200 bp (Illumina reads) with a mean Phred quality score higher than 30 and less than 1% Ns were retained. Exact sequence duplicates were discarded. In the Illumina dataset, only matching paired-end reads were kept after quality filtering and overlapping reads were merged using FLASH [36]. Detection of SSR motifs was conducted on the merged reads only, ensuring that the flanking regions were long enough to design good-quality primers.
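The filtering thresholds above can be illustrated with a minimal Python sketch. It is not the ShortRead workflow used in the study: the function names, the in-memory read representation and the assumption that reads have already been end-trimmed are all hypothetical, and the quality threshold is interpreted as a mean Phred score of 30.

```python
from typing import Iterable, List, Tuple

Read = Tuple[str, List[int]]  # (sequence, per-base Phred quality scores)


def passes_filters(seq: str, quals: List[int], min_len: int,
                   min_mean_q: float = 30.0, max_n_frac: float = 0.01) -> bool:
    """Return True if a trimmed read meets the length, quality and N-content thresholds."""
    if len(seq) <= min_len:  # the text keeps only reads *longer than* the cutoff
        return False
    if sum(quals) / len(quals) <= min_mean_q:  # mean Phred must exceed 30
        return False
    if seq.upper().count("N") / len(seq) >= max_n_frac:  # fewer than 1% Ns
        return False
    return True


def filter_reads(reads: Iterable[Read], min_len: int) -> List[Read]:
    """Apply the thresholds and discard exact sequence duplicates (first copy kept)."""
    seen = set()
    kept = []
    for seq, quals in reads:
        if seq in seen or not passes_filters(seq, quals, min_len):
            continue
        seen.add(seq)
        kept.append((seq, quals))
    return kept
```

Under these assumptions, `filter_reads(reads, 300)` would mimic the 454 cutoff and `filter_reads(reads, 200)` the Illumina cutoff.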
SSR identification and primer design. SSRs were identified with QDD version 3.1 [37]. Only 2- to 6-nucleotide motifs were considered. The minimum number of repeat units was set to eight for di-nucleotides, six for tri-nucleotides, and five for longer motifs. Expected amplicon sizes were constrained to a 100-300 bp range. Primer pairs were thoroughly tested for clear, stable amplification on 12 A. artemisiifolia individuals from three populations (one French population from the Rhône Valley, the German population DOM and the American population KEN). PCRs were performed in 10-μL reactions as previously described [28]. Cycling parameters consisted of a first denaturation step (2 min at 95˚C) followed by 39 cycles of 5 s at 95˚C, 10 s at 60˚C and 30 s at 72˚C. Amplicons were visualised by electrophoresing five microliters of PCR product on 3% (wt/vol) agarose gels run for 25 min at 100 V in Tris-Borate EDTA buffer.
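As an illustration of the repeat thresholds given above (eight units for di-nucleotides, six for tri-nucleotides, five for tetra- to hexa-nucleotides), the following Python sketch scans a sequence for perfect SSRs with regular expressions. It is a simplified stand-in, not QDD's own detection algorithm; `find_ssrs` and `is_primitive` are hypothetical names, and only perfect, uninterrupted repeats are reported.

```python
import re
from typing import List, Tuple

# Minimum number of repeat units per motif length, as stated in the text.
MIN_REPEATS = {2: 8, 3: 6, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT' or 'AAA')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq: str) -> List[Tuple[int, str, int]]:
    """Return (start, motif, number_of_repeats) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One captured motif of length k followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

Requiring a primitive motif avoids reporting a homopolymer run as a di-nucleotide SSR, or a di-nucleotide repeat a second time as a tetra-nucleotide one.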

SSR marker validation and assessment of genetic polymorphism in A. artemisiifolia
Genotyping. SSRs successfully amplifying in A. artemisiifolia were used to genotype 384 individuals, including 321 A. artemisiifolia (16 populations), 20 A. trifida (two populations), 22 A. psilostachya (one population) and 21 A. tenuifolia (one population) individuals (Table 1). Genotyping was performed at GENTYANE (INRA, Clermont-Ferrand, France). PCR products were labelled with one fluorescent tag (6-FAM, NED, VIC or PET) and loaded on an ABI 3730XL capillary DNA analyzer (Applied Biosystems) with the size standard GS500 LIZ. Peakscanner version 1.0 (Applied Biosystems) and the R package MsatAllele were used to read allele sizes [38]. A Principal Component Analysis (PCA) was performed on genotype data using the package adegenet [39] in R 3.1.2 in order to examine the genetic relationships among the four species studied.
Check for null alleles. MicroChecker 2.2.0.3 was used to check for the presence of null alleles and scoring errors due to stuttering and large allele dropout for each marker in each A. artemisiifolia population [40]. The markers showing the overall lowest occurrence of null alleles and stuttering were retained for further analyses. Frequencies of null alleles at the retained loci in each population were estimated using INEST 2.1 [41].
F_ST outlier tests. All SSR loci were screened for evidence of selection based on an F_ST outlier test that identifies loci with an F_ST value unexpectedly high (diversifying selection) or unexpectedly low (balancing or purifying selection). We used data from A. artemisiifolia and the software BayeScan [42]. This program implements a Bayesian method based on a multinomial Dirichlet distribution for allele frequencies. The Dirichlet distribution holds under a variety of demographic models when populations derive from a common gene pool. As a recent range expansion has been shown to increase the proportion of falsely detected selection events [43], we used a conservative prior value of 100 for the 'odds of neutrality' (i.e., a prior expectation that roughly 1 locus out of 100 is under selection). For each locus, the probability of selection was examined based on the relative posterior probabilities of the models with and without selection. We implemented 20 pilot runs of 5,000 iterations, a burn-in period of 50,000 iterations and 100,000 subsequent iterations with a sample size of 5,000 and a thinning interval of 20.
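The role of the prior odds and the chain settings can be made explicit with a short Python sketch. It only reproduces the arithmetic implied by the text and does not call BayeScan; the function names are hypothetical, and the Bayes-factor relation is the standard one for comparing two models.

```python
def prior_prob_selection(prior_odds_neutral: float) -> float:
    """Prior probability of the selection model when the neutral model is favoured by the given odds."""
    return 1.0 / (1.0 + prior_odds_neutral)


def posterior_prob_selection(bayes_factor: float, prior_odds_neutral: float) -> float:
    """Posterior probability of selection from a Bayes factor (selection vs neutral) and the prior odds."""
    posterior_odds = bayes_factor / prior_odds_neutral
    return posterior_odds / (1.0 + posterior_odds)


# Chain settings quoted in the text.
PILOT_RUNS, PILOT_LENGTH = 20, 5_000
BURN_IN = 50_000
SAMPLE_SIZE, THINNING = 5_000, 20

sampling_iterations = SAMPLE_SIZE * THINNING  # iterations after burn-in
total_iterations = PILOT_RUNS * PILOT_LENGTH + BURN_IN + sampling_iterations
```

With prior odds of 100, a locus needs a Bayes factor of 100 in favour of selection just to reach a posterior probability of 0.5, which is why this prior is conservative.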

Estimation of the mating system in A. artemisiifolia
The mating system of A. artemisiifolia was investigated using five gSSR markers (SSR10, SSR17, SSR47, SSR71 and SSR73) in six additional French populations sampled in 2014 and located within a few kilometres of population GEN13.03 (Table 1). These gSSR markers showed fewer null alleles than the others. Leaf tissue and mature seeds were collected from six to eight mother plants per population. Eight to 16 progeny plants per mother plant were genotyped, yielding a total of 614 individuals. MLTR [44] was used to estimate the multi-locus outcrossing rate tm, the maternal inbreeding coefficient F, the outcrossing rate between related individuals tm-ts and the correlation of paternity rp.

Genetic diversity and inbreeding
The allelic richness per locus and per population estimated using a rarefaction method (A), the expected heterozygosity (H_S) and the genetic differentiation (F_ST) were calculated using Fstat [45]. Significance of F_ST values was based on 1,000 bootstrap resamplings over loci. Inbreeding coefficients (F_IS) were estimated with INEST 2.1 [41] using a Bayesian procedure robust to the presence of null alleles. To assess the statistical significance of inbreeding, we compared the model with inbreeding with the random mating model (F_IS = 0) based on the Deviance Information Criterion (DIC). Genetic diversity and differentiation parameters for A. artemisiifolia were calculated over all populations, over North American populations and over European populations.
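For readers unfamiliar with these statistics, the sketch below shows how expected heterozygosity and rarefied allelic richness can be computed for one locus in one population. It is an illustration rather than the Fstat implementation: the heterozygosity is the simple 1 − Σp² form without small-sample correction, the rarefaction follows the usual hypergeometric formula, and the function names are hypothetical.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def expected_heterozygosity(alleles: Sequence[Hashable]) -> float:
    """1 - sum of squared allele frequencies, from a flat list of gene copies."""
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def rarefied_allelic_richness(alleles: Sequence[Hashable], n_genes: int) -> float:
    """Expected number of distinct alleles in a random subsample of n_genes gene copies."""
    total = len(alleles)
    if n_genes > total:
        raise ValueError("rarefaction size exceeds the number of sampled gene copies")
    counts = Counter(alleles)
    # For each allele: probability that at least one copy appears in the subsample.
    return sum(1.0 - comb(total - c, n_genes) / comb(total, n_genes)
               for c in counts.values())
```

For diploid genotypes, a rarefaction to eight individuals would correspond to `n_genes=16` under this formulation.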

A. artemisiifolia population structure
Population structure was assessed using Structure 2.2 [46]. The admixture model and correlated allele frequencies between populations were selected as specified in [47] to determine the number of genetic clusters (K) best fitting the data. The burn-in period was 100,000 iterations, followed by 500,000 Markov Chain Monte Carlo iterations. Ten independent runs were performed for each value of K from 1 to 15. K was determined graphically based on log likelihood values as previously described [46] using the web-based program Structure Harvester [48]. In addition, the ΔK method [49] was used to determine the best value of K. Finally, Clumpp 1.1.2 [50] and R 3.1.2 were used to produce graphical outputs for the inferred population structure.
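The ΔK criterion can be sketched in a few lines of Python. This follows one common formulation, in which the second-order change in the mean log likelihood is divided by the standard deviation of the log likelihood across replicate runs at K; the function name and input layout are hypothetical, and Structure Harvester remains the tool actually used here.

```python
from statistics import mean, stdev
from typing import Dict, List


def delta_k(lnp: Dict[int, List[float]]) -> Dict[int, float]:
    """Compute Delta K from replicate log likelihoods, keyed by K.

    Delta K(K) = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K)).
    It is undefined for the smallest and largest K tested.
    """
    result = {}
    for k in sorted(lnp):
        if (k - 1) not in lnp or (k + 1) not in lnp:
            continue
        sd = stdev(lnp[k])
        if sd == 0:
            continue  # Delta K is undefined when replicate runs are identical
        second_diff = abs(mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1]))
        result[k] = second_diff / sd
    return result
```

The value of K with the largest ΔK, for example `max(dk, key=dk.get)`, is then taken as the most likely number of clusters.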

Genetic divergence and bottleneck tests
Genetic divergence is likely to vary across populations because of differences in effective population sizes and local migration rates. This is especially the case when recent founder effects have occurred, such as during range expansions. Patterns of genetic divergence were estimated for the invasive range (Europe) by calculating population-specific F_ST values based on the F-model [51]. We used the Bayesian method of Foll and Gaggiotti [52] implemented in the software GESTE v2. To assess geographical patterns in genetic divergence, we compared three models: one null model that simply estimates population-specific F_ST values, and two models that used either latitude or longitude as explanatory variables. In addition to the study of genetic divergence patterns, we investigated the signature of recent bottleneck events using the Wilcoxon test for excess expected heterozygosity implemented in INEST 2.1 and based on the method of Cornuet and Luikart [53]. Analyses were run with the Two-Phase Mutation (TPM) model with default settings.

Development of new nuclear SSR markers
Sequencing results, filtering and success rates of microsatellite loci development are summarized in Table 2. Most 454 reads were eliminated because of insufficient length, while most Illumina reads were eliminated because paired-end reads could not be merged. As expected, the proportion of quality reads containing an SSR motif was much higher among the 454 reads obtained from an enriched gDNA library (24%) than among the Illumina reads obtained from raw gDNA (0.3%) or among the transcriptome 454 reads (0.1%). The low rate of SSR motifs obtained from Illumina sequencing of raw gDNA was more than compensated for by the large number of reads generated, and this method allowed the identification of ten times more potentially amplifiable loci than 454 sequencing of enriched gDNA (Table 2). The distribution of motif length was very similar between the two methods used to develop gSSRs: on average, di-, tri-, tetra-, penta- and hexa-nucleotides accounted for 40.2%, 48.7%, 7.5%, 2.5% and 1% of gSSRs, respectively. By contrast, most EST-SSRs were tri-nucleotides (81.6%), and di-, tetra-, penta- and hexa-nucleotides accounted for 11.1%, 7%, 0.2% and 0% of EST-SSRs, respectively. Success rates of PCR amplification were quite similar among SSR sets, yielding 67 gSSRs (GenBank accession numbers KX867678-KX867743) and 41 EST-SSRs (GenBank accession numbers KX867744-KX867785) with consistent amplification in A. artemisiifolia (Table 2). Among these, 46 gSSRs and 32 EST-SSRs gave clear, easy-to-score patterns after capillary electrophoresis (S1-S3 Tables). Homology with known proteins was detected for 25 EST-SSRs (S3 Table).

Genetic polymorphism at gSSRs and EST-SSRs loci in A. artemisiifolia
Genetic polymorphism was assessed using 16 A. artemisiifolia populations for a total of 321 individuals (Table 1). After checking for null alleles and stutters at each locus in each population, 14 gSSRs and 13 EST-SSRs were retained as best markers for population genetic analysis. Among the 27 loci tested, only one (SSR86, S1 Table) was unambiguously detected as being under selection (Bayesian probability = 1, S2 Fig). SSR86 showed less genetic differentiation (F_ST = 0.014) than other markers, but a very high within-population genetic diversity (H_S = 0.776), suggesting balancing selection at or near the locus. This locus was therefore discarded for further analyses.
All 26 retained loci were polymorphic and revealed high levels of genetic diversity (Table 3). The frequencies of null alleles estimated over all populations ranged between 0.06 and 0.19 and were on average similar between gSSRs and EST-SSRs (0.11; Table 3). Allelic richness per locus and population, calculated based on a minimum sample size of eight individuals (Fig 2), was slightly lower for EST-SSRs than for gSSRs (4.438 versus 4.748) but the difference was not significant (Wilcoxon test p-value = 0.778) (Fig 2). The mean expected heterozygosity within populations was high (0.635 for gSSRs and 0.625 for EST-SSRs) and not significantly different between the two types of markers (Wilcoxon test p-value = 0.778). The mean genetic differentiation F_ST was 0.072 for gSSRs and 0.058 for EST-SSRs (difference significant, Wilcoxon test p-value = 0.045).

Insight into the mating system in A. artemisiifolia
Thirty-six maternal progenies sampled from six French populations were analysed with five gSSRs to estimate mating system parameters. In addition, direct evidence for the presence of null alleles was sought by considering the progenies from maternal plants apparently homozygous at one locus. If a null allele was present, the maternal plant would actually be heterozygous (i.e., carrying one null allele and one detectable allele). Its progeny would thus contain some plants apparently homozygous for alleles different from the maternal one, but actually carrying one maternal null allele and one paternal detectable allele. Evidence for the presence of null alleles was obtained for all five markers. Depending on the marker considered, from 25% (3 out of 12) to 35% (8 out of 23) of the progenies from plants scored as homozygotes contained non-matching genotypes. Mating system parameters were estimated after excluding these progenies. The maternal inbreeding coefficient was non-significant (F = 0). Multilocus outcrossing rates (tm) were high and not significantly lower than 1 in all populations (Table 4). The rates of mating between related individuals (tm-ts) were not significant. Paternity correlations (rp) were weak and only significant for two populations. These results suggested complete outcrossing for A. artemisiifolia and large numbers of pollen donor parents.
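The reasoning used to detect null alleles from progeny arrays can be written as a small Python check. This is a sketch of the logic described above, not the procedure actually run in the study; the function name and the representation of a genotype as a pair of allele sizes are assumptions.

```python
from typing import List, Tuple

Genotype = Tuple[int, int]  # two scored allele sizes at one locus


def null_allele_evidence(mother: Genotype, progeny: List[Genotype]) -> List[Genotype]:
    """Return progeny genotypes that point to a maternal null allele.

    Only apparently homozygous mothers are informative. A seedling that appears
    homozygous for an allele other than the maternal one cannot have received the
    visible maternal allele, which is what is expected if it inherited a null
    allele from the mother and a detectable allele from the father.
    """
    if mother[0] != mother[1]:
        return []
    maternal_allele = mother[0]
    return [g for g in progeny if g[0] == g[1] and g[0] != maternal_allele]
```

A progeny array returning a non-empty list would be counted among the families containing non-matching genotypes and excluded before estimating mating system parameters.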

Patterns of genetic diversity and inbreeding in A. artemisiifolia populations
We compared patterns of genetic diversity between the native range (North America) and the invasive range (Europe). Allelic richness within population and mean expected heterozygosity within population were slightly higher in North America than in Europe (Table 5). However, for both parameters, the difference between the two ranges was not significant (Wilcoxon test p-values: 0.100 and 0.173 for allelic richness and expected heterozygosity, respectively). The inbreeding coefficient estimated taking null alleles into account was significantly higher than zero in only seven populations: four of the five North American populations and three of the eleven European populations (S4 Table). Consequently, F_IS values were on average higher in the native range than in the invasive range (Table 5), although the difference was not significant (Wilcoxon test p-value: 0.2951). F_ST in the native range was low (0.042) but significant. F_ST in the invasive range was higher (0.071) and also significant. The difference in F_ST values between the two areas was significant based on 99% bootstrap confidence intervals.
[...] (Fig 3). Most of the additional clusters were very specific to one or two populations (cluster 1: HOR-G, cluster 2: 89-P10, cluster 3: TAT-H, cluster 6: BES-I and DOM-G, Figs 3 and 4). The two main genetic clusters (clusters 4 and 5, Fig 4) were frequent in the western and south-eastern parts of the invasive range, respectively, but only the first one (cluster 4) was observed at high frequencies in the native range. Structure analyses were also performed separately using gSSR data only or EST-SSR data only. The most likely numbers of genetic clusters among the 16 populations were four (S4 Fig) and three (S5 Fig) for EST-SSRs and gSSRs, respectively. Overall, the same patterns were observed for both datasets, i.e., variation in cluster membership probabilities among populations and a west-east gradient of genetic variation across Europe.

Patterns of local genetic divergence and bottlenecks in the invasive range
Population-specific F_ST values were best explained by the model that included latitude as a linear explanatory factor (posterior probability: 0.66) in comparison to the null model with no explanatory factor (posterior probability: 0.28) or the model including longitude (posterior probability: 0.04). Population-specific F_ST values increased with latitude (Fig 5A). While there was no linear relationship with longitude, a non-linear pattern was revealed, with populations from Central Europe (Italy: BES-I and Germany: HOR-G, DOM-G) and two populations located in the western (89-P10) and eastern (TAT-H) parts of the range showing elevated F_ST values (Fig 5B). Notably, these populations were those harbouring specific genetic clusters under the most detailed Structure models (Figs 3 and 4). Increased genetic divergence may be due to recent founder events for these populations. However, no significant signatures of recent bottlenecks were detected based on the Wilcoxon test for expected heterozygosity excess in any of the studied populations.

Transferability of SSRs and relationships among species
Cross-species transferability was tested for 31 gSSRs and 32 EST-SSRs. Among these markers, 32.2%, 54.8% and 67.7% of gSSRs and 46.9%, 75% and 81.2% of EST-SSRs gave consistent amplification and clear electrophoretic migration patterns in A. trifida, A. psilostachya and A. tenuifolia, respectively (S1-S3 Tables). Among the 26 markers used to analyse the genetic variation in A. artemisiifolia (Table 3), three gSSRs (SSR17, SSR26 and SSR73) and five EST-SSRs (EST-SSR13, EST-SSR61, EST-SSR69, EST-SSR111 and EST-SSR123) were scorable in all three other species. Relationships among species were visualised by a PCA based on data at these eight markers (Fig 6). Consistent with the transferability of SSR markers, A. trifida was the most genetically divergent species, while A. psilostachya and A. tenuifolia appeared to be very genetically close to A. artemisiifolia.
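The per-marker-type percentages above can be pooled to recover the overall transfer rates quoted elsewhere in the paper (about 40%, 65% and 75%). The Python sketch below does this arithmetic; the marker counts are inferred by rounding the reported percentages and are therefore an assumption, and the function name is hypothetical.

```python
N_GSSR, N_EST = 31, 32  # numbers of markers tested, from the text

# Reported percentages of transferable markers: (gSSRs, EST-SSRs)
REPORTED = {
    "A. trifida": (32.2, 46.9),
    "A. psilostachya": (54.8, 75.0),
    "A. tenuifolia": (67.7, 81.2),
}


def pooled_percent(pct_gssr: float, pct_est: float) -> float:
    """Overall percentage of transferable markers across both marker types."""
    # Counts are back-calculated from rounded percentages (an inference, not reported data).
    n_g = round(pct_gssr / 100 * N_GSSR)
    n_e = round(pct_est / 100 * N_EST)
    return 100 * (n_g + n_e) / (N_GSSR + N_EST)


pooled = {species: pooled_percent(*pcts) for species, pcts in REPORTED.items()}
```

Under these inferred counts (10 + 15, 17 + 24 and 21 + 26 markers out of 63), the pooled values round to 40%, 65% and 75%.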

Discussion
New, highly polymorphic nuclear SSRs in A. artemisiifolia
Next-generation sequencing technologies have considerably facilitated the development of SSRs for non-model organisms. Until recently, the method of choice was 454 sequencing of [...] generate paired reads of 2×250 bp or longer [8]. Here, we implemented a rigorous initial quality filtering of reads. Further, we merged Illumina paired-end reads and kept only sufficiently long sequences, which facilitated primer design. A similar amplification success of potentially amplifiable loci was obtained from Illumina and from 454 data. This, together with the Illumina technology yielding one to two orders of magnitude more reads than 454, highlights Illumina as currently the most efficient sequencing technology for developing SSRs, provided reads are carefully checked for quality and paired-end reads are merged. All markers developed were highly polymorphic in A. artemisiifolia. As compared to gSSRs, EST-SSRs are expected to be less polymorphic and more prone to the influence of selective processes: divergent selection may increase the estimated genetic differentiation among populations at these loci, while purifying or balancing selection may have the opposite effect [9][10][11]. Here, allelic richness and expected heterozygosity were similar between the two kinds of markers, while genetic differentiation was slightly lower for EST-SSRs. Most EST-SSRs were tri-nucleotide repeats, for which length polymorphism does not result in any frameshift in the coding sequence. No influence of selection could be detected for any of the EST-SSRs analysed.
The high level of polymorphism observed at both non-genic and genic locations in the genome of A. artemisiifolia likely reflects very large effective population sizes in a plant species known to have recently undergone a demographic expansion both in its native range [26] and in invasive ranges [15]. Similarly, a large variation for life traits is known in A. artemisiifolia [54].

Null alleles rather than partial selfing explain excess homozygotes in A. artemisiifolia
One undesirable counterpart of high nucleotide polymorphism is the presence of null alleles resulting from mutations at primer binding sites. Here, null alleles were observed for both gSSRs and EST-SSRs, with overall estimated frequencies of about 10%. This is consistent with a literature survey indicating that null allele frequencies are often below 20% but can in some cases range from 40% to 75% [55]. Analyzing progenies in French populations provided direct evidence for the presence of null alleles, but no evidence for selfing or biparental inbreeding. Our results were consistent with a previous study of invasive populations from China [27] that also indicated complete allogamy and no shift towards partial selfing during invasion. Significant F_IS values were observed in only seven populations out of sixteen, indicating that null alleles are the main cause of excess homozygosity. Significant F_IS values were observed in a small minority of populations from Europe but in the majority of populations from the native range. As any evolution of the mating system towards loss of selfing during invasion seems unlikely, it remains to be investigated whether this might be due to a different functioning of the populations in the two ranges, with populations from the native range showing some Wahlund effect [26].

Genetic diversity, population structure and population-specific genetic divergence in A. artemisiifolia
Genetic diversity within population was similar in North America (the native range) and in Europe, but genetic differentiation among populations (F_ST) was greater in the invasive range. A similar trend had also been observed previously [23]. This difference in F_ST values may arise simply because only a small area of the native range was sampled (five American populations) in our study and in [23]. This pattern is also consistent with a scenario involving multiple introduction events, as previously proposed [20,22,23]. Rare alleles initially present in American populations may have shifted to high frequencies in different European populations after invasion and local demographic expansion [22]. The maintenance of high levels of genetic diversity within invasive populations, a trend opposite to that found in many other biological invasion processes [56], can be attributed to high numbers of introduced seeds in multiple events [57], high gene flow and possibly genetic admixture among introduced populations [21,58].
The main pattern of population structure we observed in Europe was a west-east gradient. Differentiation between the western and eastern parts of the European invasive range had previously been observed [22] and attributed to two main, distinct invasion sources. Here, we also observed that populations from central Europe (Germany to Italy) were genetically distinct from both Western and Eastern European populations. Several genetic clusters predicted by Structure were not observed or were very infrequent in North American populations, suggesting that we may have sampled only a fraction of the native sources. Alternatively, the additional clusters revealed by Structure under the more refined model (K = 6) could simply reflect the elevated genetic divergence of some populations. Indeed, Structure analyses are known to be biased towards inferring extra genetic clusters when some populations have undergone strong recent genetic drift; in that case, and contrary to the assumptions of the admixture model, not all genetic clusters are ancestral sources for the present populations [59]. Population-specific genetic divergences as estimated based on the F-model largely varied among populations, a pattern not expected if all populations similarly derived from a number of ancestral sources [51]. This, together with the outputs of Structure for increasing genetic partitioning (Fig 3), suggests secondary founding events associated with genetic drift. This hypothesis was not supported by signatures for recent bottlenecks; however, it is well known that bottleneck tests have a very limited power [60]. Populations from Italy, Germany and one population from Hungary (TAT) likely had their genetic sources in the South-Eastern part of Europe, whereas one population from France (89-P10) likely originated from eastern France. 
Although this would need to be validated based on a more extensive sampling, the genetic patterns revealed here are overall consistent with two main distinct colonization events in Europe (in South-Eastern France: the Rhone valley, and South-Eastern Europe: the Pannonian plain), with secondary colonization events arising northwards and towards Central Europe.

Genetic variation among species of the genus Ambrosia
Most SSR markers developed for A. artemisiifolia (65% and 75%, respectively) were transferable to A. psilostachya and A. tenuifolia, whereas only 40% were transferable to A. trifida. The genus Ambrosia is composed of many, not clearly delineated species [15] for which there is no well-established phylogeny. An earlier morphological classification considered A. artemisiifolia, A. psilostachya and A. tenuifolia as related species belonging to the same group, while A. trifida was classified in a separate group [30]. This is consistent with differences in gametophytic chromosome numbers (n = 18 for A. artemisiifolia, A. psilostachya and A. tenuifolia but n = 12 for A. trifida, [13,14,29,30,32]) and with a chloroplast DNA phylogeny [61]. The success of SSR marker transfer among species fully confirms these previous data. In addition, some degree of hybridization between A. artemisiifolia and A. psilostachya was suggested [32]. This, or homoplasy at SSR markers, may explain the overlapping genetic variation between the two species.
A. psilostachya and A. tenuifolia are perennial species that reproduce both sexually and clonally [13,14,32,62]. Although these species are of less concern than the annual A. artemisiifolia and A. trifida, being less widespread and invasive, our SSR markers will be useful for assessing vegetative versus sexual reproduction, as well as for identifying colonization sources and relatedness among populations. However, given that several ploidy levels were reported in these species [13,14,29,32], we recommend that ploidy be carefully checked before markers developed from A. artemisiifolia are transferred to A. psilostachya and A. tenuifolia. A. trifida is a noxious annual weed, very widespread in its native area [63] and introduced in several European countries including, for instance, France, Italy, Slovenia and Serbia [64,65]. Despite the potential threat posed by this species to both human health and agriculture, population genetics studies are still lacking. Given its distant relation to A. artemisiifolia and the low success rate of marker transferability, we recommend that additional SSR markers be specifically developed for A. trifida.

Conclusions
Large sets of genomic SSRs and EST-SSRs were developed and validated in A. artemisiifolia, providing useful new resources for genetic studies of this highly noxious invasive weed. All markers were highly polymorphic. EST-SSRs revealed as many alleles as gSSRs and yielded similar genetic diversity estimates. The genetic patterns revealed for a set of American and European populations confirmed results from previous studies by showing a high within-population genetic diversity in both the native and invasive ranges. A geographical gradient of genetic variation in Europe was consistent with at least two major colonization events in Western and Eastern Europe, respectively. Secondary founding events were identified, especially in Central Europe. In addition, we settled a former controversy by demonstrating that inbreeding observed within populations is attributable mostly to the presence of null alleles rather than to selfing. Last, most SSRs were transferable to three other Ambrosia species. These SSRs can readily be used for studying key aspects of the biology and population dynamics of the two species most closely related to A. artemisiifolia (A. psilostachya and A. tenuifolia).