A Multi-Species TaqMan PCR Assay for the Identification of Asian Gypsy Moths (Lymantria spp.) and Other Invasive Lymantriines of Biosecurity Concern to North America

Preventing the introduction and establishment of forest invasive alien species (FIAS) such as the Asian gypsy moth (AGM) is a high-priority goal for countries with extensive forest resources such as Canada. The name AGM designates a group of closely related Lymantria species (Lepidoptera: Erebidae: Lymantriinae) comprising two L. dispar subspecies (L. dispar asiatica, L. dispar japonica) and three closely related Lymantria species (L. umbrosa, L. albescens, L. postalba), all considered potential FIAS in North America. Ships entering Canadian ports are inspected for the presence of suspicious gypsy moth eggs, but those of AGM are impossible to distinguish from eggs of innocuous Lymantria species. To assist regulatory agencies in their identification of these insects, we designed a suite of TaqMan® assays that provide significant improvements over existing molecular assays targeting AGM. The assays presented here can identify all three L. dispar subspecies (including the European gypsy moth, L. dispar dispar), the three other Lymantria species comprising the AGM complex, plus five additional Lymantria species that pose a threat to forests in North America. The suite of assays is built as a “molecular key” (analogous to a taxonomic key) and involves several parallel singleplex and multiplex qPCR reactions. Each reaction uses a combination of primers and probes designed to separate taxa through discriminatory annealing. The success of these assays is based on the presence of single nucleotide polymorphisms (SNPs) in the 5’ region of mitochondrial cytochrome c oxidase I (COI) or in its longer, 3’ region, as well as on the presence of an indel in the “FS1” nuclear marker, generating North American and Asian alleles, used here to assess Asian introgression into L. dispar dispar. These assays have the advantage of providing rapid and accurate identification of ten Lymantria species and subspecies considered potential FIAS.


Introduction
For countries dominated by forested land such as Canada, expanding global trade has increased the risks of introduction and establishment of forest invasive alien species (FIAS). This situation has called for heightened vigilance on the part of plant protection authorities and a strengthening of measures taken to prevent the accidental introduction of unwanted alien pests. Some FIAS represent a greater threat than others, and the Asian gypsy moth (AGM) ranks high on that list. For regulatory purposes, what is referred to as AGM is a group of closely related Lymantria (Lepidoptera: Erebidae: Lymantriinae) moths comprising two L. dispar subspecies (L. dispar asiatica Vnukovskij and L. dispar japonica (Motschulsky)) and three other Lymantria species (L. umbrosa (Butler), L. albescens Hori and Umeno, and L. postalba Inoue) [1][2][3]. The European gypsy moth (EGM), L. dispar dispar (Linnaeus), is already established in North America following an accidental introduction in Massachusetts in 1869 [4]. Although the EGM has since become a very serious pest of hardwoods in parts of the United States and Canada, its females are flightless and individuals of this subspecies have a narrower host range than their Asian L. dispar counterparts, which have >500 known hosts and whose females are flight-capable, thereby facilitating their spread following an accidental introduction [5]. The other three species making up the AGM group (L. umbrosa, L. postalba, L. albescens) form a clade with L. dispar [6] and were once considered L. dispar subspecies [1]. Their host ranges differ from those of L. dispar asiatica and L. dispar japonica but they are considered a threat to some parts of North America; for this reason, they are regulated in Canada [2,7]. In addition, at least five other Lymantria species pose a threat to forests in Canada and the United States: L. monacha (Linnaeus), L. fumida Butler, L. mathura Moore, L. xylina Swinhoe and L. lucescens (Butler). The first two species are conifer defoliators while the last three are broadleaf defoliators [1]. For an assessment of the phylogenetic relationships among these species/subspecies, the reader is referred to earlier studies by other groups [6,8-14].
To prevent the introduction of these insects into Canada, the Canadian Food Inspection Agency (CFIA) uses pheromone traps to monitor gypsy moths and conducts regular inspections of foreign vessels entering Canadian ports, during which ships and cargo are searched for moths, larvae and egg masses. Given that eggs of the above species and subspecies are virtually impossible to distinguish from one another and from other non-threatening Lymantria species (reliable identification requires rearing to the adult stage), we initiated the development of a rapid molecular suite of assays that can produce AGM diagnostic results hours after testing suspicious eggs, so that informed decisions can be made and actions taken to prevent accidental AGM introductions. Several other molecular diagnostic assays targeting AGMs and allied species have been developed by other groups, but most of these require either sequencing of amplified products followed by sequence comparisons [6,8-10,15,16] or implementation of one of various approaches involving amplification of markers, DNA digestion with restriction enzymes and gel electrophoresis [11][12][13][17][18][19][20][21][22][23]. Although some of these approaches have proven to be operationally useful, they do not provide rapid species/subspecies identification and may be of limited subspecies resolution. Recently, two rapid quantitative PCR (qPCR) TaqMan® assays were reported for AGM diagnostics, but their scope is limited to assigning the unknown to either L. dispar dispar or L. dispar asiatica/L. dispar japonica [14,24].
Here we present a set of qPCR-based molecular assays that can identify all three L. dispar subspecies, the three additional Lymantria species comprising the AGM complex and all five other threatening Lymantria species (referred to here as "OTLS") listed above. In cases where the unknown is identified as L. dispar dispar, one of the assays will also flag individuals displaying L. dispar asiatica introgression resulting from inter-subspecies hybridization; females of such hybrids can show strong flight capability [22,25], and regulators may decide to treat them as AGM. The suite of assays is built as a "molecular key" (analogous to a taxonomic key) and involves several parallel singleplex and multiplex qPCR reactions. Each reaction applies a combination of primers and probes, some of which feature locked nucleic acids (LNA probes [26]), designed to separate taxa through discriminatory annealing. The success of these assays is based on the presence of single nucleotide polymorphisms (SNPs) in the 5' region of mitochondrial cytochrome c oxidase I (COI-5P; also known as the "barcode region") or in its longer, 3' region (COI-3P), as well as on the presence of an indel in the "FS1" nuclear marker, generating North American and Asian alleles (FS1 is an anonymous marker identified using a RAPD-PCR strategy [20]). These assays have the advantage of providing rapid (<1 day) and accurate identification of ten Lymantria species and subspecies, including several potential FIAS in Canada and the United States.

Sources of biological material and collection of data from databases
We built a panel of specimens comprising multiple individuals of the assays' target species, plus closely related ones, including species that could be encountered and whose eggs could not be easily distinguished from those of target species. Specimens were provided by collaborators whose taxonomic expertise ensured reliable identification of the moths. To capture intraspecific genetic diversity, specimens from different geographic regions were used when available. The majority of the specimens used were dry adults taken from existing collections or fresh adults obtained from laboratory rearings established ~10 years ago. No specific permissions were required for collecting the remaining specimens, many of which were caught in pheromone traps or collected in the field as eggs or larvae in Canada (L. dispar dispar). None of the insects used belong to an endangered species.
For assay development, we used as many DNA sequences per taxon as possible to cover the range of genetic diversity. This includes DNA sequences from all specimens in our collection as well as all publicly available DNA sequences for a given target gene. The complete list of DNA sequences used for assay development is provided in S1 Table.

DNA extraction from adults
For some of the work conducted here, DNA was extracted from moths that were snap-frozen alive and held in the freezer until processed (specimens identified as "fresh adults" in Table 1). In these cases, adult males (wings and abdomen removed) were ground in liquid nitrogen and submitted to DNA extraction, using either the Qiagen Blood & Cell Culture DNA midi kit or the Qiagen DNeasy Blood & Tissue mini kit (Toronto, ON, Canada), following the manufacturer's instructions. For DNA extraction from archival specimens (typically, two moth legs), we also used the Qiagen DNeasy Blood & Tissue mini kit, with the following modifications: incubation at step 3 was done overnight and elution at step 8/9 was done in a volume of 100 μL.

DNA extraction from egg masses for direct PCR
DNA was extracted from fresh, post-diapause Lymantria dispar dispar egg masses (Insect Production Services, Natural Resources Canada, Sault Ste. Marie, ON) following the protocol described in [27]. Two eggs were added to 100 μL of 2x buffer (20 mM Tris, pH 8.3, 3 mM MgCl2, 100 mM KCl) in a 1.5 mL conical tube. Just prior to use, Tween 20 (final concentration 1% v/v) and 200 μg/mL Proteinase K (Life Technologies, Carlsbad, CA, USA) were added to the buffer. The samples were ground with a disposable micro-pestle (VWR, Radnor, PA, USA) and incubated at 55°C for 120 min, with a brief vortexing after the first 5 min of incubation. After centrifugation at 13,000 x g for 5 min, the supernatant was transferred to a new tube and heated at 99°C for 5 min to inactivate the Proteinase K. One hundred μL of dH2O was then added to the sample to bring the buffer to a 1x concentration, and 1 μL was used in qPCR.

Marker amplification and sequencing
Universal primers used to amplify and sequence mitochondrial and nuclear genes from specimens of our Lymantria collection were designed based on publicly available sequences (GenBank, BOLD); primer sequences are provided in S2 Table. Mitochondrial genes initially selected for the development of our molecular assay included subunits of cytochrome c oxidase (COI and COII), NADH dehydrogenase (ND1, ND2 and ND6), ATP synthase (6 and 8) and cytochrome b oxidase, whereas elongation factor-1 alpha (Ef-1α) was selected as a nuclear marker. DNA extraction, PCR amplification, and sequencing of the targeted molecular markers followed standard protocols [28][29][30]. PCR and sequencing of the barcode region (COI-5P) generally used a single pair of primers that recovers a 658 bp region near the 5' end of COI, including the 648 bp barcode region for the animal kingdom [31]. For older museum specimens, primer pairs designed to amplify smaller overlapping fragments (307 bp, 407 bp) were employed [32]. PCR reactions were performed directly on total DNA extracts (0.5-1 ng) in a 25 μL final volume. PCR conditions were as follows: initial denaturation at 94°C for 2 min, followed by 35-40 cycles of denaturing (94°C, 1 min), annealing (43°C-55°C, 1 min, depending on primers and/or sample) and extension (72°C, 1-2 min, depending on amplicon size), and a final extension at 72°C for 10 min. PCR products were used for direct Sanger sequencing; resulting sequences were compared to identify discriminant SNPs.
Target-specific TaqMan-based real-time PCR assays
All molecular detection assays used in this study are based on TaqMan® technology [33]. Primer and probe design was performed using Oligo Explorer v1.2 and Oligo Analyzer v1.2 (Gene Link, NY, USA). Primers (Table 2) and probes (Table 3) were designed to (i) minimize development of secondary structures and dimer formation at the 3' end of primers (minimal interaction between primers and probes) and (ii) ensure that amplicon length does not exceed 200 bp. Interspecific SNPs were preferentially localized at the extreme 3' end of the primers and the middle of the probes for maximum discriminatory effect. All primers and TaqMan probes were manufactured by Integrated DNA Technologies Inc. (IDT; Coralville, IA, USA). All assays were designed to work under the same thermocycling conditions. For initial singleplex testing, all probes were labelled with fluorescein (6-FAM) at the 5' end and the quencher Iowa Black FQ (IBFQ) at the 3' end. For subsequent duplex and triplex assay validation, probes labelled at the 5' end with either Cy5 or TEX-615 and at the 3' end with the quencher Iowa Black RQ (IBRQ) were used. Additionally, for non-LNA probes (see section on LNA probes below), a ZEN™ (for 6-FAM probes) or TAO™ (for Cy5 probes) internal quencher was placed between the 9th and 10th base from the reporter dye on the 5' end of the probe sequence. Those internal quenchers shorten the distance between dye and quencher and, in combination with the terminal 3' quencher, provide a higher degree of quenching and lower initial background fluorescence. Duplex and triplex assays were analyzed for interactions between all primers and probes using Oligo Analyzer v1.2 and subsequently tested against the same panel of species used for the singleplex assays. Because the majority of the assays were designed in the COI-5P and COI-3P gene regions, care was taken to ensure that there was no overlap of amplicon regions in these assays.
Table 2. List of primers developed for DNA quantification and for each simplex, duplex and triplex assay, as identified in Fig 1. The first 3-4 letters of primer names designate the taxa being targeted (except for FS1, which designates the marker), while "F" and "R" refer to forward and reverse. In primer sequences, bold/underlined letters designate degenerate sites while italicized/underlined letters designate ARMS bases.
Use of ARMS primers. Amplification Refractory Mutation System (ARMS) primers are useful when only one discriminatory SNP is available for the development of assays where discrimination is dependent on primers (i.e., probe is non-discriminatory). An artificial ARMS SNP can be added adjacent to an existing 3' end SNP at either position 2 or position 3 of the primer to significantly increase the discriminatory ability of that primer [34]. ARMS primer combinations were tested against non-ARMS primer pairs to determine the combination that gave the best discrimination of non-target species while having a minimal effect on the specificity of the target species amplification.
Use of locked nucleic acid (LNA) probes. LNA nucleotides are used to increase the sensitivity and specificity of annealing in qPCR probes. A triplet of LNA nucleotides surrounding a single base mismatch site maximizes probe specificity [26]. LNA probes were used primarily in assays that were designed with single SNP discrimination (duplex 1A, duplex 2A, duplex 2B; Fig 1) and in assays where primers alone gave insufficient discrimination (duplex 5A; Fig 1). LNA probes were designed using IDT's DNA Thermodynamics and Hybridization tool (biophysics.idtdna.com).
SYBR Green-based real-time PCR quantification for standardization of moth DNA concentration
DNA concentrations of all moth samples were standardized by qPCR quantification using lymantriine general primers (Table 2). Quantification and standardization of the DNA prior to its use in the TaqMan discrimination assays allowed us to confirm that DNA was present in a high enough concentration in all samples to ensure discrimination of all closely related species in the assays. It also simplified the interpretation and analysis of the TaqMan assay results. Lymantriine general primers were designed in a conserved region of the 5' end of the COI gene using Oligo Explorer v1.2 and Oligo Analyzer v1.2. Degeneracy in the primers was kept to three degenerate bases per primer in an attempt to conserve the efficiency of the amplification reaction. The length of the amplicon was 128 bp. Real-time PCR was performed with an Applied Biosystems 7500 Fast Real-Time PCR System (Life Technologies, Carlsbad, CA, USA). All reactions were performed in a final volume of 10 μL and contained 1x QuantiTect SYBR Green PCR Master Mix (Qiagen, Valencia, CA, USA), 0.5 μM of each of the lymantriine general primers (Table 2), and 1 μL of template DNA. Real-time PCR thermocycling conditions were set at 95°C for 15 min, followed by 40 cycles at 95°C for 15 s, 50°C for 30 s, and 65°C for 60 s. Fluorescence was read at the end of the extension step. Gene copy quantification was then performed using a Java program based on linear regression of efficiency [35], and sample DNA concentration was adjusted to 1000-2000 gene copies.
Table 3. List of probes developed for each simplex, duplex and triplex assay, as identified in Fig 1. The first 3-4 letters of probe names designate the taxa being targeted (except for FS1, which designates the marker), while "RC" means reverse complement. In probe sequences, bases preceded by a "+" sign are LNA bases.
Fig 1. The assay is designed like a taxonomic key where molecular features substitute for morphological characters; it may therefore be thought of as a "molecular key". Each unknown sample is processed in a sequential manner through the key, starting at the top left, until it can be assigned to a taxon or to the category "non-target species" (NTS; bottom right).

Validation
Specificity validation of all the assays was performed using the panel of specimens listed in Table 1. Real-time PCR amplification was conducted using 1x QuantiTect Multiplex PCR NoROX Master Mix, with 0.5 μM of each primer, 0.1 μM of TaqMan probe, and ~2,000 gene copies of template DNA, whenever possible, in a final reaction volume of 10 μL. Three technical replicates were performed for all reactions. Thermocycling conditions were set at 95°C for 15 min, followed by 45 cycles at 95°C for 15 s and 60°C for 90 s. Fluorescence was read at each cycle, at the end of the extension step. The fluorescence threshold (Ft) was set at 10% of Fmax for the analysis of these results to avoid false Ct values for any samples that may have a baseline drift.
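The thresholding rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the use of baseline-subtracted readings, the choice to pass Fmax in as a parameter, and the linear interpolation between cycles are ours and are not taken from the instrument software used in this study.

```python
from typing import Optional, Sequence


def ct_at_threshold(fluorescence: Sequence[float], f_max: float,
                    fraction: float = 0.10) -> Optional[float]:
    """Return the fractional cycle at which a curve first reaches fraction * f_max.

    fluorescence[i] is the (assumed baseline-subtracted) reading taken at the
    end of cycle i + 1. Returns None when the curve never reaches the threshold,
    i.e. when there is no amplification call.
    """
    threshold = fraction * f_max
    for i, value in enumerate(fluorescence):
        if value >= threshold:
            if i == 0:
                return 1.0
            prev = fluorescence[i - 1]
            # Linear interpolation between cycle i (prev) and cycle i + 1 (value).
            return i + (threshold - prev) / (value - prev)
    return None
```

With a threshold tied to Fmax, a slowly drifting baseline that stays well below 10% of Fmax yields no Ct at all, which is the behaviour the 10% rule is meant to secure.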
Sensitivity of the TaqMan assays was evaluated in terms of both efficiency and limit of detection (LOD). For each target assay, experiments were conducted to (i) determine if Ct values were proportional to the amount of target template DNA (efficiency) and (ii) evaluate the LOD, which is the smallest amount of target DNA that can be detected for each of the assays. At least one isolate for each of the target species was selected, and TaqMan assay sensitivity was assessed on parallel sets of serial dilutions from the DNA stock.
To assess efficiency of the amplification reaction, TaqMan assays were run with serial dilutions of template DNA from the target species, with the DNA initially quantified using the COI-based lymantriine general primers described above. Standard curves were obtained by plotting the values of Ct against the log value of the target gene region copy number. Amplification reaction efficiency was calculated using the following formula:
$$E = 10^{-1/\mathrm{slope}} - 1$$
where E represents the amplification reaction efficiency and slope is the slope value of the line derived from the standard curve plot. Estimation of the LOD was done by performing 20 replicates of the TaqMan real-time PCR reactions for the lowest detectable DNA concentrations determined above. The lowest DNA concentration with a level of 95% successful amplification was identified as the LOD.
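As a minimal sketch of the two sensitivity calculations described above, the Python code below fits the standard curve by ordinary least squares and derives the efficiency from its slope, then picks the LOD from replicate counts. It assumes the commonly used relation E = 10^(-1/slope) - 1 (so that E = 1 corresponds to 100% efficiency); the function names and the example numbers are hypothetical and are not data from this study.

```python
from typing import Dict, Optional, Sequence


def amplification_efficiency(log10_copies: Sequence[float],
                             ct_values: Sequence[float]) -> float:
    """Efficiency from the least-squares slope of Ct versus log10(copy number)."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    # Assumed standard relation: E = 10^(-1/slope) - 1 (1.0 means 100%).
    return 10 ** (-1.0 / slope) - 1.0


def limit_of_detection(positives_by_copies: Dict[float, int],
                       replicates: int = 20,
                       required_rate: float = 0.95) -> Optional[float]:
    """Lowest copy number whose positive rate meets the required rate."""
    passing = [copies for copies, positives in positives_by_copies.items()
               if positives / replicates >= required_rate]
    return min(passing) if passing else None


# Hypothetical dilution series: a slope of about -3.32 means perfect doubling.
example_e = amplification_efficiency([1, 2, 3, 4],
                                     [36.68, 33.36, 30.04, 26.71])
# Hypothetical replicate counts: positives out of 20 at each copy number.
example_lod = limit_of_detection({100: 20, 25: 19, 10: 15})
```

In this made-up example, 19 of 20 positives at 25 copies meets the 95% criterion, whereas 15 of 20 at 10 copies does not, so 25 copies would be reported as the LOD.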

Rationale for the choice of marker genes
In the course of developing the assay presented here, we compiled existing Lymantria marker sequences gleaned from public databases (GenBank, BOLD) and sequenced several other markers (both mitochondrial and nuclear) from the various species/subspecies for which we had samples (Table 1). Following thorough comparisons carried out for each set of marker sequences, it became apparent that the full COI gene (i.e., 5' barcoding region [COI-5P] + remaining 3' region [COI-3P]) contained enough polymorphism to allow separation of all targeted species and subspecies through the design of specific qPCR primers and probes. In addition, the large number of COI-5P sequences deposited in public databases considerably increased our confidence in sequence consistency and, as a result, in assay reliability. The only other marker we included is the "FS1" nuclear marker [20]. In the absence of markers diagnostic of female flight capability, we reasoned that FS1 could be used to determine whether insects identified as L. dispar dispar, on the basis of mitochondrial markers, are in fact AGM-EGM hybrids, some of which could have flight-capable females [22].

Assay development and description
The architecture of the qPCR assays presented here resembles that of a standard taxonomic key, but one where genomic features substitute for morphological characters. The full assay is partitioned into three taxonomic subgroups (yellow shaded boxes; Fig 1): (i) AGM complex, (ii) EGM and (iii) other threatening lymantriine species (OTLS). Each subgroup comprises several independent assays that are run in simplex, duplex or triplex mode, for a total of two assay tubes per subgroup (red boxes; Fig 1) and six tubes for the whole assay when all reactions are run in parallel.
Each individual assay (light-blue shaded boxes; Fig 1) represents a dichotomic node featuring a question to which the qPCR run is expected to provide a "yes" or "no" answer, i.e. there should either be a clear amplification or no amplification at all. Such a system will not perform adequately if provided "maybe" answers (e.g., a positive amplification but very late in the amplification cycle). To avoid this type of uncertainty, one must first select appropriate primers and probes that maximize target selectivity, but it is also critical to standardize the concentration of DNA used in each run. To this end, degenerate primers that amplify a small portion of the COI-5P region in all species targeted by our assay (Table 1; see first tab of S1 File for details) were used as an indirect qPCR DNA quantification method. Low-Ct samples (i.e. highly concentrated) were then diluted to achieve a Ct of 22-23 (SYBR Green reading), using a calculation that assumes a doubling of DNA quantity at every amplification cycle. With the TaqMan probes employed in our assay, this dilution corresponds to a Ct of 25-28 for a mitochondrial marker and a Ct of 30-32 for a single-copy nuclear marker. Under these conditions, any run generating a Ct of >35 should be considered negative or may indicate DNA contamination. Fig 2 provides an example of how standardization of DNA concentration reduces variability in Ct values.
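The dilution calculation mentioned above can be illustrated with a short Python sketch. It assumes perfect doubling at every cycle, as stated in the text; the function names and the choice of 22.5 (the midpoint of the 22-23 target range) as the default target are illustrative assumptions.

```python
def dilution_factor(observed_ct: float, target_ct: float = 22.5) -> float:
    """Fold dilution that moves a sample from observed_ct to target_ct.

    Assumes DNA quantity doubles at every cycle, so each cycle of difference
    corresponds to a two-fold dilution. Samples already at or above the target
    Ct are left undiluted (factor 1.0).
    """
    if observed_ct >= target_ct:
        return 1.0
    return 2.0 ** (target_ct - observed_ct)


def is_positive(ct: float, cutoff: float = 35.0) -> bool:
    """Treat any Ct above the cutoff as a negative (or suspect) result."""
    return ct <= cutoff
```

For instance, a sample read at Ct 17.5 would need a 2^5 = 32-fold dilution to reach Ct 22.5 under this assumption.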
Below, we provide a description of each individual assay, including the rationale for SNP, primer and probe selection. Explanatory illustrations (sequence alignments and amplification curves) are provided herein for the first individual assay, but the reader is referred to S1 File for similar illustrations supporting the description of the other assays (one assay per tab). Primers and probes developed for each individual assay are presented in Tables 2 and 3.
Duplex Assay 1A: Is it L. albescens/L. postalba? This is the first of two assays designed to determine whether the unknown is a member of the AGM complex. It targets SNPs that are unique to the L. albescens/L. postalba species pair, which are here treated together because of the high degree of sequence identity displayed by their COI-5P regions. For this assay, discriminatory SNPs fall within the forward and reverse primer sequences; the Cy5 probe is non-discriminatory against L. dispar dispar and other members of the AGM complex (Fig 3). To enhance specificity, an ARMS base (red letter) was introduced into each primer. In validation tests, this assay produced amplifications for L. albescens and L. postalba, but none for any of the other species/subspecies examined (Fig 4 and S1 File).
Duplex Assay 1B: Is it L. dispar asiatica/L. dispar japonica/L. umbrosa? This assay aims at detecting the presence of any of the three remaining members of the AGM complex in a single qPCR step. If neither assay 1A nor assay 1B produces a positive amplification, then the unknown is not AGM and the molecular key bifurcates to the EGM assay. Discriminatory SNPs for assay 1B fall within both the forward primer (except for discrimination against L. dispar dispar) and FAM-LNA probe, which are near the 3' end of the COI-5P region; the reverse primer is located within the COI-3P region, where SNPs enable discrimination against L. albescens/L. postalba (S1 File). If a positive amplification is obtained in assay 1B, then the chart points towards additional assays that will determine which of the three targeted AGM complex members is present in the sample.
Duplex Assay 2A: Is it L. dispar asiatica/L. dispar japonica? This assay provides discrimination between L. umbrosa and the two Asian L. dispar subspecies: a negative amplification identifies the unknown as L. umbrosa (Fig 1). Both primers and probe fall within the COI-5P region, and discrimination against L. umbrosa is provided by a T/A substitution located in the FAM-LNA probe region (S1 File).
Duplex Assay 2B: Is it L. dispar asiatica? If the previous assay indicated that the unknown is either L. dispar asiatica or L. dispar japonica, assay 2B will generate amplification only if the sample is L. dispar asiatica; absence of amplification implies that the unknown is L. dispar japonica. Discrimination is here provided by a G/A substitution within the region targeted by the Cy5-LNA probe, which is located in the 3' portion of COI (S1 File).
Simplex Assay 3A: Is it L. dispar dispar? When the above-described duplex assay 1 generates negative results for the AGM complex, the molecular key then tests the hypothesis that the unknown is EGM. To this end, discrimination between L. dispar dispar and other lymantriines is provided by a COI-5P-based FAM probe that is specific to EGM. Degeneracy is introduced at one site in the forward primer to take into account some variation observed among independent L. dispar dispar COI-5P sequences. It must be pointed out here that the design of this assay does not take into account any AGM complex sequences, as this possibility is already eliminated at this intersection of the molecular key (S1 File).
Fig 3. Example of the approach used to select and design qPCR primers and probes. Sequence alignment of the COI-5P region targeted for primer and probe design in the context of developing a qPCR assay that amplifies only L. albescens and L. postalba DNA. The primer and probe sequences are shown above the alignments. An ARMS base (red letter) was introduced into each primer to increase specificity. Sequences shown here were either gleaned from public databases or were obtained through specific PCR amplification followed by Sanger sequencing, as described in the Materials and methods section (see S1 File for details).
Duplex Assays 4A and 4B: Does L. dispar dispar show evidence of Asian introgression? Previous studies have reported the existence of gypsy moth populations, primarily from central Asia, that feature mitochondrial DNA sequences diagnostic of L. dispar dispar while displaying biological characteristics that are typical of AGM, including flight capability in females [22]. From a regulatory standpoint, such insects may need to be treated as AGM. To assess the occurrence of Asian introgression into EGM, we made use of the FS1 nuclear marker, for which North American ("N") and Asian ("A") alleles have been described and where the latter features a 103 bp insertion relative to the former [20]. Thus, the present assay is not a typical SNP-based assay; rather, discrimination between the two alleles relies on the design of two separate probes: a Cy5 probe that spans the gap in the North American allele and a FAM probe that is specific to the Asian insertion (Fig 5). Specificity of the N allele probe is further enhanced by the presence of two substitutions within the region targeted by the probe (S1 File). Validation of this assay involved the testing of bona fide Asian gypsy moths (L. dispar asiatica and L. dispar japonica) and L. dispar dispar specimens, as well as that of several other moths suspected of being hybrids, either based on earlier reports or on their geographical origins (i.e., near the boundary of the L. dispar dispar and L. dispar asiatica ranges). As expected, the L. dispar asiatica and L. dispar japonica specimens examined (from eastern China, South Korea, Russian Far East and Japan) were homozygous for the A allele while most L. dispar dispar samples from North America were homozygous for the N allele. However, specimens from Siberia and Lithuania, previously reported to have flight-capable females while displaying L. dispar dispar COI-5P sequences [22], were homozygous for the A allele, as were several other specimens from central Asia, including Kazakhstan, Kyrgyzstan, Tajikistan, and Iran (Table 4; Fig 6). Specimens from the Czech Republic and Greece were heterozygous for the FS1 marker, as was a specimen from Connecticut (heterozygosity of some specimens from Connecticut was noted earlier [22]). On the European continent, specimens from France and from the Crimean Peninsula were homozygous for the N allele (Table 4; Fig 6). Regulatory implications of these findings are addressed in the Discussion section.
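The way the two FS1 probe signals translate into a genotype can be sketched as below. This is a simplified illustration in Python: the function name, the boolean inputs (one per probe, after any Ct cutoff has been applied) and the genotype labels are our own assumptions.

```python
def fs1_genotype(n_probe_positive: bool, a_probe_positive: bool) -> str:
    """Call the FS1 genotype from the two probe signals.

    n_probe_positive: amplification detected with the Cy5 probe (N allele).
    a_probe_positive: amplification detected with the FAM probe (A allele).
    """
    if n_probe_positive and a_probe_positive:
        return "N/A"      # heterozygous
    if n_probe_positive:
        return "N/N"      # homozygous for the North American allele
    if a_probe_positive:
        return "A/A"      # homozygous for the Asian allele
    return "no call"      # neither probe amplified


def shows_asian_introgression(genotype: str) -> bool:
    """Flag any genotype that carries at least one Asian allele."""
    return genotype in ("N/A", "A/A")
```

Under this scheme, a specimen identified as L. dispar dispar by its mitochondrial marker but genotyped as "A/A" or "N/A" would be flagged for possible Asian introgression.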
The duplex assay 5 and triplex assay 6 (Fig 1) described below may be regarded as forming a single multiplex assay where each individual assay need not be treated in a sequential fashion; separation of these two assays and the sequential presentation of individual assays are done for convenience only.
Duplex Assay 5A: Is it L. monacha? In cases where assay 3A (Is it L. dispar dispar?) produces no amplification, the dichotomic key redirects the identification process to the OTLS subgroup, where the first assay targets L. monacha (Fig 1). Here, discrimination is provided primarily by the FAM-LNA probe and the reverse primer, both of which fall within COI-5P regions that feature several SNPs relative to other lymantriines (S1 File).
Duplex Assay 5B: Is it L. fumida? For the L. fumida assay, both primers and the Cy5 probe contribute to discriminating this species from others. Targeted regions are within the COI-5P marker and contain many SNPs (S1 File).
Triplex Assays 6A, 6B and 6C: Is it L. mathura, L. xylina or L. lucescens? Negative amplification at the previous step brings the identification process to this last set of assays, which are run in triplex for convenience. All three individual assays are similar to the L. fumida assay in that, for each, discrimination is provided by both primers and the probe (Tables 2 and 3). In the L. mathura assay, degeneracy is introduced at two sites in the forward primer to account for some sequence variation among populations of different geographic origins. In the L. xylina assay, ARMS bases are introduced in both forward and reverse primers to enhance discrimination relative to the non-target species (S1 File). In cases where upstream assays have not enabled identification of the unknown, failure to obtain an amplification in this triplex assay leads to the conclusion that the unknown may not be assigned to any of the species or subspecies targeted by the present molecular identification key (Fig 1).
To assist users in the identification process, we provide an Excel sheet tool where results of each individual assay (yes or no amplification) may be entered for automatic species/subspecies assignment after the full suite of assays has been run (S2 File).
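As an illustration only, the yes/no logic of the last branch of the key (assays 5A to 6C) might be sketched in Python as follows. This is not the S2 File tool: the function name, the dictionary layout, the pairing of labels 6A, 6B and 6C with species in the order they are listed above, and the handling of multiple positives are all assumptions made for the example.

```python
# Hypothetical sketch of the OTLS branch of the key (assays 5A-6C).
# Assay labels follow the text; the 6A/6B/6C-to-species pairing is assumed
# from the order in which the species are listed.
OTLS_ASSAYS = {
    "5A": "L. monacha",
    "5B": "L. fumida",
    "6A": "L. mathura",
    "6B": "L. xylina",
    "6C": "L. lucescens",
}


def assign_otls(results):
    """Map amplification results (dict: assay label -> bool) to a taxon."""
    missing = [assay for assay in OTLS_ASSAYS if assay not in results]
    if missing:
        return "incomplete: missing " + ", ".join(missing)
    positives = [taxon for assay, taxon in OTLS_ASSAYS.items() if results[assay]]
    if not positives:
        # No amplification in any assay: not assignable with this key.
        return "unassigned"
    if len(positives) > 1:
        # Assumption: more than one positive is flagged, not resolved.
        return "ambiguous: " + ", ".join(positives)
    return positives[0]


example = {"5A": False, "5B": False, "6A": False, "6B": True, "6C": False}
print(assign_otls(example))
```

Because assays 5 and 6 need not be read sequentially, the sketch inspects all five results at once rather than walking through them in order.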

Assay specificity validation
To assess the specificity of our molecular key, each individual assay was tested using a set of species/subspecies considered pertinent to the assay being evaluated. A total of 105 specimens were used for validation purposes (Table 5), a list that includes specimens processed to generate the FS1 assay data presented in Table 4, plus three additional specimens that were used for FS1 genotyping only (samples CFIA-LEP0146, CFIA-LEP0147 and AGM-0022 in Table 4). Details of each validation test may be found in S1 File, a summary of which is presented in Table 5.
With the exception of the L. dispar specimens whose subspecies status was considered uncertain, all identifications provided by the molecular assays matched those made by taxonomists before DNA extraction. Most unspecified L. dispar specimens that we tested were from central Asia; while all of them were identified as L. dispar dispar using the mtDNA-based assay (simplex assay 3), they were all homozygous for the Asian FS1 allele (Fig 6), suggesting that these insects have an L. dispar asiatica genetic background. One specimen from Lebanon (CFIA-LEP0145; Table 5) generated a late amplification (Ct ~39) for the Asian FS1 allele, suggesting the presence of contaminants or substitution(s) in the region(s) targeted by the FS1 primers and/or probes.
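The way a Ct cut-off can turn two allele-specific amplifications into an FS1 genotype call may be sketched as below. The cut-off of Ct = 35 is the one given in the Table 5 notes; the function name, the use of None for "no amplification", and the rule that a Ct above the cut-off is simply counted as negative are assumptions made for illustration.

```python
def call_fs1_genotype(ct_n, ct_a, ct_threshold=35.0):
    """Call an FS1 genotype from the Ct values of the N and A allele probes.

    ct_n, ct_a: Ct value as a float, or None if no amplification occurred.
    Assumption: a Ct above ct_threshold is treated as a negative result.
    """
    n_positive = ct_n is not None and ct_n <= ct_threshold
    a_positive = ct_a is not None and ct_a <= ct_threshold
    if n_positive and a_positive:
        return "NA"  # heterozygous
    if n_positive:
        return "NN"  # homozygous for the North American allele
    if a_positive:
        return "AA"  # homozygous for the Asian allele
    return "no call"


# A late A-allele signal (Ct ~39) with no N-allele signal gives no call.
print(call_fs1_genotype(None, 39.0))
```

Under this rule, a late signal such as the one described above would not be read as an Asian allele; it would be left for follow-up.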

Assay sensitivity and direct PCR
All six assays developed here displayed a level of qPCR efficiency close to 100% and all showed a very high degree of sensitivity, with LOD values ≤ 25 COI copies (Table 6).
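For readers less familiar with how qPCR efficiency is usually derived, the standard relationship E = 10^(-1/slope) - 1, where the slope comes from regressing Ct on log10 of the template copy number, can be sketched as follows. The dilution series below is made up and idealized (perfect doubling per cycle); it is not data from Table 6, and the function name is hypothetical.

```python
import math


def qpcr_efficiency(copies, ct_values):
    """Return (slope, efficiency) from a standard curve.

    The slope is the least-squares slope of Ct against log10(copies);
    efficiency is 10**(-1/slope) - 1, so 1.0 corresponds to 100%.
    """
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, efficiency


# Made-up 10-fold dilution series with perfect doubling at every cycle.
copies = [25, 250, 2500, 25000, 250000]
cts = [40.0 - math.log2(c) for c in copies]
slope, eff = qpcr_efficiency(copies, cts)
print(round(slope, 3), round(eff * 100, 1))
```

A slope near -3.32 corresponds to an efficiency of 100%, which is the benchmark the assays are compared against.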
Given that simplification of the DNA extraction step could further reduce the time required to run the present set of assays, we examined the possibility of using a "direct PCR" approach, where the assay is run on an egg homogenate (see Materials and methods for details), as opposed to a purified DNA extract. As we were not able to obtain fresh eggs for most species and subspecies considered here, we ran our tests on L. dispar dispar eggs only. Homogenates of 2 and 4 eggs contained ~250,000 and ~500,000 COI gene copies, respectively. A 100- to 200-fold dilution of such homogenates is suitable for running the TaqMan assays.
Table 4. FS1 genotyping (Duplex assay 4) for 25 gypsy moth specimens identified as L. dispar dispar using Simplex assay 3 (Fig 1).

Discussion
The present suite of TaqMan assays was developed in response to a need expressed by the CFIA, the federal agency that has the responsibility of identifying potential FIAS intercepted at Canadian ports, including AGM and other related lymantriines considered a threat to North American forest resources. In developing the assays, we focused on four features for which improvements were needed relative to existing molecular assays: (i) accuracy/reliability of identification, (ii) resolution at the subspecies or population level, (iii) scope (i.e., number of species the assay can identify), and (iv) rapidity. Clearly, the method proposed here achieves significant improvements relative to earlier AGM assays, particularly when all four features are considered together. The CFIA's diagnostic entomology lab has so far relied primarily on the well-known "NB" assay [19,22] for AGM identification, along with a second, microsatellite-based assay [22,36]. The NB system is a method that involves the digestion of COI-5P PCR amplicons with two restriction enzymes, each targeting SNPs that allow identification of unknown samples as North American, European/Siberian or Asian. Although this assay has often proven very useful, cases of ambiguous identification have been encountered [16,22], and it is suboptimal in regard to the other three features for which we sought improvements. Similarly, the microsatellite assay has been shown to provide limited subspecies resolution [20]. In addition, sample processing using these two assays takes two full days while the method we propose here takes less than a day (rapidity can be further improved by the direct PCR approach we developed). Thus, our suite of TaqMan assays has clear potential to enhance AGM diagnostic capacity once implemented at the CFIA. Other approaches that rely on the amplification and sequencing of mitochondrial markers (e.g., [6]) or on the analysis of several microsatellite markers (e.g., [13]) have the potential of achieving levels of accuracy, resolution and scope similar to those of the present set of assays, but will require longer processing time. Conversely, the two AGM TaqMan assays developed earlier by other groups [14,24], although rapid, are limited in scope and subspecies resolution. With respect to cost, the suite of assays proposed here is relatively inexpensive to run (provided the necessary thermocycling equipment is available) and comparable in price to the sequencing of PCR products.
Fig 6. Geographical distribution of FS1 genotypes, as determined using Duplex assay 4, for gypsy moths identified as L. dispar dispar using Simplex assay 3. Blue circles: homozygous for the FS1-N allele; red circles: homozygous for the FS1-A allele; blue/red circles: heterozygous for the N and A alleles. Black letters near each circle identify specimens identified as L. dispar dispar using Simplex assay 3; red letters designate L. dispar asiatica and L. dispar japonica positive controls.
Table 5 (row fragment). EGM and other L. dispar specimens of uncertain subspecies designation (note 4): CFIA-LEP0035; Lymantria dispar dispar; 3A; 4A or 4B; --
Table 5 notes. 1 Assays identified in Results section. 2 "+": amplification at expected Ct; "-": no amplification; "•": not tested. 3 Positive controls for the Asian FS1 allele. 4 All specimens found in this section generated a positive amplification in the EGM assay (i.e., Simplex assay 3). 5 This amplification had a Ct of ~39, above the threshold of Ct = 35 set for this assay. doi:10.1371/journal.pone.0160878.t005
Several scientists have raised concerns about potential misidentifications when relying solely on mitochondrial markers to identify L. dispar strains [8,13,16,19,23,36]. Indeed, hybridization between L. dispar dispar and L. dispar asiatica can produce individuals displaying L. dispar dispar mitochondrial haplotypes with features of L. dispar asiatica nuclear genomes; in some cases, females of such insects have been observed to have strong flight capabilities [22]. Inclusion of an FS1-based assay in the procedure we developed was meant to flag such individuals, some of which may need to be considered AGM from a regulatory perspective. Most striking are individuals collected in central Asia that were identified as L. dispar dispar by Simplex assay 3 but were homozygous for the Asian FS1 allele (Table 4; Fig 6). Using nine microsatellite markers, Wu et al. [13] showed clearly that gypsy moths from central Asia (Kazakhstan and Kyrgyzstan) have a mixed European and Asian genetic background, likely indicative of extensive hybridization at the boundary between the ranges of European and Asian populations. Interestingly, the original description of L. dispar asiatica was based on samples collected in Kazakhstan [37], pointing out potential ambiguities in subspecies assignment.
Although gypsy moths from central Asia are much less likely to find their way into North America than their far eastern counterparts, due to differences in current trading pathways, any unknown sample displaying the EGM-COI/AA-FS1 genotype in Simplex assay 3 and Duplex assay 4 may need to be treated as AGM. It will not be possible to say with certainty whether such samples arose from populations with flight-capable females, but the outcome of these two tests should help narrow down the likely geographic origin of these moths. With respect to samples identified as heterozygous for the FS1 marker, an earlier study indicated that some European individuals with this genotype had flight-capable females [22]. These samples should therefore be treated with caution.
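The interpretation outlined in this paragraph could be summarized, very loosely, as a lookup that combines the EGM COI result with the FS1 genotype. The sketch below is an illustration under stated assumptions, not a regulatory rule: the function name, the genotype strings and the wording of the returned messages are hypothetical.

```python
def interpret_egm_sample(coi_egm_positive, fs1_genotype):
    """Combine the EGM COI result with the FS1 genotype ('NN', 'NA' or 'AA').

    Returned strings paraphrase the Discussion; they are not official categories.
    """
    if not coi_egm_positive:
        return "not EGM by COI: use the other branches of the key"
    if fs1_genotype == "AA":
        return "EGM COI with Asian FS1: may need to be treated as AGM"
    if fs1_genotype == "NA":
        return "EGM COI, heterozygous FS1: treat with caution"
    if fs1_genotype == "NN":
        return "EGM COI with North American FS1: no FS1-based flag"
    return "FS1 genotype not determined"


print(interpret_egm_sample(True, "AA"))
```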
We believe that the TaqMan assays we developed have broad applicability and will be useful to regulatory agencies in any jurisdiction where invasive lymantriines are of concern. Minimal training is required for processing samples and diagnosis is made effortless if one uses the identification tool we provided (S2 File). Since this suite of assays is modular, users can also decide whether or not all modules need to be run for a given sample, providing some operational flexibility. The assay validation results we presented above indicate that a qPCR-based approach can provide both rapid and reliable identification of several invasive lymantriines. This, of course, does not preclude subsequent confirmation of species/subspecies identification through sequencing of the PCR products generated by the assays. In future work, we will seek to bring further improvements to our assay system, including the addition of markers that could provide greater resolution with respect to the identification of the geographic origins of unknown samples. Characterization of the genomic determinants of female flight capability could also be instrumental in the development of an assay module aimed at assessing this trait in unknown samples.
Supporting Information
S1 File. Details of primer and probe design, along with validation results for each individual TaqMan assay (1 assay/tab; first tab: general lymantriine primers). (XLSX)
S2 File. Excel sheet tool for Lymantria species/subspecies identification. (XLSX)
S1 Table. List of sources of DNA sequences used for assay development. (XLSX)
S2 Table. Sequences of primers used for PCR amplification of potential markers. (DOCX)