Rapid Focused Sequencing: A Multiplexed Assay for Simultaneous Detection and Strain Typing of Bacillus anthracis, Francisella tularensis, and Yersinia pestis

Background The intentional release of Bacillus anthracis in the United States in 2001 has heightened concern about the use of pathogenic microorganisms in bioterrorism attacks. Many of the deadliest bacteria, including the Class A Select Agents Bacillus anthracis, Francisella tularensis, and Yersinia pestis, are highly infectious via the pulmonary route when released in aerosolized form. Hence, rapid, sensitive, and reliable methods for detection of these biothreats and characterization of their potential impact on the exposed population are of critical importance to initiate and support rapid military, public health, and clinical responses. Methodology/Principal Findings We have developed microfluidic multiplexed PCR and sequencing assays based on the simultaneous interrogation of three pathogens per assay and ten loci per pathogen. Microfluidic separation of amplified fluorescently labeled fragments generated characteristic electrophoretic signatures for identification of each agent. The three sets of primers allowed significant strain typing and discrimination from non-pathogenic closely-related species and environmental background strains based on amplicon sizes alone. Furthermore, sequencing of the 10 amplicons per pathogen, termed “Rapid Focused Sequencing,” allowed an even greater degree of strain discrimination and, in some cases, can be used to determine virulence. Both amplification and sequencing assays were performed in microfluidic biochips developed for fast thermal cycling and requiring 7 µL per reaction. The 30-plex sequencing assay resulted in genotypic resolution of 84 representative strains belonging to each of the three biothreat species. Conclusions/Significance The microfluidic multiplexed assays allowed identification and strain differentiation of the biothreat agents Bacillus anthracis, Francisella tularensis, and Yersinia pestis and clear discrimination from closely-related species and several environmental background strains. The assays may be extended to detect a large number of pathogens, are applicable to the evaluation of both environmental and clinical samples, and have the potential to be applied in military, public health, and clinical diagnostic settings.


Introduction
The rapid detection of environmental and clinical biothreat agents will play an increasingly important role in protecting the population, and improved methods of doing so represent an urgent national need [1,2]. For both clinical and environmental biothreat detection, it would be preferable for a single nucleic acid based assay to interrogate an unprocessed sample for dozens of pathogens. Just as importantly, each pathogen should be interrogated at multiple loci to ensure sensitive and specific detection. Taken together, these two goals suggest the use of highly multiplexed assays designed to interrogate simultaneously hun-dreds of loci or more. Furthermore, the selection of the loci to be interrogated should allow strain typing to assist in forensic attribution (e.g., geographical origin, subspecies, biovars), to determine virulence and antibiotic resistance status, and to inform initiation of appropriate treatment modalities. In addition, the loci should be chosen to allow discrimination of a given pathogen from non-pathogenic closely-related species (including non-pathogenic near neighbors), environmental background strains (EBS), and common clinical contaminants secondary to sample handling.
Several PCR-based assays have been described for the identification of biothreats [3,4]. Many such assays rely on single or dual target detection of either chromosomally [5] or plasmid-encoded [6] loci, but these assays may generate false negatives due to presence of strains that have lost their plasmids or near neighbor strains that harbor highly homologous chromosomal loci. A 3-plex PCR assay coupled with microarray-attached probe detection suffers from the same drawbacks as the single and duplex assays, as it too targets single loci specific for each of three biothreat agents [7]. A real time-PCR assays specific for Yp targets 6 loci in two separate reactions [8] or 4 loci in one reaction [9]. Furthermore, a recently described 10-plex RT-PCR assay simultaneously targets 3 loci from each of Ba, Ft, and Yp [10]. Even these multiplexed assays do not allow extensive characterization of the biothreat due to the relatively limited number of probed loci [11,12]. Microfluidics offers the potential for the development of even more highly multiplexed assays. In particular, the combination of microfluidic amplification and electrophoretic separation and detection is extremely powerful. Labeling amplicons with 4, 6, 8 or more fluorescent dyes in a high-resolution separation system allows a given separation channel to provide hundreds of bases of sequence or to distinguish thousands of amplicons. We have previously developed rapid microfluidic PCR assays that perform simultaneous amplification of up to 27 human loci in under 20 minutes using a reaction volume of 7 ml with near single copy limit of detection [13,14].

Molecular Identification of Bacillus anthracis (Ba) Strains
Ba is considered a highly monomorphic species [15] as it contains few sequence polymorphisms across isolates [16,17] (greater than 99% sequence identity between strains). The organism harbors an approximately 5.6 Mb chromosome and typically two plasmids, pXO1 (approximately 182 kb) and pXO2 (approximately 96 kb), that are essential for virulence and toxicity. Ba is a phylogenetically young species (10-20,000 years old) and closely related to Bacillus cereus (Bc) and Bacillus thuringiensis (Bt). The three species are often referred to as B. cereus senso lato due to their close phenotypic and genotypic characteristics [18,19]. Sequence identity within that group is considerable and, accordingly, classic molecular strain typing methods such as rDNA sequence analysis and a conventional Multilocus Sequence Typing (MLST) approach are not suited for identifying and genotyping Ba from environmental samples that may also contain other members of the Bc senso lato group [20][21][22]. A well-established Ba genotyping method is based on VNTR-repeats [23,24], and a subset of 25 of these loci was recently used to genotype Ba isolates by sizing the resulting PCR products via capillary electrophoresis. Although an excellent approach to typing isolated Ba strains, VNTR typing is not well-suited to the analysis of environmental samples that may contain other members of the Bc senso lato group. Furthermore, this assay requires four separate PCR reactions [25].

Molecular Identification of Francisella tularensis (Ft) Strains
The genus Francisella is subdivided into at least three species including, F. philomiragia (Fp), and other Francisella spp. Three of the four recognized Ft subspecies, subspp. tularensis (also known as nearctica, biovar A), holarctica (palearctica, biovar B), and mediasiatica are human pathogens whereas subsp. novicida is non-pathogenic to healthy individuals [26]. The North American subsp. tularensis causes Type A Tularemia, which has a high mortality rate [27]. In contrast, subsp. holarctica and mediasiatica are found primarily outside of North America and cause the milder Type B Tularemia. Fp and the other Francisella spp. are not known to be human pathogens. The conventional MLST approach revealed only moderate sequence type diversity within subspecies (e.g., 3 sequence types in 8 North American tularensis isolates) [28], with even lower genotypic resolution obtained by rDNA typing [29,30]. Genotypic resolution improves with the interrogation of loci harboring gene rearrangements, deletions and insertions, and SNPs [6,31]. Both Multi Spacer Typing (MST) [32] and Multiple Locus VNTR Analysis (MLVA) [33,34] are suitable Ft genotyping methods. For example, using 25 MVLA loci, 120 genotypes were observed in 192 global Ft isolates [35].

Molecular Identification of Yersinia pestis (Yp) Strains
Like Ba, Yp is considered a genetically monomorphic and phylogenetically young species, having branched from Y. pseudotuberculosis (Ypst) 1,500-20,000 years ago [36,37] and with which it shares identical 16S rDNA sequences [38]. Three Yersinia species, Yp, Ypst and Y. enterocolitica (Ye) are serious human health threats (Ypst and Ye are water and food-borne pathogens), while certain other Yersinia species are opportunistic pathogens. Yp is subdivided into biovars antiqua (ant), mediaevalis (med) and orientalis (ori) that are associated with geographical origin in Africa, Central Asia, and East Asia, respectively. Biovar ori is the only naturally-occurring Yp in North America. Yp contains a 4.6-4.8 Mbp chromosome and several plasmids of varying size (0.06-0.12 Mb). The most prevalent plasmids are pMT1 (100-110 kb) and pCD1 (70-75 kb). Yp strains have been characterized by analysis of VNTRs [39] via MLVA [37,40] and MST [41]. While conventional MLST analysis is less informative due to high sequence conservation [36], genotyping focusing on heterogeneity in certain metabolism genes can provide valuable information [37,42].
Taken together, Ba, Ft, and Yp are representative of biothreats in general in that it is clear that simultaneous interrogation of a large number of loci is optimal for detection and characterization of biothreats. Here, we present two related multiplexed assays. First, we developed a highly multiplexed PCR assay based on the amplification and sizing of ten loci per pathogen, and, in a model system, detected Ba, Ft, and Yp in a single reaction. For all three pathogens, significant strain typing is accomplished by differential fragment sizing of several of the resulting amplicons and discrimination from closely-related species (including non-pathogenic near neighbors) and EBS based on selective primer binding. Furthermore, we show that the assay can provide an initial characterization of the detected agent including geographical origin, subspecies, and biovar. The amplification of 10 target loci from each of the three pathogens achieves specificity and genotypic resolution based on the resulting Fragment Length Type (FLT) obtained by microfluidic electrophoretic fragment size analysis. Second, we extended the amplification assay by performing microfluidic Sanger sequencing on the amplicons. The Rapid Focused Sequencing of approximately ten loci per pathogen allows an even greater degree of strain typing than PCR alone. Both the highly multiplexed fragment sizing and sequencing assays hold promise to support a rapid, field-forward, and selfverifying rapid response in cases of suspected release of Ba, Ft or Yp, and ultimately, a broad range of biothreat pathogens.

Results
An overview of the two closely-related multiplexed assays is presented in Figure 1. In both cases, purified nucleic acids are subjected to a multiplexed (in this case 30-plex) microfluidic amplification in a 7 ml reaction volume. In the multiplexed PCR sizing assay, the PCR reaction products are separated by microfluidic electrophoresis and detected based on labeled primers incorporated during the amplification reaction. In the Focused Sequencing assay, the multiplexed amplification utilizes the identical but unlabeled 30-plex primer set. Following amplifica-tion, the reaction is spit into 20 equal aliquots, and each aliquot is diluted to for microfluidic Sanger sequencing. Each sequencing reaction contains a forward or reverse sequencing primer for one locus from each of the three pathogens. The 20 reactions therefore comprise one forward and one reverse sequencing primer for all ten loci for the three agents, and each sequencing reaction is separated and detected on an individual electrophoresis channel.
Development of the two assays for detection and characterization of the biothreat agents Ba, Ft and Yp was conducted in five steps: (1) selecting targets and designing primers using available whole genome sequence (WGS) data of the biothreat and their near neighbor species; (2) testing of primer specificity and sensitivity in singleplex PCR with reference strain template DNAs representing each agent; (3) combining 10 primer pairs specific for each agent into a 10-plex PCR panel, followed by iterative primer testing and redesign to ensure all loci were capable of simultaneous amplification; (4) testing specificity and sensitivity of each 10-plex panel, including sequence analysis of all amplicons; and (5) combining the three 10-plex panels to generate a 30-plex PCR panel for simultaneous detection and characterization of all three biothreat agents, followed by sequence analysis of all amplicons.

B. anthracis Multiplexed PCR Sizing and Rapid Focused Sequencing Assays
Using published WGS data from 19 Ba strains and from Ba near neighbor species in the B. cereus sensu lato group, we evaluated a number of loci for inclusion in the multiplex panel. Potential targets were screened to fulfill at least two of the following criteria: (1) highly conserved in Ba, i.e., present in all Ba strain WGS data (except for plasmid encoded loci); (2) species-specific, i.e., sequence variations in the target between Ba and its near neighbors can be utilized for designing Ba specific primers; and (3) discriminating, i.e., motifs within the target are useful for Ba genotyping and discrimination of Ba against near neighbors. Note that interrogating Ba plasmid loci is important to identify strains harboring toxin and virulence genes but is insufficient to unambiguously identify Ba within the Bc senso lato group. Certain Bacillus species have been shown to harbor sequences highly homologous or identical to pXO1 and pXO2-encoded genes [43,44]. For example, B. cereus strain G9241 harbors pBCXO1, a pXO1-like plasmid [45] and B. cereus biovar anthracis CI contains two plasmids with very high sequence identity as compared to pXO1 and pXO2 [46]. However, those isolates are less closely related to Ba based on chromosomal sequence identity. Accordingly, we also selected the Ba genotyping targets to balance chromosomal and plasmid- Figure 1. Flowchart illustrating process steps of the two related multiplexed assays: PCR Sizing and Rapid Focused Sequencing. In the multiplexed PCR sizing assay, the PCR reaction products are separated by microfluidic electrophoresis and detected based on labeled primers incorporated during the amplification reaction. In the Rapid Focused Sequencing assay, the multiplexed amplification utilizes the identical but unlabeled 30-plex primer set. The resulting PCR reaction products are then subjected to Sanger Sequencing followed by microfluidic electrophoresis and detection based on labeled primers incorporated during the sequencing reaction. doi:10.1371/journal.pone.0056093.g001 specific information to distinguish Ba from pXO-harboring near neighbors from non-Ba species and from attenuated or avirulent Ba strains that have lost one or both plasmids [47][48][49][50].
Eight chromosomally-encoded and two pXO1-encoded loci were selected for primer pair design for inclusion in the 10-plex PCR Ba panel. They include parts of genes encoding conserved sporulation protein and sporulation regulation factors (sspF and spoVT, respectively), a housekeeping enzyme (hemL), a metabolism enzyme (GBAA0872), conserved plasmid-encoded virulence factors and toxins (lef and gerXB), and loci harboring VNTR-like sequences (basB and pbp1A) and non-coding spacer regions (adjacent to yihY and codY). Development, assembly, and optimization of the multiplexed panel were performed and amplification and initial sequence data generated prior to the acquisition of the necessary permit to obtain and handle nonattenuated genomic DNAs from several pXO2-harboring agents. Hence, primers for the pXO2-encoded target were not included in the final assembly but will be incorporated into an expanded panel in the future.
To enable laser-induced detection of resulting amplicons from each locus, one primer from each pair was labeled at the 59 end with one of the fluorescent dyes FAM, JOE, TMR or ROX. The labels were selected to allow appropriate spectral resolution of the PCR products based on their predicted fragment sizes. Primer sequences and target information are shown in Table S1. Primer pairs were initially tested in rapid singleplex PCR reactions followed by fragment size analysis on Genebench, a microfluidic electrophoresis instrument, as described in Material and Methods. PCR specificity, as judged by presence of a PCR product of the expected length, and PCR sensitivity, as judged by fluorescent signal strength, were evaluated in the presence of 50 genome equivalents of the pXO2-deficient Sterne_2 strain DNA. Successful primer pairs were then combined sequentially to generate the 10-plex PCR Ba panel. The panel was tested in reactions containing template DNA from 20 Ba strains and 10 Ba near neighbors, including B. cereus (Bc) and B. thuringiensis (Bt) strains.
The multiplexed PCR sizing assay allows discrimination of Ba strains and near neighbors. Figure 2 shows the resulting electropherograms from Ba strains Sterne, Ames, and BA1035, from which all Ba 10 loci are amplified, and the closely-related species Bc E33L and Bt 97-27, from which only 6 out of 10 loci are amplified. Bc E33L and Bt 97-27 are missing the two pXO1 amplicons (lef, gerXB) because they do not harbor the plasmid, the BA0872 amplicon due to the absence of the gene, and the basBamplicon due to primer/target mismatches, respectively. Eight amplicons (sspF, spoVT, BA0872, hemL, yihY, codY, lef and gerXB) do not vary in length among the Ba examples, but the presence and length of these amplicons unambiguously indicates the presence of Ba DNA. The observed Fragment Length Types (FLTs, defined as the amplicon fragment lengths observed for a given locus) in the basB and pbp1A amplicons allow discrimination between strains Ames, Sterne, and BA1035; fragment length in pbp1A also distinguishes Bc E33L from Bt 97-27. In addition, the FLTs of pbp1A in Bc E33L and Bt 97-27 are not present in any of the Ba strains analyzed herein.
The experimentally determined and in silico predicted FLT from the 10 loci generated from 36 Ba strains and 12 Ba near neighbor species strains are listed in Table 1. As predicted in silico, PCR products from loci harboring VNTR-like repeats and indels (e.g., basB and pbp1A), display the highest degree of fragment size variation among Ba strains. Fragments specific for the other 8 loci vary to a lesser degree, generally by 1-2 bp between strains. A total of 20 high resolution amplified fragment length polymorphisms (hAFLP) types (defined as the combined FLTs of 10 loci) were detected in 36 Ba strains (Table 1; the 20 hAFLP types include 5 WGS strains that have no homology for one or two housekeeping genes (most likely due to physical sequence gaps). Several of these strains are expected to share identical hAFLP types as they are closely related or identical. For example, the closely related Ames, Ames 35, and A2012 strains share hAFLP type a, and all Sterne and Sterne strain derivatives share hAFLP type e.
Specificity of the Ba panel was tested on template DNA from 10 near neighbor strains. hAFLP types recorded from the closelyrelated species are clearly distinguishable from those observed in the Ba group. The distinct fragment sizing profile between Ba and near neighbors B. cereus E33L and Bt 97-27 are also shown in Figure 2. Although 4 of 10 amplicons (sspF, spoVT, hemL and yihY) share FLTs with Ba group strains, the overall pattern is distinct from any observed in Ba. BA0872-specific products are missing from the profile of all closely-related species due to the absence of BA0872-specific primer binding sites, and basB-specific fragments are also absent (with the exception of Bc G9241, Bc 3A, and Bc 4342, all of which show discriminatory basB amplicon sizes). As expected, most near neighbors do not yield detectable pXO1-lef and pXO1-gerXB specific products; an exception is G9241, known to harbor pXO-like plasmids [45]. Three Ba isolates, Pasteur (pXO1 2 ), A1055, and Ames Porton Down (pXO1 2 , pXO2 2 ), do not generate pXO1-lef and pXO1-gerXB -specific amplicons, as is the case for the majority of the near neighbor strains. However, presence of a BA0872-specific fragment identifies them as likely Ba group strains. Conversely, the BA0872 fragment is absent in the profile of near neighbor strain Bc 3A, the isolate that shares the greatest similarity to Ba based on individual FLTs. Based on hAFLP types, this isolate could represent a plasmid-deficient Ba strain with a mutation in the BA872 primer-binding site; sequence data unambiguously eliminates this possibility (see below).
Focused sequencing of the 10 Ba panel loci (Table 1) showed a total of 22 Sequence Types (STs, defined as the combined locus sequence genotypes of 10 loci) in the 36 Ba isolates. This increase (as compared to 20 hAFLP types) is due to both SNPs and sizeneutral indels (insertion/deletions which cancel impact on size) that are detected and by sequencing. For example, strains Australia 94, K1811 and SK-102 are indistinguishable from the Sterne strains and A0370 by fragment sizing. In contrast, sequence analysis revealed SNPs in the pXO1-lef and hemL amplicons. The contribution of SNPs to the number of observed genotypes is most prevalent in loci pXO1-lef (4 GTs versus 1 hAFLP type within Ba) and basB (8 GTs versus 6 hAFLP types). In certain cases, the additional genotypic variability observed by Rapid Focused Sequencing does not increase the number of STs but adds to the overall genotyping confidence.
The additional genotypic information gained by Rapid Focused Sequencing as compared to PCR sizing is valuable to clearly distinguish 'atypical' Ba strains, such as plasmid-deficient strains Pasteur, Ames Porton Down (in silico data), and A1055 (in silico data) from close near neighbors such as Bc 3A (no plasmid loci but almost identical FLTs of chromosomal loci) and Bc bv. anthracis CI (plasmid loci and chromosomal loci almost identical to Ba strains). Every Ba near neighbor strain is associated with STs that are distinct from those observed in Ba strains, and most closely-related species have unique sequences not observed in any of the Ba strains. B. coagulans and B. megaterium strains did not generate amplicons from any of the 10 loci, an expected outcome as the targets were selected to discriminate against such phylogenetically more distant species. Finally, there were no discrepancies observed between published WGS data of the analyzed near neighbor strains (i.e., Bt konkukian 97-27, Bt Al Hakam, Bt kurstaki HD1, Bt sv. israelensis ATCC 35646, Bc E33L, Bc G9241, Bc 4342) and the sequence data generated using the Ba primer set. Figure S1 shows the phylogenetic tree based solely on the Ba 10-plex panel experimentally obtained gene sequences and whole genome sequence available data that have been concatenated after clustal alignment and visualization with PHYLIP (TreeView, [51]). The high conservation within the Ba species can be seen; the closelyrelated Bc and Bt are clearly distinct from the Ba species cluster.

F. tularensis Multiplexed PCR Sizing and Rapid Focused Sequencing Assays
Using published WGS data from 23 Ft and Ft near neighbors, a total of 10 target loci were selected for primer design (Table S2). Targets were chosen to distinguish 1) the four Ft subspecies; and 2) strains within each subspecies, including two highly virulent North American subspecies (tularensis type A1 and A2 clade strains). Primers were designed to be discriminatory against homologous Fp targets. The Ft target panel includes amplicons from virulence (acpA, pdpD, iglC), antigen (fopA), putative metabolic (gyr, speA), and transcriptional regulator (migR, FTT008) genes. Two amplicons also harbor VNTR-like repeats, indels, and SNPs present within intergenic regions (i.e., between virulence gene tul4 and ORF FTT0900, and downstream of ORF FTT0082). Targets FTT0082 and pepO are annotated as pseudogenes in certain subspecies (including reference strain Schu S4). These were included in the panel because they are present in all Ft subspecies, are highly conserved, and contain variable regions valuable for genotypic resolution. Finally, the pdpD gene, encoded on the Francisella Pathogenicity Island, is present in type A (subsp. tularensis) and most novicida strains but deleted in type B (subsp. holarctica) strains. One primer from each locus was fluorescently labeled, and primer sequences are found in Table S2.
Following individual primer pair optimization and testing, the primer pairs were combined sequentially to generate the 10-plex PCR Ft panel. The panel was tested in PCR reactions containing DNA from 100 genome equivalents of Ft subsp. tularensis Schu S4 DNA and 28 additional Francisella strains. Electropherograms obtained from 4 subsp. tularensis and holarctica strains are shown in Figure 3 (left panel).
Six of the 10 fragments in the 10-plex Ft panel are invariable in length between strains subsp. tularensis SchuS4 and WY96, and subsp. holarctica 14LVS and 425. The lack of pdpD-specific product in the profiles of the two holarctica strains is expected as this locus is deleted in that subspecies but not in subsp. tularensis and novicida strains. In contrast, the VNTR-like repeat containing the gyrspecific amplicon of strain 425 differs significantly in length as compared to those from SchuS4, WY96 and 14LVS ( Figure 3). Five distinct gyr FLTs were observed in the analyzed strains. In contrast, the PCR products of speA and FT0082 differ by only 1 or 2 bp across strains. However, as these two loci have several indel regions that also result in neutral size variation (one indel cancels the other), sequence analysis is better suited to discriminate between isolates. VNTR-like repeat motifs in the Ft panel loci (e.g., gyr, speA), exhibit less fragment size variation than that observed in VNTR loci of the Ba and Yp 10-plex panels. Nevertheless, size variability still allows classification of all examined strains into species and subspecies. Experimentally determined and in silico predicted individual FLTs and combined hAFLP types from 31 Ft strains and 13 Ft near neighbor species strains are listed in Table 2. Sixteen hAFLP types were observed; three hAFLP types segregate with the highly virulent types A1 (hAFLP types a and b) and A2 (hAFLP type c), while seven hAFLP types (d, e, f, g, h, n, o) are associated with less virulent type B strains. Failure to obtain detectable pdpD-specific PCR products in type B strains is expected as all previously characterized holarctica strains carry a pdpD deletion [52]. Nonpathogenic near neighbor Ft subsp. novicida strains have distinct hAFLP types (hAFLP types k, j, m, l, p) as compared to the pathogenic tularensis subtypes, and the three analyzed Fp strains fail to amplify all but one locus, leading to unambiguous distinction from pathogenic Ft subspp. Although predicted by in silico analysis, non-pathogenic near neighbor strain novicida GA99-3548 did not yield a FTT0082 specific fragment in the 10-plex PCR, and the novicida strain GA99-3549 yielded no product from the pdpD gene. This discrepancy between published WGS and experimental data could be due to a mutation that affects one or both primer-binding sites in the here-analyzed isolate as compared to the WGS strain.
Sequence analysis (including in silico) of fragments obtained from the 10 loci (Table 2) showed a total of 21 STs in the 41 Ft isolates (note, this number excludes the 3 Fp strains due to the lack of detectable sequence in 9 of 10 loci). No discrepancies between the focused sequencing data and published WGS data (23 strains available) were found. The five additional STs (as compared to 16 hAFLP types) based on sequence analysis are associated with subsp. holarctica strains and are due to SNPs and single basepair indels that are only detected by sequencing. Several of the sequenced DNAs originate from closely related derivatives or share a close geographical origin. As expected, STs are indistinguishable from parental strains in case of seven type A1 (subsp. tularensis) strains, which all have ST C. All of these strains were isolated in Illinois and may be from the same clonal origin. Interestingly, Scherm, another type A1 strain isolated in Ohio in 1944, also shares ST C.
The Ft 10-plex panel allows classification of strains by Ft subspecies: Type A1 strains have either ST A or C, and type A2 strains all share ST B. ST C is observed in all type A1 Illinois isolates, in strain Scherm (Ohio) and in NE061598 (Nebraska). Interestingly, strain Scherm and Schu S4, which are reportedly closely related [53] are distinguishable by their STs (ST C and ST A, respectively). Only three STs are observed in subsp. tularensis, but subsp. holarctica strains have much greater diversity with 11 STs. This is expected as these strains are isolated from the entire Northern hemisphere, as opposed to type A, which is, with some exceptions, concentrated in North America. Critically, there is no overlap in STs found in pathogenic Ft and the opportunistic novicida subspecies. Furthermore, the phylogenetically more distant species Fp readily distinguishable from Ft (both by fragment sizing and sequencing) as only one locus (i.e., pdpD) was amplifiable under the stringent assay conditions from Fp DNA. Figure S2 shows the phylogenetic tree based solely on the Ft 10-plex panel experimentally obtained gene sequences and whole genome Table 1. Cont. sequence available data that have been concatenated after clustal alignment and visualization with PHYLIP (TreeView, [51]). This tree corroborates the clustering into Ft subspecies tularensis, holarctica, and novicida as explained above. Figure S3 shows a sample sequence trace from direct sequence analysis of Ft fopA using the 30-plex assay PCR amplicon.

Y. pestis Multiplexed PCR Sizing and Rapid Focused Sequencing Assays
Published WGS data from 30 Yp and near neighbor strains were used to select 10 target loci for primer design. Targets were chosen to distinguish 1) Yp from Ypst and Ye; and 2) all Yp biovars. pCD is frequently detected in Ypst and Ye [54] and its presence is therefore not diagnostic of Yp. In addition, some Ypst strains also harbor other Yp or Yp-like plasmids or plasmid loci [55], stressing the importance of including both chromosomal and plasmid loci for discrimination of Yp from near neighbors Ypst and Ye. Six chromosomal and four plasmid-encoded targets (2 each encoded on pMT and pPCP) represent a mix of loci harboring SNPs, indels, and VNTR-like repeats. Primer sequences and target information are found in Table S3. The fully assembled 10-plex Yp panel was tested on 100 genome copy equivalents from each of 34 Yp strains representing the 3 biovars (14 orientalis, 13 medievalis, and 7 antiqua) and 10 near neighbor strains including Ypst and Ye. Five representative fragment sizing profiles of 4 Yp strains and one near neighbor Ypst strain are shown in Figure 3 (right panel).
Size variability in Yp 10-plex fragment panel is observed primarily in 4 loci (glpD, ail, argS-YPO2047 and YPO1976). For example, glpD-specific amplicons are either 436 bp (orientalis biovar strain) or 529 bp (antiqua and mediavalis biovar strains) and are therefore useful in determining the presence or absence of   (Table 3). The additional ST (as compared to hAFLP types) distinguishes strain ori FV-1 (ST FF) from ori F1991016 (ST L). The nearly identical number of hALFP types and STs is due primarily to the convergence of genotypes detected in the VNTR-repeat containing loci argS-YPO2047 and YPO1976 (12 and 8, respectively) as no polymorphisms beyond the VNTR repeat number variation are present in the amplified fragments. In contrast, 7 locus GTs compared to only 4 FLTs are present in pPCP-encoded pst-PCP06 because sequencing detects synonymous insertions and deletions that do not affect fragment size. Previously unreported GTs (and consequently STs) due to SNPs were detected for the glgX locus in strain Nairobi (GT 3) and the pPCP pla locus of strain ori El Dorado (GT 3). WGS data from 7 of the 34 Yp strains are published, and no discrepancies between in silico predicted and fragment size genotypes derived by 10-plex PCR were discovered.
The  Table 3. These two strains were very likely switched at the source or following deposit to ATCC. Figure S4 shows the phylogenetic tree based solely on the Yp 10-plex panel experimentally obtained gene sequences and whole genome sequence available data that have been concatenated after clustal alignment and visualization with PHYLIP (TreeView, [51]).
Specificity of the 10-plex panel was evaluated by testing the assay against 10 5 copies of closely-related species Y. pseudotuberculosis, Y. kristensenii, Y. aldovae, and Y. enterocolitica. In addition, assays were performed with the four near-neighbors present in 100-fold molar excess of the target pathogens ( Figures S5A-5D). In the presence of 10 5 genome equivalents of Ypst YPIII DNA, all 6 chromosomal loci are amplified ( Figure S5A, left panel) as expected. However, the length of the Ypst argS amplicon is   Table 3. Cont. Ypst DNA is spiked in 100-fold excess to KIM10 DNA, the chromosomal targets of this strain are out-competing the targets of KIM10. However, all expected Yp KIM10 loci are still present, albeit at a low level ( Figure S5A, right panel). In the presence of 10 5 genome equivalents of Yk CDC1457-B1, no loci are amplified ( Figure S5B, left panel), and a 100-fold molar excess of Yk DNA relative to Yp KIM10 DNA does not alter the KIM10 profile ( Figure S5B, right panel). For Yal 670-B1, 10 5 genome equivalents generate only a low-intensity peak for nuoG ( Figure S5C, left panel). When Yal DNA is spiked in 100-fold molar excess to KIM10 DNA, all specific Yp loci are still clearly identifiable ( Figure  S5C, right panel). Finally, at 10 5 genome equivalents of Ye WA DNA, no Yp specific products are observed in the profile ( Figure  S5D, left panel), and the addition of 100-fold molar excess of Ye DNA also does not disturb the KIM10 profile ( Figure S5D, right panel). Taken together, these data suggest that specific target organism can be unambiguously identified even in the presence of closely-related species in molar excess.

Ba-Ft-Yp 30-plex
We have combined 60 primers (20 from each of the Ba, Ft, and Yp 10-plex PCR assays) and performed an initial evaluation of the resulting 30-plex panel in the presence of 100 genome copy equivalents from Ba and Yp and 1000 from Ft representatives of each species. The resulting electropherograms using Ba Sterne, Yp med KIM10v, and Ft subsp. tularensis Schu S4 templates are shown in Figure 4. The 30-plex fragment sizing profiles are identical with the individual 10-plex assays (e.g. the profiles of Sterne in Figure 2, WY96 and KIM10v in Figure 3). The electropherograms for each species are readily distinguished, with each biothreat species displaying a characteristic signature. In addition, certain fragments such as a full length (526 bp) Yp glpD amplicon (KIM10v) or a 361 bp Ft pdpD amplicon (SchuS4), correctly indicate the presence of Yp bv. mediaevalis or antiqua strain and a subsp. tularensis strain, respectively. For Ft, 1000 copies of template DNA are required to detect two of the FAMlabeled targets (acpA and speA). This is due to the fact that Ft DNA has the lowest GC content (32%) of all three species (Ba 35% and Yp 48%). Primers for these two loci have high AT content which resulted in reduced PCR efficiency due to slower annealing kinetics. Primer redesign for these two Ft loci is in progress.
The specificity of 30-plex panel was evaluated against genomic DNAs from 14 environmental background species (EBS, 10 3 genome equivalents per strain): Clostridium perfringens ATCC 13124, Staphylococcus aureus ATCC 10832, Pseudomonas aeruginosa Boston 41501, Burkholderia cepacia 249, Fusobacterium nucleatum 1612A, Bacteroides fragilis NCTC 9343, Streptococcus pneumoniae TIGR4, Shewanella oneidensis MR-1, Synechocystis sp. PCC 6308, Lactobacillus plantarum 17-5, Listeria monocytogenes EGDe, Bacillus megaterium ATCC 14581, Chryseobacterium indologenes NCTC 10796, and Bacillus psychrosaccharolyticus T25B. No detectable PCR products have been observed in the fragment sizing profiles from any of these species. Figure S6A is a representative profile detecting only background noise from all EBS. To further evaluate specificity and sensitivity of the 30-plex assay, target DNAs were spiked with 10 6 copies of EBS. In the presence of this high copy number EBS, the 10-plex signatures of the biothreat agents remained unaffected. Figure S6 (B and C) shows the simultaneous amplification of the 10 Ba loci (100 copies of Sterne) and 10 Yp loci (100 copies of Kim10) from the 30-plex panel in the presence of EBS (10 6 copies of B. cepacia 249). Taken together, these data demonstrate specificity of the designed primers and sensitivity of the 30-plex panel in a large molar excess of background DNA. This data Table 3. Cont. strain is deficient of this locus; n: no PCR product and no sequence detected; t: strain FV-1 has 3 copies of pPCP-pla locus in whole genome data belonging to GT 1 and 4; *hAFLP Type and ST assigned despite missing data (marked with ''#'') in the WGS data. doi:10.1371/journal.pone.0056093.t003 confirms and extends the power of the multiplexed assay to identify and strain type biothreats while clearly discriminating against closely-related species and environmental background strains. Finally, purified DNA was obtained from two types of realworld air filters (from the Pentagon Protection Agency and BioWatch, both acquired with the assistance of DHS S&T). These DNA samples were characterized for the presence of inhibitors based on 16S rDNA gene amplification with universal bacterial primers. Percent PCR inhibition was assessed from the intensities of the 16S PCR band on ethidium bromide-stained agarose gel resulting from amplification of air filter DNAs spiked with and without a known concentration of B. subtilis DNA as templates. Purified DNAs from one Pentagon and from two BioWatch (I, II) air filters were found to have PCR inhibition of 0%, 6% and 21%, respectively. The 30-plex panel was then tested using DNA purified from the three air filters and with air filter DNAs spiked with 100 copies of Ba Sterne. Resulting electropherograms are shown in Figure 5. Background noise was detected for all air filter samples without spiked Ba and the 10-plex Ba-signature was observed for all Ba-spiked air filter DNA samples. A very low nonspecific 444 bp peak was also observed in one BioWatch sample which was readily distinguished from the 10 specific product peaks. This initial result suggests that the purification, amplification, and electrophoresis protocols described here can be used to test DNA purified from environmental samples.

Discussion
We have developed two related assays, a multiplexed PCR sizing assay and its derivative, a Rapid Focused Sequencing assay. Both assays enable the detection, identification, and characterization of biothreat agents, demonstrated here by the simultaneous interrogation of 30 loci, 10 each from Ba, Ft, and Yp. The primary difference between the two assays is that Rapid Focused Sequencing reveals additional variations (SNPs, size neutral indels) that do not lead to fragment size alterations. Sequence information of continuous 500 base fragments from 10 chromosomally and plasmid-encoded loci is in effect a self-verifying assay, providing a level of confidence that is not achievable by quantitative-PCR in conjunction with probe-or melt-curve-based or microarray-based detection. In addition, focused sequence analysis can detect novel mutations/polymorphisms that are indispensable for identification of novel strains. By sequencing approximately ten carefully selected loci per genome, the approach allows the precision of whole genome sequencing but is much simpler and amenable to rapid and autonomous detection. Furthermore, the accuracy of Sanger sequencing is far better than whole genome sequencing methods [56][57][58]. Taken together, we believe that the advantages of Rapid Focused Sequencing render the approach well-suited to biothreat detection in the field.
Rapid Focused Sequencing allows identification and strain differentiation of the biothreat agents and clear discrimination from closely-related species and environmental background strains. It will be critical to expand the number of biothreats detected to approximately two-dozen species. This expansion will require two parallel assays, an estimated 120 targets for a total of 12 DNA-based species and a similar number for RNA-based organisms. Within the microfluidic biochip, purified nucleic acid will therefore be divided into one multiplexed PCR and one multiplexed reverse transcription-PCR reaction prior to sequencing. This approach is likely feasible based on our previous work developing microfluidic RT-PCR and multiplexed assays based on 84 primer pairs (data not shown). However, it is noted that as the number of targets grows, artifacts generated by unrelated primer pairs must be designed out of the assay.
Each of the 3 10-plex PCR biothreat agent panels was validated by analyzing genomic DNA isolated from 20 Ba, 30 Ft and 34 Yp strains as well as DNA from 10 representatives each of their respective near neighbors strains or closely-related species. The choice of loci and the primer design warrants correct discrimination of non-pathogenic near neighbors from biothreat strains and allows identification of species and characterization into subtypes, biovars and potentially attenuated strains. In contrast to MLST methods, the Rapid Focused Sequencing approach is not limited to selectively neutral housekeeping genes but instead also utilizes virulence factor, VNTR-like, non-coding, and other targets. The development of an appropriate set of targets allows powerful identification and genotyping of pathogens of interest.
The total sequence information analyzed via the 10-plex PCR Ba panel represents approximately 0.07% of the ,5.8 Mbp Ba genome (including chromosome and plasmids), 0.19% of the ,2 Mbp Ft genome, and 0.09% of the ,4.6 Mbp Yp genome. Rapid Focused Sequencing of this small fraction of the Ba genome, for example, is sufficient to determine presence or absence of the pathogen itself, to distinguish and characterize the majority of Ba strains by virulence (presence/absence of plasmid loci), and to unambiguously distinguish Ba strains from near neighbors. Similarly, Rapid Focused Sequencing of the Ft genome is sufficient to determine presence or absence of Ft-specific DNA, to distinguish between subspecies (tularensis and holarctica), and between subsp. tularensis type A1 and A2 strains. The approach also unambiguously distinguishes the near neighbors, subsp. novicida, from from pathogenic Ft subspp. tularensis and holarctica strains. The number and nature of the biothreat targets can be modified as the understanding of the genetics and biology of the pathogens advances over time.
The ability to perform the highly multiplexed amplification and Sanger sequencing reactions in microfluidic biochips offers the potential to perform these assays in the field, on both environmental and clinical samples. The choice of assay is dependent on the given concept of operation. For example, in certain clinical settings, fewer near neighbors may be present, and the multiplexed PCR assay may provide sufficient diagnostic information. In most environmental settings, however, the potential for an enormous variety of near neighbors and the need for forensic analysis may lead to the preferential use of the sequencing assay. We have recently developed a ruggedized system that incorporates nucleic acid purification, multiplexed amplification, and electrophoretic separation in a single microfluidic biochip. We believe that this foundational system can be adapted to perform both the multiplexed PCR sizing and Rapid Focused Sequencing assays. A fully integrated microfluidic platform that enables rapid pathogen analysis in the field has the potential to dramatically improve biothreat detection capabilities. . 30-plex PCR profiles generated from select biothreat agents Ba, Ft, and Yp. Electropherogram obtained from 100 copies of Ba strain Sterne (A); from 1000 copies of Ft subsp. tularensis WY96 (B); and from 100 copies of Yp bv. mediavalis strain KIM10v (C). Note that 1000 copies of Ft DNA are required to detect two of the FAM-labeled targets (acpA and speA) due to their high AT content resulting in reduced PCR efficiency. doi:10.1371/journal.pone.0056093.g004 Figure 5. Efficiency and sensitivity of the 30-plex panel with real-world air filter samples. Electropherograms obtained from Pentagon filter (A); BioWatch filter I (B); and BioWatch filter II (C). Resulting profiles on the left were from amplification of DNAs purified from air filter; profiles on the right were from air filter DNAs spiked with 100 copies of Ba Sterne. Background noise was detected from direct amplification of the three air filter samples, and the 10-plex Ba-signature was generated for all Ba-spiked air filter DNA samples. The observed nonspecific peak at 444 bp (highlighted in red) was readily distinguished from the 10 specific product peaks (1-pXO1_lef, 2-codY, 3-spoVT, 4-hemL, 5-pXO1_gerXB, 6-sspF, 7-pbp1A, 8-yihY, 9-BA0872, and 10-basB). doi:10.1371/journal.pone.0056093.g005

Primer Design
Ba Ames, Ft SchuS4 and Yp CO92 were selected as reference strains and loci of interest as published in the finished genomes were selected to search for homologous sequences in Genbank databases nr and WGS using blast [59]. Homologous sequences were aligned via ClustalX 2.1 [60] and conserved regions were identified for primer design. Primers were designed and tested in silico using VisualOMP (DNA Software, Ann Arbor, MI) and evaluated for specificity in silico via blast against Genbank nr and WGS, and by Thermoblast (DNA Software, Ann Arbor, MI) against selected near neighbor genome sequences.

Primer Evaluation by Microfluidic Singleplex PCR, Assembly of the Microfluidic Multiplex Panel, and Microfluidic Electrophoresis
Individual primer pairs designed for each target loci were initially tested for performance by microfluidic singleplex PCR using 100 genome equivalents of reference strain template DNAs representing each agent and using only half of the total PCR reaction for separation and detection. The equivalent 50 copies for detection with high sensitivity as judged by PCR peak intensity allow better screening of primers for use in the multiplex panel assembly. The pXO2-deficient Sterne strain Sterne_2, med KIM derivative (ATCC BAA-1504), and subsp. tularensis WY96, were used as reference DNA strains for Ba, Yp, and Ft, respectively. The microfluidic biochip and rapid thermal cycler have been utilized for forensic [13], clinical diagnostics (manuscript for submission), and biothreat [61] applications. Seven microliters of reaction mixtures were amplified in 20 minutes using a 33-cycle protocol. Following amplification, PCR products were retrieved and 2.7 mL aliquots were microfluidically separated and detected by laserinduced fluorescence using the Genebench-FX instrument and accompanying biochip as previously described [13]. With single basepair resolution in Genebench, highly resolved PCR amplicon length was determined via migration of a standard size marker. Output data using Genemarker HID STR Human Identification Software, Version 1.51 (SoftGenetics LLC, State College, PA) shows fluorescent signal strength in RFU indicative of PCR sensitivity. Primers that gave fluorescence intensities of approximately 10 4 RFUs of the specific product band without generating artifactual peaks were utilized for the multiplex assembly. The 10plex panel was assembled by iterative addition of successful primers in a single 7 mL reaction. Peak signal balancing in the assembled multiplex assay was achieved by optimizing concentrations of the individual primers. Primers that caused signal inhibition of other loci were replaced with an alternative primer pair either by redesign or use of alternative target.

Microfluidic Sanger Sequencing
PCR amplicons were subjected to microfluidic Sanger sequencing as previously described [61]. Singleplex PCR was carried out with similar template DNA amounts and thermal cycling conditions as in the multiplex PCR but using unlabeled primers for both amplification and sequencing. PCR products were directly used as template for microfluidic sequencing reactions using the BigDyeH Terminator v3.1 cycle sequencing kit (Applied Biosystems, Foster City, CA) and the rapid thermal cycler and microfluidic PCR biochip as previously described [13]. For a given biothreat, a total of 20 independent Sanger sequencing reactions (10 forward and 10 reverse) were performed, with each reaction containing one sequencing primer (forward or reverse) to obtain bidirectional reads. Thermal cycling began with 60 seconds activation at 95uC, followed by 33 cycles (5 seconds at 95uC, 10 seconds at 50uC, and 40 seconds at 60uC) over approximately 32 minutes. Raw electropherograms were processed and basecalled using nnimbc4 (NNIM LLC, Salt Lake City, UT). Forward and reverse reads were aligned by importing the sequences in Codon Code Aligner (CodonCode Corporation, Dedham, MA).

DNA Purification from Air Filter Media and Assessing PCR Inhibition
Air filter samples were washed in 5 mL sterile water with constant agitation for 5 minutes. Pelleted cells were lysed by sonication for 5 minutes using the VialTweeter sonicator (Hielscher USA, Inc.) followed by DNA purification as previously described [61]. To characterize presence of PCR inhibitors from the purified air filter DNA samples, PCR reactions containing 1 ng B. subtilis DNA and B. subtilis DNA spiked with air filter DNA as templates were prepared. A 16S rDNA gene universal bacterial primer was utilized and the expected amplicons analyzed on ethidium bromide-stained agarose gels. Percent PCR inhibition was calculated from the decrease in amplicon signal relative to unspiked B. subtilis control DNA. Figure S1 Phylogram of experimentally obtained and retrieved Ba, Bc and Bt concatenated sequences after clustal alignment and visualization of the drawn PHYLIP tree with TreeView 1.6.6. Note that strains from which less than 4 loci were amplified were not included in the alignment/tree (i.e., Bt sv. kurstaki HD1, sv. israelensis 35646, B. megaterium and B. coagulans ATCC 7050). Also, strains included in the genotype table that have physical gaps of one or more Ba panel loci in the whole genome shotgun data (indicated with ''#'' in the table) have been omitted to avoid skewing of data (i.e., A2012, A0488, A0465, A0442, A0174, A0193, Tsiankovskii-I, A0389).  Figure S4 Phylogram of experimentally obtained and retrieved Yp and Ypst concatenated sequences after clustal alignment and visualization of the drawn PHYLIP (TreeView). Strains included in the genotype table that have physical gaps of one or more Yp panel loci in the whole genome shotgun data (indicated with ''#'' in the table) have been omitted to avoid skewing of data (i.e., IP275 and K1973002). (TIFF) Figure S5 Testing the Yp 10-plex assay against molar excesses of Yp near neighbors. Yp 10-plex panel against 10 5 copies of Ypst YPIII DNA (left panel) and 1:100 molar ratio of Yp Kim 10:Ypst (right panel). The length of the Ypst argS amplicon is clearly distinguishable from the Yp amplicons. Two argS specific products are observed for the spiked Yp DNA: one specific to Ypst YPIII at 395 bp and another specific for Yp KIM10 at 428 bp. When Ypst DNA is spiked in 100 fold excess to KIM10 DNA, the chromosomal targets of this strain are out-competing the targets of KIM10. All expected Yp KIM10 loci are still visible. The 3 red peaks (glgX, argS, and glpD) and the 2 yellow peaks (nuoG and YPO1976) are due to incomplete color correction/bleedthrough from signal saturation (A). Yp 10-plex panel against 10 5 copies of Yk CDC1457-B1 DNA (left panel) and 1:100 molar ratio of Yp Kim 10:Yk (right panel). No specific products are amplified from Yk, as the primers were designed to be discriminatory against Yk loci based on available genome sequence data. A 100-fold molar excess of Yk DNA into Yp KIM10 DNA does not alter the KIM10 profile. The 3 small peaks in red are due to color pull-ups (B).