Conceived and designed the experiments: RS SH DE KR VH. Performed the experiments: RS MF CW CB JG KS DM ME LB RM CI TP BL NF FL HL RR AS LA. Analyzed the data: RS SL AD CW CB KR DE JG KR CM VH LB FL HL TH. Contributed reagents/materials/analysis tools: RR CG MF AD DS GG MM KR SY KS DM KR MS SY ME TH DW MP JD DM. Wrote the paper: RS SL DE CM.
The authors from Ibis and SAIC are full-time employees at “for-profit” corporations. None of the other authors have a financial conflict of interest.
Effective influenza surveillance requires new methods capable of rapid and inexpensive genomic analysis of evolving viral species for pandemic preparedness, to understand the evolution of circulating viral species, and for vaccine strain selection. We have developed one such approach based on previously described broad-range reverse transcription PCR/electrospray ionization mass spectrometry (RT-PCR/ESI-MS) technology.
Analysis of base compositions of RT-PCR amplicons from influenza core gene segments (PB1, PB2, PA, M, NS, NP) is used to provide sub-species identification and infer influenza virus H and N subtypes. Using this approach, we detected and correctly identified 92 mammalian and avian influenza isolates, representing 30 different H and N types, including 29 avian H5N1 isolates. Further, direct analysis of 656 human clinical respiratory specimens collected over a seven-year period (1999–2006) showed correct identification of the viral species and subtypes with >97% sensitivity and specificity. Base-composition-derived clusters inferred from this analysis showed 100% concordance to previously established clades. Ongoing surveillance of samples from the recent influenza virus seasons (2005–2006) showed evidence for emergence and establishment of new genotypes of circulating H3N2 strains worldwide. Mixed viral quasispecies were found in approximately 1% of these recent samples, providing a view into viral evolution.
Thus, rapid RT-PCR/ESI-MS analysis can be used to simultaneously identify all species of influenza viruses with clade-level resolution, identify mixed viral populations and monitor global spread and emergence of novel viral genotypes. This high-throughput method promises to become an integral component of influenza surveillance.
Influenza viruses cause serious global economic and public health burdens. Annual influenza epidemics resulted in more than 30,000 deaths a year in the United States during 1990–1999
Currently, rapid methods for influenza virus diagnosis rely on antigen-specific antibody probes
We have developed a method based on broad-range RT-PCR followed by electrospray ionization mass spectrometry (RT-PCR/ESI-MS) for rapid and accurate detection of influenza virus, sub-species characterization, and early identification of genetic changes in circulating viruses. This method has previously been applied to detection of other pathogens in human clinical samples
To measure the breadth of coverage and resolution offered by the panel of primers described in
Base composition signatures are shown in A, G, C, T order. Identical base compositions within a column are the same color. Base compositions represented only once are shown in white. Base compositions from human H1N1, human H3N2 and avian/human H5N1 isolates (in green, blue and red boxes, respectively) are included in
Base composition signatures provide a multidimensional fingerprint of the genomes of the various viruses, which can be used to determine clusters of related species/sub-types. One such representation (
Each axis represents base composition bins (A, G, C, T) from a single primer pair. Solid symbols represent experimental measurements from this study, while open symbols are calculated base compositions determined from published sequences. Human isolates are shown as cubes and avian isolates as spheres. H1N1 isolates are shown in green, H3N2 in blue, and H5N1 in red. Arrows indicate avian influenza viruses isolated from humans.
To assess the utility of the RT-PCR/ESI-MS assay for surveillance of influenza virus in human populations, we analyzed 656 blinded clinical samples collected over a seven-year period (1999–2006). The results were compared with conventional analysis of the same samples by virus culture/serology and real-time RT-PCR methods. Two hundred forty-three samples were influenza positive both by RT-PCR/ESI-MS and conventional assays. Ten samples were positive only by RT-PCR/ESI-MS while eight samples were positive only by culture/real-time RT-PCR, corresponding to approximately 97% sensitivity and 98% specificity. Of the influenza-positive samples, RT-PCR/ESI-MS analysis identified 186 as influenza A virus and 67 as influenza B virus, in complete agreement with conventional typing methods.
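The sensitivity and specificity quoted above can be reproduced from the reported counts. The following is a minimal sketch, assuming that the conventional assays (culture/serology and real-time RT-PCR) are treated as the reference standard and that every sample not listed as positive was negative by both methods; the function and argument names are illustrative, not part of the original analysis.

```python
def diagnostic_accuracy(total, both_positive, test_only_positive, reference_only_positive):
    """Return (sensitivity, specificity) of a test against a reference method."""
    # Assumption: all remaining samples were negative by both methods.
    true_negative = total - both_positive - test_only_positive - reference_only_positive
    # Sensitivity: fraction of reference-positive samples that the test also called positive.
    sensitivity = both_positive / (both_positive + reference_only_positive)
    # Specificity: fraction of reference-negative samples that the test also called negative.
    specificity = true_negative / (true_negative + test_only_positive)
    return sensitivity, specificity


# Counts reported in the text: 656 samples, 243 concordant positives,
# 10 positive only by RT-PCR/ESI-MS, 8 positive only by conventional assays.
sens, spec = diagnostic_accuracy(656, 243, 10, 8)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Under these assumptions the counts give 243/251 for sensitivity and 395/405 for specificity, consistent with the approximate values of 97% and 98% stated above.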
Base composition analysis of multiple RNA segments enables further categorization of isolates into previously established clades determined by sequencing (details shown in
In addition to identification and species typing, RT-PCR/ESI-MS provided a quantitative estimate of the number of viral genome copies in the original patient sample. This was achieved by including a fixed amount (300 copies/well) of an internal RNA calibration standard in each PCR reaction
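The idea of quantitation against an internal calibrant can be illustrated with a short sketch. It assumes, purely for illustration, that the analyte and the calibrant amplify with equal efficiency and that the measured signal scales linearly with input copies; the exact calibration model used in the assay is not given in this section. Only the 300 copies/well value comes from the text; the function name and signal values are hypothetical.

```python
CALIBRANT_COPIES_PER_WELL = 300  # fixed amount of internal RNA standard, from the text


def estimate_copies(analyte_signal, calibrant_signal,
                    calibrant_copies=CALIBRANT_COPIES_PER_WELL):
    """Estimate input genome copies per well from a signal ratio.

    Assumption (not from the source): equal amplification efficiency for
    analyte and calibrant, and a linear relation between signal and copies.
    """
    if calibrant_signal <= 0:
        raise ValueError("calibrant signal must be positive")
    return calibrant_copies * analyte_signal / calibrant_signal


# Hypothetical signals: the analyte peak is twice as large as the calibrant peak.
print(estimate_copies(analyte_signal=2.0, calibrant_signal=1.0))
```

A ratio-based estimate of this kind is only as good as the assumption that both templates behave identically in the reaction, which is why the calibrant is typically designed to be amplified by the same primers as the target.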
To demonstrate the capabilities of RT-PCR/ESI-MS to track the evolution of circulating influenza viruses, we created a tree representation of the H3N2 influenza virus sequences from GenBank (
Unique base composition types are reported using a six-letter code (see text) and are chronologically sorted bottom to top (color boxes, seasons 1997 to 2006). From year 2000 onwards, seasons were labeled “North” and “South” to reflect the northern or southern hemispheric origin of the samples. Thick vertical bars represent the persistence of main types between consecutive seasons. Within each season, the number of isolates is reported in parentheses for types encountered more than once. Thin horizontal lines represent the spawning of new types through the accumulation of single mutations (left to right). Black font: types determined through sequence analysis; blue font: experimentally determined base composition types; red font: experimentally determined base composition types for season 2005–06. Ten rare sequence types (∼1.5%) were not uniquely discernible by the base composition analysis of the eight amplicons used in this analysis, as more than one subtype produced the same BC-type. These BC-types are indicated by asterisks.
The tree was further expanded by the addition of 174 influenza A H3N2 base compositions, resulting in 29 distinct base composition types (BC types), from samples collected in North America during the 2005–2006 season (
The areas of the circles are scaled to the number of human samples that contained the BC-types. The concentric rings represent types that are one, two, or three mutations removed from the founder isolate, color coded for the gene containing the mutation. The order of the letters in the BC-type corresponds to the six primer pairs used in this study, targeting PB1, NP, M1, PA, NS1 and NS2, respectively.
Infections with more than one influenza virus type can be identified with high sensitivity by RT-PCR/ESI-MS.
Panels A, B, and C are representations of mass spectral data. The heat maps in the top sections are a charge state representation of the data; the spectral plots in the lower sections were created by filtering the charge state responses to create signal representations vs. mass. The main peaks on the spectral plots are the primary amplicons and appear as hot spots in the charge state representations; the secondary amplicons appear as “cloudy” regions to the right and left for the forward and reverse strands, respectively. Panels A and B contain two species in relatively large ratios (20–50% mixtures) and involve the season 2005–2006 parent BC-type (AADFAA) and a type with a single mutation (panel A, within the M1 amplicon, BC-type AAHFAA; panel B, within the overlapping NS1 and NS2 amplicons, BC-type AADFBB). Panel C shows detection of a low abundance type (2–5%). Panel D shows a close-up view of the mass spectrum from Panel C. In this view, the shoulder of the peak is fit with a single mass model (blue dotted line) and a two mass model (dashed red line).
The dynamic range for mixed RT-PCR/ESI-MS detections has previously been determined to be approximately 100∶1
A total of 293 non-overlapping nucleotides, excluding the primer regions, were analyzed using the genetic loci targeted by the influenza A primers. This corresponds to 2.15% of the influenza A virus genome. Out of 174 human samples analyzed from the 2005–2006 season, only two showed evidence for mixed viral populations at one of the six loci, corresponding to 1.1% of the samples. Thus, assuming the same mutation rate for the broader viral genome as for the region analyzed by RT-PCR/ESI-MS, and scaling the observed 1.1% linearly from the 2.15% of the genome analyzed to the whole genome, about 50% of the human H3N2 virus samples would have a mixed population of viruses.
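The extrapolation above is simple arithmetic and can be checked directly. The sketch below uses only the figures given in the text (293 nucleotides, 2.15% of the genome, 2 of 174 samples). The second estimate, which treats the genome as a set of independent windows of equal size, is an added illustrative assumption and not a calculation taken from the study.

```python
analyzed_nt = 293          # non-overlapping nucleotides analyzed (from the text)
genome_fraction = 0.0215   # fraction of the influenza A genome covered (from the text)
samples = 174              # H3N2 samples from the 2005-2006 season
mixed_samples = 2          # samples with evidence of a mixed population

# Genome length implied by the two figures above.
implied_genome_nt = analyzed_nt / genome_fraction

# Observed rate of mixed populations within the analyzed region.
observed_rate = mixed_samples / samples

# Linear scaling to the whole genome, as in the text's estimate.
linear_estimate = observed_rate / genome_fraction

# Illustrative alternative (assumption): the genome consists of
# 1/genome_fraction independent windows, each mixed with probability observed_rate.
independent_estimate = 1 - (1 - observed_rate) ** (1 / genome_fraction)

print(f"implied genome length: {implied_genome_nt:.0f} nt")
print(f"observed mixed rate:   {observed_rate:.2%}")
print(f"linear estimate:       {linear_estimate:.1%}")
print(f"independent windows:   {independent_estimate:.1%}")
```

The linear scaling gives a value slightly above one half, while the independent-window variant gives a somewhat lower figure; both are of the same order as the "about 50%" quoted in the text, and both rest on the stated assumption of a uniform mutation rate across the genome.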
Choosing amongst the various molecular methods available for pandemic influenza surveillance requires consideration of both practical issues (e.g., broad availability, convenience, cost, and throughput) and scientific issues relevant to public health (e.g., sensitivity, breadth of coverage, and the depth and value of the information provided). At one end of the spectrum, a conventional RT-PCR test with specific primers and probes provides a highly specific, sensitive, rapid, convenient, quantitative, relatively inexpensive, and high-throughput format that can provide valuable surveillance information. However, these tests are not optimal for surveillance when the exact nature of the pandemic virus is not known. Moreover, without supplemental nucleic acid sequencing, conventional RT-PCR-based tests are not capable of signaling the appearance of new genetic variants, except by potentially demonstrating a loss of sensitivity. Further, a single RT-PCR test can achieve only a single presence/absence analysis limited to the specific target for which it was designed. Discrimination of all known variants of influenza at the level of resolution described here would require hundreds of independent RT-PCR reactions. At the other end of the spectrum, virus isolation using culture methods followed by complete genome sequencing does not require prior knowledge of the virus' sequence and provides clade-level resolution and highly detailed information regarding virus evolution. Unfortunately, this method is slow, labor intensive, expensive, and low throughput, rendering it ineffective in public health arenas requiring rapid response.
In this work we describe a novel method that employs some of the best properties of each of the discussed techniques, and also supplies additional valuable information not provided by those techniques. For example, our method may identify mixed populations of viruses, either as viral quasispecies as previously illustrated (i.e., development of “drift” strains) or co-infections with circulating strains (i.e., potential for development of “shift” strains). Recent advances in ESI-MS using bench-top mass spectrometers have enabled analysis of PCR amplicons with sufficient mass accuracy that the nucleotide base composition (the A, G, C, and T count) of the PCR amplicon can be unambiguously determined
Second, mixed viral populations in the same sample that differ by only a single mutation can be identified even when the minor variant makes up as little as 1–2% of the virus population, providing early insights into viral evolution as an integral component of surveillance. This information-rich result is provided with the same throughput and consumable costs as conventional, sequence-specific RT-PCR assays.
The results from 174 influenza A H3N2 samples from the 2005–2006 season in the northern hemisphere were particularly interesting because they revealed viral evolution during a single season. The viruses detected appeared to have been seeded from two of the more abundant BC types circulating during the previous season in the southern hemisphere. The majority (97) of the samples had identical BC types and probably arose from a single founder from the previous season in the southern hemisphere. Most of the remaining samples differed from this founder BC type by one or two additional point mutations within the target regions described here. Surprisingly, when mutations occurred, they became fixed rapidly in the viral population, since only two of the 174 samples from the 2005–2006 season showed evidence of mixed populations. Sequencing revealed that both mutations were silent transitions in third codon positions.
In summary, the RT-PCR/ESI-MS method has the capacity to provide rapid diagnosis of human influenza in individual patients with respiratory symptoms, as well as public health surveillance of emerging, potentially pandemic strains, including novel reassortants. The use of RT-PCR/ESI-MS for human as well as avian/animal surveillance offers the potential for new insights into viral evolution on a scale and at a cost previously not possible. Bench-top mass spectrometers are capable of analyzing complex PCR products at a rate of approximately one reaction product/minute, making the RT-PCR/ESI-MS technology practical for large-scale analysis of clinical specimens or for animal surveillance. Further, as we have demonstrated with influenza viruses, this method provides sensitive detection directly from patient specimens with specificity approximating sequence-level resolution. The ability to quantitate virus shedding, detect low-abundance mixed infections, and identify new genetic variants without prior knowledge of viral sequence also makes this technology ideally suited for monitoring the emergence of drug-resistance mutations during therapy or for identifying newly emerging antigenic variants.
Eighteen human influenza virus A (H1N1, H3N2) and six human influenza virus B isolates were obtained as tissue culture fluids from the Naval Health Research Center (NHRC, San Diego, CA). NHRC also supplied 336 human respiratory specimens (240 throat swabs, 26 nasal swabs, and 70 nasal wash specimens) collected and archived from various U.S. military bases around the country from 1999 through 2005. The New York State Department of Health, Wadsworth Center (Albany, NY) supplied 100 respiratory specimens collected from 1999 through 2005 (88 nasopharyngeal aspirates; 12 assorted nasal aspirates, BAL, tracheal aspirates, and throat swabs). Johns Hopkins University (JHU, Baltimore, MD) provided 229 nasal aspirates collected during 2003–2005. The Texas Department of State Health Services provided a total of 574 assorted throat, sputum, and nasopharyngeal aspirate specimens collected from 2005 to 2006. The 63 avian isolates from 16 different avian species were obtained from the United States (The University of Georgia collection), Egypt, and Asia/Middle East (Naval Medical Research Unit #3 (NAMRU-3), Cairo, Egypt). NAMRU-3 also provided all of the HPAI (avian H5N1) isolates collected from 2006 outbreaks in Egypt, Iraq, and Central Asia. Equine (VR-297) and swine (VR-333) isolates were obtained from the American Type Culture Collection (ATCC). Additional swine isolates were provided by Dr. Gregory Gray from the University of Iowa. All samples were collected under appropriate human or animal use protocols, or under granted exemptions for use of anonymous clinical specimens, from the respective organizations.
Viral stock samples consisting of cultured virus or stocks obtained from ATCC were prepared for analysis using the Qiagen QiaAmp Virus kit (Valencia, CA). Both manual (mini spin) kits and BioRobot kits were used. Robotic-based isolations were done on both the Qiagen MDx and Qiagen BioRobot 8000 platforms. Clinical swab samples were stored in Viral Transport Media (VTM). VTM solution (1 ml) was passed over a 0.2 micron filter, which was then subjected to bead beating in a small amount of lysis buffer. The resulting viral lysate was then used following the same protocol as above.
Based upon analysis of multiple influenza sequence alignments, pan-influenza virus RT-PCR primer sets were developed that were capable of amplifying all three influenza virus species (A, B, and C) and subtypes (HxNy) from different animal hosts (e.g., human, avian, and swine) and distinguishing essential molecular features using base composition signatures. Additional primer pairs were designed that broadly amplified all known members of a particular species, but that did not cross-amplify members of different species (e.g., pan-influenza A and pan-influenza B primers). A surveillance panel of eight primer pairs (
One-step RT-PCR was performed in a reaction mix consisting of 4 U of Ampli
The RT-PCR products were analyzed using the Ibis T5000 universal biosensor platform (Ibis Biosciences, Inc., Carlsbad, CA;
To demonstrate the capabilities of RT-PCR/ESI-MS to track the evolution of circulating influenza viruses, we used a bioinformatic approach to develop a framework on which to display the RT-PCR/ESI-MS results obtained with H3N2 viruses. Complete genome sequences of all H3N2 human influenza viruses available in GenBank were analyzed. A total of 731 genomes were included, from which we inferred the phylogeny of H3N2 influenza virus since 1996. Using the 565-nucleotide concatenated sequence of the six loci that we analyzed by RT-PCR/ESI-MS, we constructed a non-redundant alignment of 105 sequence types. Base compositions were determined for each genome segment (i.e., locus) analyzed and each unique base composition at each of these loci was assigned a letter according to decreasing number of occurrences (therefore, the letter A represents the most common allele identified at each locus). The concatenation of the six base-composition letters from the PB1, NP, M1, PA, NS1, and NS2 loci from the sequences of 731 H3N2 viruses (1996–2005) available in GenBank yielded 95 six-letter codes referred to as base composition types (BC types). The predominant type is labeled AAAAAA. The topology of the tree was then deduced from the alignment of non-redundant sequences using the programs
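The letter-coding scheme described in this section (one letter per locus, assigned by decreasing frequency of each unique base composition, then concatenated into a six-letter BC type) can be sketched as follows. This is a minimal illustration and not the authors' software: the function names, the toy sequences, and the tie-breaking rule (first encountered wins) are assumptions, and the sketch only supports up to 26 distinct compositions per locus.

```python
from collections import Counter

# Locus order as listed in this section.
LOCI = ("PB1", "NP", "M1", "PA", "NS1", "NS2")


def base_composition(seq):
    """Count bases in A, G, C, T order (the order used for signatures in this paper)."""
    seq = seq.upper()
    return tuple(seq.count(base) for base in "AGCT")


def assign_bc_types(genomes):
    """Map each genome (dict of locus -> amplicon sequence) to a six-letter BC type.

    At each locus, unique base compositions are ranked by decreasing number of
    occurrences, so 'A' denotes the most common allele at that locus.
    """
    letter_maps = {}
    for locus in LOCI:
        counts = Counter(base_composition(g[locus]) for g in genomes)
        # most_common() sorts by decreasing count; ties keep first-seen order.
        ranked = [comp for comp, _ in counts.most_common()]
        letter_maps[locus] = {comp: chr(ord("A") + i) for i, comp in enumerate(ranked)}
    return ["".join(letter_maps[locus][base_composition(g[locus])] for locus in LOCI)
            for g in genomes]


# Toy, hypothetical amplicons: two identical genomes and one with an M1 variant.
common = {locus: "ACGT" for locus in LOCI}
variant = dict(common, M1="ACGG")
bc_types = assign_bc_types([common, common, variant])
print(bc_types)
```

In this toy input the variant differs from the common genome only in the third letter, which corresponds to the M1 locus in the ordering above.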
Influenza virus primer pairs used in this study. The GenBank reference sequence for each segment is indicated; however, the primer sequences are not identical to the reference sequence as described in the
(0.06 MB PDF)
Distribution of BC-types observed in influenza A H3N2 positive human respiratory samples. Unique base compositions at each genome segment locus analyzed were assigned letter codes and concatenation of letter codes across the six loci analyzed yielded BC-types. H1N1 samples were not assigned a BC-type. Experimentally determined BC-types (marked RT-PCR/ESI-MS Analysis results) were compared to BC-type signature information of sequences currently available in GenBank and the closest matching strain is shown (right pane). Last column shows comparison to clade designation described in Holmes et al
(0.24 MB PDF)
Mutations observed in influenza virus isolates in 2005–06 season. Out of the 174 influenza A H3N2 positive samples analyzed from the 2005–2006 season, 29 genotypes were assigned based on RT-PCR/ESI-MS and 30 were assigned based on sequencing. There were no instances of “ESI-MS silent” compensating double mutations (e.g., simultaneous A>G and G>A within the same amplicon leading to no change in the observed BC-type) as verified by sequencing.
(0.10 MB PDF)
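The notion of an "ESI-MS silent" compensating double mutation can be made concrete with a small example. Base composition records only the counts of A, G, C and T in an amplicon, so a simultaneous A>G and G>A change within the same amplicon leaves the signature unchanged even though the sequence differs. The sequences below are short hypothetical strings chosen for illustration, not real influenza amplicons.

```python
def base_composition(seq):
    """Count bases in A, G, C, T order."""
    seq = seq.upper()
    return tuple(seq.count(base) for base in "AGCT")


reference = "ACGTAGGA"     # hypothetical amplicon
single = "ACGTGGGA"        # A>G at the fifth base: composition changes
compensated = "ACGTGGAA"   # A>G at the fifth base plus G>A at the seventh base

print(base_composition(reference))
print(base_composition(single))
print(base_composition(compensated))

# The single mutant is distinguishable by base composition; the double mutant is not.
print(base_composition(single) != base_composition(reference))
print(base_composition(compensated) == base_composition(reference))
```

This is why sequencing was used as an independent check: only sequence data can confirm that no such compensating changes were hidden behind an unchanged BC-type.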
The authors thank the CDC (Extramural Grant: R01-CI-000099) and NIAID (NIH Grant 1UC1AI067232-01) for financial support, Drs. Gregory Gray and Troy McCarthy from University of Iowa for providing swine influenza isolates, Dr. Wendy Sessions from the Medical Virology Laboratory, Texas Department of State Health Services, Austin, Texas for 2005–06 human influenza samples, and Dr. Jackie Wyatt for critical review of the manuscript. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the U.S. Army.