Transcriptional Changes in the Hookworm, Ancylostoma caninum, during the Transition from a Free-Living to a Parasitic Larva

Background Third-stage larvae (L3) of the canine hookworm, Ancylostoma caninum, undergo arrested development preceding transmission to a host. Many of the mRNAs up-regulated at this stage are likely to encode proteins that facilitate the transition from a free-living to a parasitic larva. The initial phase of mammalian host invasion by A. caninum L3 (herein termed “activation”) can be mimicked in vitro by culturing L3 in serum-containing medium. Methodology/Principal Findings The mRNAs differentially transcribed between activated and non-activated L3 were identified by suppression subtractive hybridisation (SSH). The analysis of these mRNAs on a custom oligonucleotide microarray printed with the SSH expressed sequence tags (ESTs) and publicly available A. caninum ESTs (non-subtracted) yielded 602 differentially expressed mRNAs, of which the most highly represented sequences encoded members of the pathogenesis-related protein (PRP) superfamily and proteases. Comparison of these A. caninum mRNAs with those of Caenorhabditis elegans larvae exiting from developmental (dauer) arrest demonstrated unexpectedly large differences in gene ontology profiles. C. elegans dauer exiting L3 up-regulated expression of mostly intracellular molecules involved in growth and development. Such mRNAs are virtually absent from activated hookworm larvae, and instead are over-represented by mRNAs encoding extracellular proteins with putative roles in host-parasite interactions. Conclusions/Significance Although this should not invalidate C. elegans dauer exit as a model for hookworm activation, it highlights the limitations of this free-living nematode as a model organism for the transition of nematode larvae from a free-living to a parasitic state.


Introduction
Parasitic nematodes are of considerable medical, veterinary and agricultural importance. For example, it is estimated that the morbidity attributable to hookworms, Trichuris and Ascaris, the three most prevalent parasitic nematodes in humans globally, could be as high as 39 disability adjusted life years (DALY) [1]. This assessment takes into account the long-term impact of infection on cognitive and physical development and the overall health of the host. World-wide, ,1.3 billion people are infected with at least one of these geohelminths [2]. The prevalence of the human hookworms, Ancylostoma duodenale and Necator americanus, alone approaches 740 million, with the foci predominantly within Asia, sub-Saharan Africa, and Latin America [3].
Facultative developmental arrest in the free-living nematode, Caenorhabditis elegans, can occur transiently in the first larval stage (L1) as well as for prolonged periods at the L3 stage. Developmental arrest (often referred to as the dauer stage) in the L3 is triggered in response to conditions, such as crowding, scarcity of food and elevated temperature [4]. When the environment improves, worms exit the arrest to resume development. However, under permissive conditions, arrest is bypassed and adult and reproductive development is favoured. For many parasitic nematodes, arrest at the L3 facilitates survival in the environment.
The exit from arrest marks the return to growth and development as well as the transmission of the parasite to its host. Larvae invade a suitable host and undergo a migration through particular tissues to then establish in a target organ and complete the life cycle or arrest in specific tissues. The infective L3 of many parasitic nematodes produce mRNAs which are thought to relate to invasion, migration, and/or survival [5][6][7][8][9][10]. Therefore, the characterization of mRNAs transcribed in the L3 during its transition from the free-living to the parasitic stage may aid in the identification of genes associated with these processes. An attractive parasite model in which to experimentally study this transition is the dog hookworm, Ancylostoma caninum, for which an in vitro serum-stimulation assay exists [11].
Several molecular aspects associated with serum stimulation have been investigated previously in A. caninum. Some researchers have focused on the release of activation-associated proteins; these molecules include the pathogenesis related protein (PRP) superfamily members Ac-ASP-1 [12] and Ac-ASP-2 [13], and the metalloprotease Ac-MTP-1 [14], all of which represent the most abundant excreted/secreted proteins released by serum-stimulated (activated) L3. Other workers have studied activation-associated genes of hookworms using a transcriptomic approach. For instance, Mitreva et al. [8] generated expressed sequence tags (ESTs) for A. caninum (serum stimulated, unstimulated and tissuearrested L3) and A. ceylanicum (unstimulated L3 and adults), being the first systematic study of genes associated with the host invasion process. However, this study had some limitations in that (1) comparative analyses made between larval stages were qualitative rather than quantitative; (2) some of the observed differences in the abundance of ESTs between activated and non-activated A. caninum L3 seemed to be attributable to differences in the procedures employed for the construction of the cDNA libraries from these life-cycle stages; and (3) the study included a relatively small number of randomly generated sequences available at the time for A. caninum (n = 3840) and A. ceylanicum (n = 3149). Moser et al. [7] addressed the first two points by conducting a quantitative microarray analysis of A. caninum genes associated with the transition to parasitism, focusing on decreased transcription after serum stimulation (i.e., those mRNAs which are ''switched off'' or reduced in transcription upon host entry). However, this study was also limited to known ESTs available in the public databases.
To infer the mRNAs involved in the infective process of A. caninum, we conducted herein a quantitative study of all known A. caninum sequences as well as newly identified genes discovered through suppressive-subtractive hybridisation (SSH) of activated versus non-activated L3 of A. caninum. The method of SSH was employed to selectively enrich differentially transcribed genes [15]. In summary, 242 potentially up-regulated and 109 potentially down-regulated mRNAs were identified by SSH. There were many mRNAs that were differentially expressed but not identified by SSH, although this might be a function of the number of clones randomly sequenced from our subtracted libraries. The final repertoire of activation-associated genes consisted of 240 upregulated and 362 down-regulated mRNAs. Among these nearly 600 activation-associated genes were numerous (often substantially up-regulated) mRNAs encoding PRPs and three of the major catalytic classes of proteases (metallo-, cysteine, and aspartic). Several mRNAs encoding novel secreted proteins without any known homologues were also identified. These mRNAs, if demonstrated to be integral to the parasitic process, could represent a new generation of potential vaccine antigens and drug targets against hookworms.

Parasitological methods
A. caninum L3 were isolated from the faeces of stray dogs in the greater Brisbane area and surrounding towns in Queensland, Australia, using a standard charcoal coproculture method. Cultures were incubated at ,23uC in a humidified chamber for one week, after which L3 were concentrated using a modified Baermann technique and purified through a nylon filter (20 mm). Larvae were stored for up to four weeks in 50 mM Na 2 HPO 4 , 22 mM KH 2 PO 4 , 70 mM NaCl, pH 6.8 [16] in 12.5 cm 2 vented tissue culture flasks in the dark at room temperature until use. In total, four separate groups of L3 representing four separate infections from different geographical locations were obtained. The first group was used for SSH and time course studies, whereas the others were employed as biological replicates in microarray validation and real-time PCR analyses ( Figure 1).

Verification of the specific identity of A. caninum L3
The specific identity of the parasite material was confirmed by PCR amplification of the first and second internal transcribed spacers (ITS-1 and ITS-2) of nuclear ribosomal DNA (as described by [17]) and automated sequencing (using BigDye chemistry, ABI). The sequences determined were required to be identical to those with GenBank accession numbers Y19181 (ITS-1) and AJ001591 (ITS-2).

In vitro activation of larvae
Prior to in vitro activation (serum-stimulation), ensheathed L3 were incubated in 1% HCl for 30 min at ,23uC and then resuspended in RPMI-C (RPMI-1640 tissue culture medium supplemented with 25 mM HEPES (pH 7.0), 100 IU/ml of penicillin, 100 mg/ml of streptomycin, and 40 mg/ml of gentamycin) [12]. To each well of a 24-well tissue culture plate, 5,000 L3 were added. For the SSH, a total of 40,000 L3 were activated in

Author Summary
Hookworms are soil-transmitted nematodes that parasitize hundreds of millions of people in developing countries. Here we describe the genes expressed when hookworm larvae make the transition from a developmentally arrested free-living form to a tissue-penetrating parasitic stage. Ancylostoma caninum can be ''tricked'' into thinking it has penetrated host skin by incubating free-living larvae in host serum -this is called ''activation''. To comprehensively identify genes involved in activation, we used suppressive subtractive hybridization to clone genes that were up-or down-regulated in activated larvae, with a particular focus up-regulated genes. The subtracted genes, as well as randomly sequenced (non-subtracted) genes from public databases were then printed on a microarray to further explore differential expression. We compared predicted gene functions between activated hookworms and the free-living nematode, Caenorhabditis elegans, exiting developmental arrest (dauer), and found enormous differences in the types of genes expressed. Genes encoding secreted proteins involved in parasitism were over-represented in activated hookworms whereas genes involved in growth and development dominated in C. elegans exiting dauer. Our data implies that C. elegans dauer exit is not a reliable model for exit from developmental arrest of hookworm larvae. Many of these genes likely play critical roles in host-parasite interactions, and are therefore worthy of pursuit for vaccine and drug development.
15% serum and 25 mM S-methylglutathione in RPMI-C, whereas 25,000 L3 were incubated in RPMI-C alone (non-activated control). Five thousand L3 were sampled at each of four time points during the activation (1, 6, 13, and 24 h), in order to perform a time-course analysis of transcripts using real-time PCR (described in ''Validation of transcription via real-time PCR''). The same number of non-activated control L3 were separately prepared for this analysis, leaving 20,000 activated and 20,000 non-activated worms for SSH.
For the microarray analysis, activated and non-activated L3 (50,000 of each) were prepared from each of two separate populations of A. caninum. Also, activated and non-activated L3 (5,000 of each) were prepared for real-time PCR. For the in vitro activation, L3 were incubated overnight at 37uC in 5% CO 2 ; pharyngeal pumping in activated L3 was verified by feeding ,100 of them with FITC-BSA (10 mg/ml) for 3 h and fluorescence was detected using a Leica DM IRB inverted microscope with a Leica DC 500 high-resolution digital camera [11]. Activated and nonactivated L3 were each washed twice in phosphate-buffered saline (PBS, pH 7.4; 23uC) and immediately frozen at 280uC. For RNA isolation, larvae were resuspended in 100 ml of Trizol reagent and homogenized in a 1.5 ml tube using an RNase-free, disposable, intube pestle and subjected to three rapid (1 min) freeze/thaw cycles. Trizol was added to a final volume of 500 ml, before snap freezing in liquid nitrogen. These samples were stored for #1 month at 280uC before RNA was isolated.

RNA isolation
Frozen samples of L3 in Trizol were brought to 4uC and centrifuged (16,0006g at the same temperature) for 10 min to remove insoluble debris and residual genomic DNA. RNA was then extracted with chloroform, precipitated with isopropanol, washed with absolute ethanol and resuspended in 50 ml of RNAsefree water. Each RNA sample was treated with 2 U of DNase I (Promega) prior to heat denaturation of the enzyme (75uC for 5 min) and frozen immediately at 280uC. The integrity of RNA was verified to have an RNA Integrity Number .8.0 using an Agilent 2100 Bioanalyzer and RNA 6000 LabChip Kit (Agilent Technologies). RNA used for microarray analysis was stored as an ethanol precipitate in 75% ethanol at 280uC.

Suppression subtractive hybridisation (SSH)
First strand cDNA was synthesized from 1 mg of total RNA using the SuperSmart cDNA synthesis kit (Clontech), according to the manufacturer's protocol. Subsequently, double stranded cDNA was produced through 17 rounds of PCR amplification and purified by phenol:chloroform:isoamyl alcohol (25:24:1) extraction, followed by sodium acetate precipitation. SSH was carried out using the PCR Select cDNA subtraction kit (Clontech) according to the manufacturer's protocol. Briefly, cDNA from activated or non-activated A. caninum L3 was digested with the endonuclease Rsa I and ligated to adapters, yielding tester cDNAs for each treatment. Activated tester cDNA was denatured and allowed to re-hybridize in an excess of non-activated ''driver'' cDNA. This hybridization was termed the forward subtraction and enriched for cDNAs with a higher abundance in activated worms. A reverse subtraction was also performed which enriched for cDNAs which were more abundant in the non-activated worms. Also, an unsubtracted control was prepared according to the standard protocol. Hybridized cDNAs were amplified via two rounds of PCR (according to the recommended protocol), purified by spin-column (QIAGEN) and then cloned into the plasmid vector pGEM-T (Promega). Chemically competent Escherichia coli (TOP 10) were transformed and grown for 8 h at 37uC in Luria Bertani medium (LB) with 100 mg/ml ampicillin. Stocks were stored in glycerol (20%) at -80uC. Immediately prior to sequencing, 50 ml of LB (22uC) was added to each E. coli stock and grown overnight on LB agar plates containing 100 mg/ml of ampicillin. Recombinant colonies were isolated by blue/white selection and then arrayed on a grided LB plate containing 100 mg/ml of ampicillin. Inserts amplified using the TempliPhi DNA Sequencing Template Amplification kit (GE Healthcare Life Sciences) were sequenced unidirectionally using the T7 vector primer in an ABI 3730X1 DNA analyser.

EST sequencing and bioinformatic analysis
The chromatograms for all raw ESTs were inspected and processed to remove poor quality sequence, with subsequent removal of contaminating vector sequences using BioEdit software v.7.0.1. Following this pre-processing, ESTs were organized into contigs and clusters through an iterative approach using the Cap contig assembly facility in BioEdit under strict conditions, requiring at least a 100 bp overlap and 95% identity among sequences. The resultant contigs and singletons were named according to a simple convention. A ''C'' in the sequence name identifies sequences composed of multiple ESTs while singletons are indicated with an ''S''. Sequences from the forward-subtracted library have four digit identifiers, whereas those from the reversesubtracted library have three digits. For example, Ac_SSH_C_ 0056 indicates that the SSH sequence 0056 is composed of multiple ESTs from the forward subtracted library. SSH sequences were compared with existing sequences in GenBank and Wormbase (www.wormbase.org) via BLASTx through NCBI (www.ncbi.nlm.nih.gov/BLAST/) and WU-BLAST (www.ebi.ac. uk/blast2). Alignments were considered statistically significant if an E-or P-value was #1610 25 . Neural networks and hidden Markov models were used to predict signal peptides and transmembrane domains by way of the SignalP 3.0 (www.cbs. dtu.dk/services/SignalP/) and TMPred (www.ch.embnet.org/ software/TMPRED.form.html) interfaces, respectively. Conserved protein motifs of activation-associated ORFs were identified using the InterProScan website (www.ebi.ac.uk/InterProScan). Potential proteases were classified using the MEROPS protease database (http://merops.sanger.ac.uk/index.htm). Contigs were also mapped to gene ontology (GO) terms based on sequence similarity using the BLAST2GO platform (www.blast2go.de) which compares all contigs with sequences available in several databases, including Wormbase and Uniprot [18]. Only BLASTx hits with a maximum E-value #1610 210 and a minimum of 50% similarity (default software settings) were selected for annotation. A modified one-tailed Fisher exact test based on a hypergeometric distribution was employed in the identification of GO terms for differentially transcribed genes, which were significantly over-represented [19]. This assessment was made relative to the total number of A. caninum genes which had been GO-annotated. Setting the ''false discovery'' rate limit to 0.5 aided in controlling for multiple testing errors [18].

Microarray: probe design
Sequence data for 9,618 A. caninum ESTs were obtained from the Washington University Genomics Department via the NCBI sequence database (http://www.ncbi.nlm.nih.gov/Genbank/ index.html). Chromatograms were pre-processed with the Phred software [20,21] and organized into contigs and clusters with the Cap3 contig assembly program [22], employing a minimum sequence overlap length of 30 bases and an identity threshold of 95%. Contigs (n = 1311) were assembled from the ESTs and are hereafter designated with ''Contig'', followed by a number between 1 and 1311. The remaining singletons were filtered by BLAST E-values (,0.001) to remove potentially spurious sequences and are henceforth referred to by their GenBank accession number. In total, 2,889 individual sequences were identified from the total EST dataset for A. caninum. Sequences representing individual clusters assembled from the sequence data from the forward and reverse subtracted cDNAs as well as the publicly available repository were combined. The combined dataset (a total of 3,100 representative sequences) were submitted for the design of 60-mer oligonucleotides using eArray (Agilent). A total of 9,288 oligonucleotides (3 per target) were proposed for 3,096 contigs. Of these oligonucleotides, 3,443 possessed a non-self perfect match, resulting in 5,845 representing 1,967 genes suitable for microarray analysis. These 5,845 oligonucleotide probe sequences were electronically submitted using eArray for ink-jet in-situ synthesis onto glass slides by Agilent Technologies.

Microarray: cRNA preparation, hybridization and data analysis
To generate cRNA, 200 ng of total RNA extracted from each activated and non-activated L3 population of A. caninum was reverse transcribed and simultaneously labelled with Cy3 or Cy5 (Agilent). Immediately prior to hybridisation, 500 ng of labelled cRNAs from each of activated and non-activated worms were quantified using a NanoDrop ND-1000 UV-VIS spectrophotometer (NanoDrop), assessed for size distribution and Cy5-dye incorporation using an Agilent 2100 Bioanalyzer and RNA 6000 LabChip Kit (Agilent), mixed together and fragmented. The cRNA from the combined treatments for each population was hybridised to the array in duplicate, with the second hybridisation representing a dye swap to control for any bias in signal intensity between the two dyes. Hybridisations and washes were conducted as per Agilent's Two-colour Gene Expression Hybridisation protocol version 5.0.1. Briefly, 250 ml of hybridisation solution was applied and the microarrays were hybridised for 17 hours at 65uC, 10 rpm. Slides were then washed for 1 minute in Wash Buffer 1 (RT), 1 minute in Wash Buffer 2 (37uC), 1 minute in Acetonitrile (RT) and 30s in Stabilisation and Drying Solution (RT). Slides were scanned using a DNA Microarray Scanner (Agilent). Scanning and feature extraction were performed using Feature Extraction software version 9.1 (extraction protocol GE2-v4_91; Agilent). During extraction, signal intensities were Linear and Lowess-normalized, dye-corrected, and adjusted for local background. Data handling and analysis were carried out using the program SAS v.8.0 (SAS Institute). Processed signal intensities for each probe were averaged across genes, replicates and populations for comparison between treatments by a two-sided ttest with a Type I error rate of 0.01. Only signals differing by at least 1.5 fold (P#0.01) for each population were considered to represent molecules differentially transcribed in A. caninum as a consequence of serum stimulation in vitro. The effects of dye and probe on the mean signal were assessed graphically. Fold changes in hybridisation were expressed as log 2 -transformed ratios. The absolute log 2 ratios within each level-three GO category were averaged and divided by the mean absolute log 2 ratio of all spots on the chip to derive an expression quotient (EQ). The EQ provides an indication of the degree of differential expression associated with a specific GO term.

Validation of transcription via real-time PCR
Reverse transcription real-time PCR was used for the validation of microarray data and for studying levels of transcription in L3 at different time points during the course of serum stimulation in vitro. Ten target sequences were chosen at random and seven others were selected to represent contigs with high, medium and low levels of hybridisation in the microarray. The sequences of all of the primers used in the real-time PCR are listed in Table S2. The single-stranded cDNA template was quantified spectrophotometrically and diluted to an appropriate concentration (2 ng/ml). Two ng of cDNA from each activated and non-activated A. caninum L3 population were subjected to PCR in the presence of 100 nM of the forward and reverse primers in 16 Platinum SYBR Green qPCR SuperMix-UDG (Invitrogen). All experiments were repeat-ed three times with two replicates in each using a Rotor-Gene 6000 Series 2-Plex real-time PCR thermal cycler (Corbett Life Science) employing the following cycling parameters: 50uC for 2 min, 95uC for 2 min, and 40 cycles of 95uC for 15 sec and 60uC for 30 sec. A melt curve analysis was performed from 60uC to 95uC in 1uC intervals to demonstrate the specificity of each amplicon and to identify the formation of primer dimers. Amplicons were also inspected on a 1.2% agarose gel and subjected to automated sequencing to prove their identity. Fold changes in transcripts between activated and non-activated L3 were normalized to the 60S acidic ribosomal protein gene (accession number BF250585) [23] according to an established method [24,25]. The standard error of the log 2 ratios was calculated from the error of the crossing points and observed reaction efficiencies propagated through the calculation of the ratio. Non-parametric statistical inference testing of log 2 -transformed ratios was performed using a pairwise fixed reallocation randomisation approach with 10,000 simulations which calculated the probability of observing ratios of randomly assigned control and treatment pairs greater than or equal to the treatment effect observed.

Activation and feeding
Approximately 120,000 activated and 120,000 non-activated L3 of A. caninum were prepared for RNA extraction. As evidenced by the ingestion of FITC-BSA, .95% of all activated L3 resumed feeding, whereas ,4% of the non-activated L3 fed ( Figure 2). L3 that failed to feed in the presence of the serum stimulus or those that fed in the absence of the stimulus could not be separated from each other.

SSH cDNA libraries and general characteristics of the ESTs
The RNAs from activated and non-activated L3 were extracted, processed, subjected to forward and reverse SSH and cloned into a plasmid vector. A total of 958 sequencing reactions from the forward library and 171 from the reverse library yielded high quality ESTs. The sequences were deposited in GenBank (accession numbers ES671894-ES672870) and dbEST (accession numbers 46880363 -46881339) databases. Clustering of these SSH ESTs yielded totals of 242 forward subtracted and 109 reverse-subtracted contigs (Table 1). Approximately half of all forward subtracted sequences were represented by a single EST, although this figure was ,80% for reverse subtracted sequences. The minimum and maximum lengths of the ESTs were 100 bp and 1500 bp, respectively, with the forward subtracted contigs being slightly larger (571 bp) than those from the reverse subtracted contigs (499 bp). Contigs assembled from the publicly available ESTs for A. caninum had a similar size distribution compared with those assembled from the subtracted ESTs.
From the SSH-derived ESTs, almost 30% of all contigs from both libraries lacked significant sequence similarity to any of the ESTs generated previously for this species [8]. Furthermore, 64-70% of the ESTs from the forward and reverse subtracted libraries respectively did not exhibit significant similarity (at both the nucleotide and protein levels) to sequences within the databases queried (GenBank, EMBL and WormBase) ( Table 2). Most mRNAs identified by SSH (63.8%) had a predicted ORF of .50 amino acids. Of these, 19% from the forward subtracted contigs had ORFs with a predicted signal sequence as compared with 6% from the reverse subtracted contigs.

Microarray validation of SSH
The differential hybridisation of forward and reverse subtracted contigs identified by SSH was verified using a custom designed oligonucleotide microarray. In order to assess the sensitivity and specificity of SSH, we clustered the entire A. caninum EST dataset (9,618 ESTs) and submitted the union of the public ESTs and the SSH contigs for oligonucleotide design. Slides were hybridised with cRNA derived from two separate populations of hookworms (Groups III and IV; Figure 1). The number of L3 obtained from Group III was sufficient to produce two separate pools, serving as a technical replicate for RNA extraction. In total, eight separate hybridisations were performed, one for each of the three RNA samples, plus a dye swap as well as two self-hybridisations, in which Cy3 and Cy5 probes generated from the same RNA stock were used together to hybridise to the slide. For the two populations of A. caninum L3 used, the response to serum stimulation was very similar, as can be seen from the Magnitude (M) versus Amplitude (A) plots generated for each population ( Figure 3). The three different oligonucleotides designed for each target yielded consistent log 2 ratios among the 50 A. caninum control genes (data not shown). Similar log 2 ratios were also observed between arrays and dye-swaps ( Figure S1). Furthermore, real-time PCR analysis, performed on 17 randomly chosen SSHderived sequences, demonstrated the validity of the microarray data. However, the microarray consistently under-estimated the log 2 ratio for highly abundant mRNAs, most likely attributable to probe saturation at both the 100% and 50% scans (data not shown). The data discussed in this publication have been deposited in NCBIs Gene Expression Omnibus (http://www.ncbi.nlm.nih. gov/geo/) and are accessible through GEO Series accession number GSE8155 Activation associated mRNAs In total, 602 mRNAs were associated with log 2 transcription ratios significantly greater than zero (P#0.01). A total of 103 mRNAs new to A. caninum (not in the public databases prior to this study) were identified by SSH, of which 79 had microarray data available. More than 60% of these mRNAs exhibited no significant similarity to any sequences other than A. caninum nor did they have any homologues/orthologues in the publicly available sequences for the congeneric hookworm, A. ceylanicum, or other strongylid nematodes. Sixty-three percent of unique genes, which were up-regulated upon serum stimulation, possessed a signal sequence, in comparison with 33% for down-regulated mRNAs ( Table 3). The ten most abundant mRNAs in activated and nonactivated L3 are listed in Table 4. Cytochrome c oxidase large subunit of nuclear ribosomal RNA and two novel mRNAs were amongst the most highly expressed mRNAs in both activated and non-activated larvae, with all but the two novel mRNAs being slightly, albeit significantly (P#0.01) up-regulated upon stimulation. One of the most abundant mRNAs in non-activated larvae was Ac-mtp-1, encoding a metalloprotease involved in skin penetration [26], and this was the only molecule to also exhibit dramatic differential transcription upon stimulation ( Table 4).
The types of proteins encoded by mRNAs that were upregulated upon serum stimulation are very different from those that were down regulated (Tables 5 and 6). Among the 30 most highly up-regulated mRNAs, 17 encoded members of the PRP superfamily [27]. In addition .30% of these mRNAs were predicted to encode secreted proteins. In contrast, most mRNAs that were highly down regulated upon serum stimulation did not possess signal sequences and represented a more diverse group of molecules, including two heat-shock proteins, three PRPs, several novel sequences and a cytochrome P450. Thirteen mRNAs encoding proteases representing all four mechanistic classes, as determined from the MEROPS database [28], were also differentially expressed following serum stimulation (Table 7). In general, the most abundantly represented group of mRNAs associated with activation was the PRPs. Sixty-one different PRP transcripts were identified among the publicly available ESTs and the SSH dataset herein. Thirty-two of these PRPs were associated with activation, 21 of which shared greater than 50% amino acid identity with at Data points are colour-coded to highlight the difference between significant and non-significant log ratios as well as agreement with predictions based on results achieved using suppressive subtractive hybridisation. Yellow, pink, and white points represent sequences with log 2 ratios which were not significantly different from zero (P.0.01). All other points denote sequences with log 2 ratios significantly different from zero (P,0.01). Results for worms from Group III (A) and Group IV (B). doi:10.1371/journal.pntd.0000130.g003 least one of the other activation-associated PRPs. All but three of the activation-associated PRPs, Ac-asp-2 (AW626807), an mRNA similar to Ac-asp-1 (SSH Contig 017), and un-clustered A. caninum EST (BQ667555), were up-regulated upon serum stimulation.

Time course of activation-associated mRNAs
Selected mRNAs were examined at different time points throughout the course of serum stimulation (Table 8). Four mRNA's were up-regulated and 4 were down-regulated within 1 hour of stimulation, then a further 80 responded to stimulation after 6-13 hours.The mRNAs that were rapidly up-or downregulated following activation included the heat shock protein gene, hsp 12.6, the cysteine protease Ac-cp1, a metalloprotease and an mRNA similar to the Onchocerca related antigen (ora-1) from C. elegans. Several PRPs, as well as the metalloprotease Ac-mtp-1 and a neuropeptide-like protein exhibited obvious differential transcription after at most 6 h. Interestingly, Ac_SSH_0042_A, one of the most highly up-regulated activation-associated PRPs, did not appear to increase in transcription until after 13 h of incubation.

Gene ontology analysis
All activation-associated mRNAs were annotated with GO terms based on sequence similarity using the Blast2GO platform, and 3-level summaries were prepared for each aspect of GO, molecular function, biological process and cellular component ( Figure 4). It is important to note that these classifications provide an estimation only of gene function, because the sequence data used are mRNAs, and often only partial sequences. Nonetheless, we identified a number of gene families that were highly  upregulated in activated L3s. The GO category for catalytic activity was significantly over-represented in the mRNAs which were up-regulated upon serum stimulation. Inspection of higherlevel terms in the GO tree showed that this significance was likely accounted for by the many proteases and other hydrolytic enzymes which were up-regulated during serum stimulation (Table 7). Based on the frequency of proteases encoded by the current gene/ cDNA entries (n = 48) for A. caninum in the NCBI databases, only three such molecules would be expected among the 66 annotated, up-regulated mRNAs. Instead, a total of seven were observed. Additionally, three of the 13 genes predicted to encode metallopeptidases were up-regulated in activated L3. The importance of proteases in activation seems evident from Figure 4, which shows that a majority of the up-regulated genes encode 'protein catabolism' functions, a sub-category of the 'biological process' GO category.
The GO data were also analysed in the context of mRNA expression data [29]. This analysis focused on the degree to which mRNA expression within specified GO categories was greater than the global array average ( Table 9). The absolute log 2 ratios of genes associated with defence and the response to external stimuli were 2.4 to 2.9 times greater than the average. The log 2 ratio of genes encoding proteins with a predicted extracellular localization was greater than three times the average, with this trend being reflected primarily by the PRPs and proteases. Interestingly, mRNAs associated with carbohydrate binding were more highly represented than average (EQ = 2.7). The largest group of annotated genes was associated with catalytic activity. Collectively, these genes had log 2 ratios, which were 1.3 times greater than the average. However, hydrolases and lyases were largely responsible for the trend. Hydrolases (n = 93) and lyases (n = 14) were generally up-regulated 1.5 to 1.7 times more than the average, which was considerable given the size of those GO categories.
The GO classifications can be useful for functional comparisons among species. A comparison was made between activated L3 in A. caninum and C. elegans larvae exiting from dauer. The 3-level charts ( Figure 5) display the distribution of GO terms specific to the category of biological process for the 30-36% of hookworm ESTs where a GO function could be assigned. While the ''poststimulation'' transcriptome of both organisms was dominated by genes associated with cellular and physiological processes, it was evident that dauer exit in C. elegans was associated with a substantial increase in the proportion of genes involved in growth, development and reproduction. By comparison, serum stimulation in A. caninum did not result in an increased representation of these mRNAs. Interestingly, even the pool of annotated sequences from C. elegans dauer larvae included (15%) of mRNAs associated with development. This was not the case for the ensheathed, nonactivated L3 of A. caninum.

Discussion
For many parasitic nematodes, developmental arrest at the L3 stage is critical for their survival in the environment. In the hookworm model, the exit from developmental arrest is associated with the invasion of the host, assessment of host suitability (larval hypobiosis vs. development to adulthood) and evasion of the host immune response. Therefore, it is plausible that genes associated with this exit facilitate these processes. Using a SSH-based EST approach, we explored transcription in the L3 stage of A. caninum during the transition from a free-living to a parasitic larva, by simulating the earliest stage of parasitism by hookworms in the mammalian (canine) host via serum stimulation. The ESTs produced using this approach were then incorporated into a customised oligonucleotide microarray together with a set of known A. caninum sequences, in order to carry out a large-scale analysis of transcripts during this transition to parasitism.

Pathogenesis-related proteins
High-level GO summaries demonstrated that a large proportion (12%) of differentially expressed mRNAs appear to be involved in extracellular localization, of which the majority (27) encode a group of proteins belonging to the PRP superfamily. The greater than average transcription of genes encoding PRPs highlights their Log 2 ratios are reported for the most highly down-regulated genes as evidenced by microarray. The standard error of the Log 2 ratio is provided in parentheses. Sequence similarity searches were conducted through dbEST, NCBI, and Swiss-prot using a translated query. (SP) indicates a predicted signal sequence peptide. (PRP) indicates members of the PRP superfamily. PRPs have been termed Ancylostoma secreted proteins (ASPs) in hookworms. doi:10.1371/journal.pntd.0000130.t006 importance in the transition of A. caninum from a free-living to a parasitic larva. Although the functions of these molecules are largely unknown, the identification of eight A. caninum PRP superfamily members from the excretory/secretory products of larvae [13,30] and adults [31][32][33] suggests involvement in hostparasite interactions. Consistent with this hypothesis is the observation that the N. americanus orthologue of Ac-ASP-2, a major vaccine antigen from N. americanus [34], possesses a crystal structure similar to a chemokine, suggesting that it may serve as an extracellular ligand for an unknown host receptor involved in inflammation [35]. Furthermore, Ac-NIF (neutrophil inhibitory factor) [33] and Ac-HPI (platelet inhibitor) [32], the only two A. caninum PRPs for which in vivo functions have been proposed, both exhibit in vitro activities in the mediation of the inflammatory response. All eight of the A. caninum PRPs characterized to date are secreted proteins. Similarly, the presence of predicted signal peptides in most of the activation-associated PRPs suggests that they too are secreted.

Activation-associated proteases
Although most of the mRNAs encoding extracellular proteins were PRPs, others encoded proteases. The importance of the activation-associated proteases was evident in the high-level GO summary of differentially expressed mRNAs (Table 9). In broad terms, catalytic activity was over-represented in mRNAs from activated L3, with proteases largely reflecting this trend. The activation-associated proteases represented four of the major catalytic families, namely the metallo-, aspartic, cysteine and serine proteases. These proteases may serve roles in host tissue degradation, digestion and/or development. For example, the activation associated mRNA Ac_SSH_C_0180 is a likely homologue of Parelaphostrongylus tenuis cpl-1, which encodes a cysteine protease implicated in the digestion of host tissue during the escape of the L3 from the intermediate snail host [36]. While many mRNAs encoding proteases were upregulated upon activation, a few such as those encoding the astacin-like metalloprotease, Ac-MTP-1, were down-regulated. MTP-1 is thought to play a critical role in skin penetration in vivo [26], thus supporting its candidacy as a vaccine antigen [37]. A similar scenario exists in larval schistosomes, where the major protease involved in tissue penetration is pre-synthesized and its mRNA is down-regulated before the cercarial stage infects the mammalian host (reviewed in [38]). Proteases also serve roles in nematode development. For example, O. volvulus cathepsin Z (Ov-cpz-1) is expressed in the cuticle of O. volvulus and is essential for the moult of the L3 to L4 stage [39]. Evidence suggests that the cpz-1 orthologue in C. elegans is also necessary for normal moulting and development [40]. The activation-associated A. caninum mRNA, BQ125325, is cathepsin Z-like and could therefore fulfil a similar role in the moult of hookworm L3 to L4. In addition to moulting, the beginning of the exit from dauer in C. elegans involves a major neurological restructuring [41]. The aspartyl protease Ce-ASP-2 plays an important role in neurodegeneration in this species [42], and the association of a likely hookworm orthologue (Ac_SSH_C_0068) with serum stimulation may also indicate a role in neurological development.
Lastly, other activation-associated proteases may be involved in the digestion of host proteins for nourishment. Ac_SSH_C_0046 is 60% identical to H. contortus pepsinogen (CAA96571) and necepsin I The expression of selected activation-associated mRNAs and hsp-12.  (also referred to as Na-APR-2) from N. americanus (CAC00542.1) [43]. The pepsinogen of H. contortus is expressed in the gut of the adult stage, and mRNAs have been detected in the L4 and adult stages but not in the L3 [44]. Furthermore, its ability to degrade haemoglobin indicates that it could be involved in feeding [44]. The N. americanus aspartyl proteases Na-APR-2, Na-APR-1 and the A. caninum orthologue, Ac-APR-1, are all expressed in the gut of the adult stage where they digest haemoglobin [45].

Comparisons with C. elegans
Mitreva et al. [8] observed that nearly 80% of the A. caninum clusters that were publicly available (before our study here) shared some degree of significant sequence similarity with C. elegans sequences. Moser et al. [7] compared the ''serum-stimulated expression data'' from many of these clusters to the 1,984 mRNAs associated with dauer exit in C. elegans. This was done on a geneby-gene basis, and it was observed that cytochrome P450, two neuropeptides, phospholipase and alcohol dehydrogenase were enriched in the dauer form of C. elegans and in non-activated L3 of A. caninum [7]. Conversely, these authors identified mRNAs representing cytochrome c oxidases, an arginine kinase, a heat shock protein, a glycerol hydrolase and glyceraldehyde 3phosphate dehydrogenase, which were up-regulated in both nematode species following stimulation. Our findings are in accordance with these reports. Neither our study nor that of Moser et al. [7] identified major similarities between the ''activated'' states of C. elegans and A. caninum. This was attributed to the fact that many of the C. elegans genes which were up-regulated were under-represented in the A. caninum dataset [7]. However, even after enriching for mRNAs that are differentially expressed between free-living and activated L3, the lack of similarity between recovered dauers and activated hookworm L3 persisted.
In lieu of a gene-by-gene approach, we used species-independent GOs to assess the similarity of the relevant A. caninum and C. elegans transcriptomes. This analysis made use of the microarray data generated by [46]. This comparison demonstrated that GO annotations specific to growth, development and reproduction were highly represented in the recovered dauers of C. elegans ( Figure 5). This was not the case in A. caninum and is consistent with the observation that serum-stimulation does not invoke moulting of the L3 stage [47]. Conversely, mRNAs from C. elegans dauers or larvae recovered 12 h after beginning of the exit from dauer did not exhibit the significant over-representation of extracellular products as was observed for A. caninum. This finding supports the hypothesis that many of the highly up-regulated mRNAs encoding putatively secreted products are involved in parasitism.
Another major difference between ''activation'' in C. elegans and A. caninum is the down-regulation of several mRNAs encoding genes involved in G-protein coupled signal transduction during dauer exit in C. elegans. Such a down-regulation was not observed for A. caninum in the present study or that of [7]. As opposed to the PRPs of activated A. caninum L3, the predominant transcripts in C. elegans 12 h after beginning the exit from dauer included a plethora of collagens, many of which were up-regulated $32-fold [46]. Even after enrichment for activation-associated mRNAs, only three potential collagens were identified from A. caninum and only one of these (cuticulin) was significantly up-regulated. Based on this information, the activation of A. caninum larvae and the exit from dauer involve considerably different mRNAs. However, the mechanisms by which these mRNAs are regulated may be similar. Ce-hsp-12.6 is a well-known direct target of the fork head transcription factor (designated DAF-16) in C. elegans [48]. Transcripts for this gene are down regulated during dauer exit. Interestingly, it was observed that the A. caninum orthologue of hsp-12.6 (contig 313 from the publicly available ESTs) was also significantly down-regulated during serum stimulation. Assuming that the transcription of this gene is also under the direct regulation of a DAF-16 homologue, the earliest transcriptional events in the transition of A. caninum L3 to parasitism may also be regulated by DAF-16. Real-time PCR conducted on several activation-associated mRNAs at various time points throughout the serum-activation process showed that the levels of many of these mRNAs changed rapidly, as half of those assessed achieved log 2 ratios of noticeably more than zero in less than 1 h. The hsp-12.6 transcript was represented in this group of ''early responder'' molecules. Other mRNAs with similar expression profiles may also be directly regulated by DAF-16. Interestingly, one of the most highly up-regulated PRP mRNAs did not increase substantially in transcription until 13 h after serum-stimulation, which suggests that it may be under the indirect control of DAF-16. Expression quotients were calculated for each 3-level gene ontology term mapped to activation-associated genes. The quotients are calculated as the mean of the absolute log 2 ratio of a category divided by the global log 2 ratio of the chip. A single asterisk indicates that the 95% confidence interval for the quotient did not include 1. Differences observed in n (number of genes with array data) and N (number of annotated genes in the suppression subtractive hybridization and public EST databases) are due in combination to probe failure as well as an inability to produce unique probes for some genes. doi:10.1371/journal.pntd.0000130.t009

Genes of unknown function
The SSH enrichment of activation-associated mRNAs identified 17 sequences which were up-regulated $4-fold and appeared to be unique to parasitic nematodes (Table S1). For example, the EST Ac_SSH_C_0056 was similar in sequence to an uncharacterised gene, ora-1, from C. elegans which is related to an O. volvulus antigen (Ov39) and is thought to play a role in the ocular pathogenesis caused by this parasite [49]. The mRNA representing Ac_SSH_S_0199 was up-regulated nearly 9-fold with the EST showing sequence similarity to the genes Hc-nim-1 and Hc-nim-2 from H. contortus. The mRNAs encoding these genes are abundant in adult H. contortus and represent almost 10% of total mRNA [50]. Hc-NIM-1 is expressed in the hypodermis of the pharyngeal region of the adult worm [50]. Lastly, SSH contigs 0099 and 0032 were among the most highly up-regulated mRNAs in activated L3 of A. caninum, with log ratios of ,4.7 and 5.9, respectively. Both were predicted to possess a signal peptide and appeared to be specific to A. caninum. Their apparent novelty and stage specificity suggest that they are parasite-specific molecules which might be involved in interactions with host tissues. Functional characterization of these and other novel activation-associated mRNAs may provide insights into the roles that such molecules play in the transition to parasitism. Furthermore, this information may warrant investigating their potential as targets for novel therapeutics.

Conclusions
Having identified a suite of mRNAs associated with serum stimulation, future efforts should be focused on gaining an understanding of the biological function/s of selected members of these parasitism-associated genes. Of particular interest is the large group of PRPs that are up-regulated upon serum stimulation. In combination with their considerable stage specificity and diversity, many of these PRPs may have evolved to perform several coordinated yet distinct functions involved in the parasitic process.
Given that PRP-like proteins occur in a wide range of taxa, delineating their function could potentially provide a deeper insight into their roles in parasitism as well as their broader biological significance. It is also of interest that the two most efficacious hookworm vaccine antigens, ASP-2 and APR-1, are members of the two most represented families/groups of proteins associated with this transition to parasitism, PRPs and proteases. We believe that this bodes well for the pursuit of these new molecules identified by SSH as targets for novel vaccines and drugs.
The near absence of mRNAs associated with reproduction, growth and development among activated hookworm L3 probably reflects their ability to further arrest in tissues of non-permissive hosts or in the external environment when conditions for transmission are unfavourable. Although this should not invalidate C. elegans dauer exit as a model for hookworm activation, it highlights the limitations of this free-living nematode as a model organism for the transition of nematode larvae from a free-living to a parasitic state. Figure S1 Magnitude (M) versus Amplitude (A) plot of dye-swap self-hybridisations. RNA from serum-activated third-stage larvae (L3) of A. caninum was labelled with Cy3 or Cy5 for ''self'' hybridisation. The resultant log ratios were plotted against the log2 of the mean signal intensity, thus providing a visual means of inspecting potential artefacts for a range of signal intensities. A selfhybridisation was also performed for RNA extracted from nonactivated L3 as a control. Found at: doi:10.1371/journal.pntd.0000130.s001 (0.08 MB PPT)