Stress Responses of Small Heat Shock Protein Genes in Lepidoptera Point to Limited Conservation of Function across Phylogeny

The small heat shock protein (sHsp) family is thought to play an important role in protein refolding and signal transduction, and thereby to protect organisms from stress. However, little is known about sHsp function and conservation across phylogenies. In the current study, we provide a comprehensive assessment of small Hsp genes and their stress responses in the oriental fruit moth (OFM), Grapholita molesta. Fourteen small heat shock proteins of OFM clustered with related Hsps in other Lepidoptera despite a high level of variability among them, and in contrast to the highly conserved Hsp11.1. The only known lepidopteran sHsp ortholog (Hsp21.3) was consistently unaffected under thermal stress in the Lepidoptera in which it has been characterized. However, the phylogenetic position of the sHsps within the Lepidoptera was not associated with conservation of induction patterns under thermal extremes or diapause. These findings suggest that the sHsps have evolved rapidly to develop new functions within the Lepidoptera.


Introduction
Heat shock protein (HSP) families have been widely associated with physiological processes and stress resistance. The small heat shock proteins (sHSPs) (12-43 kDa) represent a particularly heterogeneous family of HSPs [1], in contrast with other HSPs such as HSP70 and HSP90 that are highly conserved [2,3]. Many sHsp genes are upregulated under thermal stress and may protect insects against stressful conditions; for example, Hsp26 is thought to protect Aedes aegypti larvae and pupae from heat stress [4], while Hsp20 may underlie a tradeoff between thermal protection and reproductive impairment in Liriomyza huidobrensis [5]. The sHsp genes may also play a role in cold tolerance, as suggested by the upregulation of Hsp20 expression during the recovery of Drosophila melanogaster from cold injury [6]. Hsp23 is strongly upregulated during pupal diapause in the flesh fly, Sarcophaga crassipalpis [7,8], while Hsp20.8 expression has been linked to diapause in Sesamia nonagrioides [9]. The sHsps therefore represent an important inducible chaperone family involved in stress responses.
The conservation of sHsps has rarely been considered in insects, in contrast to Hsp70, which shows a high degree of evolutionary conservation both in its sequence and in its induction pattern [10,11]. The functional significance of the sHsps in many insect orders, including the Lepidoptera, remains unclear despite an increasing amount of sequence information becoming available. Lepidopteran sHsps are thought to have evolved independently from sHsps in other orders, except for one ortholog, the Hsp21.4-like protein, which has been discovered in insect species from multiple orders [12-14]. The Hsp10 family also appears to be highly conserved across insects, and also in vertebrates, where it acts as a co-chaperone with Hsp60 [15]. Some studies have considered sHsp expression patterns in Lepidoptera in an attempt to understand their potential function. Changes in expression patterns suggest that the sHsp genes in the armyworm, Spodoptera litura, may have roles in cold responses as well as moth metamorphosis [14,16]. In the codling moth, Cydia pomonella, three sHsps were upregulated under heat stress and were persistently expressed during development [17], suggesting a role in both processes.
The Oriental fruit moth (OFM), Grapholita molesta, one of the world's most invasive orchard pests of stone and pome fruits, causes severe economic losses in the global fruit industry [18-20]. It is thought to have originated in China and spread to East Asia, Europe, Africa, Australia, and America during the last century [20]. OFM has gained recognition as a global pest of peaches and other fruit crops since its introduction into America [21]. Its generation time depends on temperature and latitude, and varies considerably across regions; for example, one or two generations occur annually in northeastern Asia, compared to six to seven in Georgia, USA [22,23]. To overwinter in cold environments, OFM enters diapause at a late larval stage, which is induced by low temperature and a short photoperiod [18]. The developmental threshold of OFM ranges from 4°C to 11°C at the lower end, and from 30°C to 35°C at the upper end, suggesting a wide thermal performance breadth which has likely assisted its worldwide expansion [19].
Here we describe the sHsp genes in OFM and their expression patterns across developmental stages, across tissues and in response to heat stress, cold stress and diapause. We use these data to address a number of questions. (1) How do the sHsp and Hsp10 genes of OFM compare to those of other Lepidoptera? Do they fall into specific phylogenetic groupings? (2) Are there unique or common patterns of expression changes in the sHsps under the different stresses or at different temporal/spatial stages? (3) Has the putative functional role of particular sHsps, as defined by their patterns of induction, been conserved across different lepidopteran species, such that related sHsps group together?

Insect preparation
We collected OFM larvae from an experimental station in Hebei province (N39°47′, E118°41′), where the pests could be sampled without permission. The field studies did not involve endangered or protected species. OFM were reared on fresh apples in the laboratory over two years at 24°C and 60% relative humidity under a photoperiod of 15:9 (L:D). Six developmental stages of OFM (3-day-old eggs, 1st instar larvae, 3rd instar larvae, 5th instar larvae, pupae, and adults with equal numbers of males and females) were collected and frozen in liquid nitrogen before being stored at -80°C for later RNA extraction. The 5th instar larvae were dissected on ice, and the tissues, including head, thorax, abdomen, fat body, and midgut, were kept in RNAlater (Qiagen, Hilden, Germany) at -80°C.
To characterize expression patterns of Hsps in adults under thermal extremes, high and low temperatures were initially defined based on pilot experiments; these were the most severe conditions to which recently eclosed adults could be exposed in the absence of mortality. Moths that had eclosed within 24 hours were therefore kept at -5°C or 40°C for 1 or 2 hours, and then left to recover at 24°C for 2 hours before being frozen in liquid nitrogen. Controls were treated identically but were not stressed. The thermal treatment was based on Drosophila Hsp studies [6,24] and a pilot experiment in which different recovery and stress periods were considered. Those sHsps that were responsive under 2 h were also highly responsive under shorter or longer exposure times, while non-responsive sHsps failed to be expressed under all conditions tested. For the diapause treatment, diapause was induced by rearing moths at the same temperature of 24°C under a photoperiod of 12 hours [25]. Each treatment consisted of five biological replicates, with 14-20 individuals (7-10 per sex) per replicate.

RNA extraction and cDNA synthesis
Total RNA was isolated with an RNeasy Mini kit (Qiagen, Valencia, CA, USA) and digested with DNase I (Invitrogen) following the supplier's instructions. The concentration of RNA was measured using a NanoDrop spectrophotometer (ND-1000, Thermo Fisher Scientific, Wilmington, DE). For first strand cDNA synthesis, 2 μg of total RNA was used as a template with M-MLV reverse transcriptase and modified oligo (dT) (Promega, Madison, WI, USA), following the manufacturer's instructions. The resulting cDNA templates, in a final volume of 25 μl, were used for real-time quantitative PCR of gene expression. The single strand cDNA serving as the RACE template was synthesized from 1 μg of total RNA from the whole body of OFM fifth instar larvae, following the instructions of the SMART RACE cDNA Amplification Kit (Clontech, Mountain View, CA, USA).

Cloning of full-length OFM sHsps
The conserved α-crystallin domain of lepidopteran sHsps was used to design degenerate primers for full-length sHsps in OFM. 5' and 3' RACE were done separately with universal primers and gene-specific primers under the following cycles: five cycles of 94°C for 5 s and 72°C for 3 min, then five cycles of 94°C for 5 s, 70°C for 10 s and 72°C for 3 min, and then 25 cycles of 94°C for 5 s, 68°C for 10 s and 72°C for 3 min. A final extension was performed at 72°C for 10 min. The PCR products were visualized by electrophoresis in a 1.5% agarose gel and purified using an ABgene Ultra PCR Clean-Up Kit (Thermo Scientific) before they were cloned into a pGEM-T Easy vector (Promega, Madison, WI, USA). The products were sequenced on a 3730xl DNA Analyser (Applied Biosystems).

Real-time quantitative PCR (RT-qPCR)
The sHsp RT-qPCR primers were designed based on the full-length cDNA transcripts (S1 Table). All cDNA templates were reverse transcribed from 2 μg total RNA in a final volume of 25 μl, and 1 μl was used in each RT-qPCR reaction. β-actin and GAPDH were both chosen as reference genes for normalizing mRNA expression levels. However, data were also normalized without reference genes through the Norma-gene normalization algorithm [26], which normalizes on the basis of the experimental data rather than reference genes. This avoids both the issue of error in reference gene expression and the assumption that the reference genes are not influenced by the different conditions tested. The sHsp transcript standards were produced by cloning the PCR products into plasmids. Seven 10-fold serial dilutions of stock plasmid served as quantitative standards to estimate the relative expression of each sHsp gene transcript. RT-qPCR was conducted in a 20 μl volume comprising 10 μl 2×SYBR Green PCR Master Mix (Takara, Dalian, China), 1 μl cDNA template/plasmid, 1 μl RT-qPCR primers (10 mM working solution), and 8 μl ddH2O. A Stratagene Mx3000P thermal cycler was used. The PCR cycling parameters were as follows: initial denaturation of 10 s at 95°C, and 40 cycles of 5 s at 95°C, 20 s at 55°C, and 20 s at 72°C. Each experimental group included five biological replicates, each of which contained three technical repeats.
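The standard-curve quantification described above can be illustrated with a minimal Python sketch. It assumes that the quantification cycle (Ct) is linear in log10 of template copy number; the copy numbers, Ct values and function names are hypothetical and are not taken from this study.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Interpolate the copy number of an unknown sample from its Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards: seven 10-fold dilutions (10^8 down to 10^2 copies).
std_log10 = [8, 7, 6, 5, 4, 3, 2]
# Invented Ct values lying exactly on a line with slope -3.32.
std_ct = [38.0 - 3.32 * x for x in std_log10]

slope, intercept = fit_standard_curve(std_log10, std_ct)
efficiency = amplification_efficiency(slope)
unknown_copies = copies_from_ct(38.0 - 3.32 * 5, slope, intercept)
```

In practice the three technical repeats of each sample would be averaged before interpolation, and the estimated copy numbers would then be normalized as described in the text.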
Differences between treatments in normalized expression were compared either by t-test (for comparison of two means) or by one-way analysis of variance (ANOVA) followed by a Tukey B test for post hoc comparisons among means, run with SPSS 18.0 software. Treatment differences were considered significant at p < 0.01. Values are denoted as means ± SE (standard error). Unsupervised hierarchical clustering was performed with Cluster v3.0 software using uncentered Pearson correlations and complete linkage, and visualized with Java TreeView software [27,28].
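For readers unfamiliar with the clustering settings, the following Python sketch illustrates the uncentered Pearson correlation and the complete-linkage rule. It is an illustration under stated assumptions rather than the Cluster 3.0 implementation; the function names and the example profiles are hypothetical.

```python
import math


def uncentered_pearson(x, y):
    """Correlation computed without subtracting the means (means taken as zero)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm


def distance(x, y):
    """Dissimilarity used for clustering: 1 minus the uncentered correlation."""
    return 1.0 - uncentered_pearson(x, y)


def complete_linkage_distance(cluster_a, cluster_b):
    """Complete linkage: distance between the two most dissimilar members."""
    return max(distance(x, y) for x in cluster_a for y in cluster_b)


# Hypothetical expression profiles (e.g., log fold changes in three treatments).
gene_a = [1.0, 2.0, 3.0]
gene_b = [2.0, 4.0, 6.0]     # same shape as gene_a, twice the magnitude
gene_c = [11.0, 12.0, 13.0]  # same trend as gene_a, shifted upwards
gene_d = [3.0, 2.0, 1.0]     # reversed trend

r_ab = uncentered_pearson(gene_a, gene_b)
r_ac = uncentered_pearson(gene_a, gene_c)
d_linkage = complete_linkage_distance([gene_a], [gene_b, gene_d])
```

Unlike the centered Pearson correlation, the uncentered form is sensitive to a shift in baseline: gene_a and gene_c have identical trends but an uncentered correlation below 1.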

Phylogenetic analysis
The NCBI BLAST tool was used to detect sHsps and ensure the fragments sequenced involved sHsp conserved domains. The open reading frames (ORFs), translations, and predicted secondary structure of sHsps were obtained online (http://www.bioinformatics.org/sms2). The molecular weights of sHsps were also estimated by an online tool (http://web.expasy.org/protparam). For comparisons of sHsps across diverse groups, the sHsp gene sequences derived from other insect species were downloaded from GenBank (S2 Table) and aligned in MEGA6 [29]. All phylogenetic analyses were run using Bayesian Inference (BI), Maximum Likelihood (ML) and Neighbor Joining (NJ) methods. BI analysis was performed using MrBayes v3.1.2 [30]. Two sets of four chains were allowed to run simultaneously for 1,000,000 generations. Each set was sampled every 100 generations with a burn-in of 25%. Stationarity was considered to have been reached when the average standard deviation of split frequencies was less than 0.01. Bayesian posterior probabilities (BPP) were estimated based on a 50% majority rule consensus tree of the remaining trees. ML analysis was conducted using an online program (http://www.atgc-montpellier.fr/phyml/) with a GTR substitution model. Bootstrap values were obtained for the nodes based on 1000 replicates. NJ trees were built in MEGA5 based on p-distance and bootstrap values for nodes were based on 1000 replicates.
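As a quick check on the size of the posterior sample implied by these settings, the arithmetic can be sketched in Python. The function name is hypothetical, and the sketch ignores the starting tree recorded at generation 0 as well as any rounding applied by the software.

```python
def mcmc_sample_counts(generations, sample_every, burnin_fraction, n_runs):
    """Return (trees sampled per run, trees kept per run, trees kept in total)."""
    samples_per_run = generations // sample_every
    discarded = int(samples_per_run * burnin_fraction)
    kept_per_run = samples_per_run - discarded
    return samples_per_run, kept_per_run, kept_per_run * n_runs


# Settings stated in the text: 1,000,000 generations, sampled every 100,
# 25% burn-in, two simultaneous sets of chains.
samples_per_run, kept_per_run, kept_total = mcmc_sample_counts(
    1_000_000, 100, 0.25, 2
)
print(samples_per_run, kept_per_run, kept_total)  # 10000 7500 15000
```

Under these assumptions, the consensus tree would summarize about 15,000 trees.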

Phylogenetic reconstruction and signal detection
To examine the possible evolutionary conservation of sHsp functions in response to temperature and diapause, the expression traits were analyzed to trace phylogenetic signal based on a tree generated through BI. We collected data on temperature responses of lepidopteran species from published references, and classified sHsp expression into three categories (upregulated, no change, or unknown) under the three conditions considered (heat, cold and diapause). Downregulated patterns were not considered because these were absent from our study and the published literature. We then evaluated the number of evolutionary steps between genes with similar functions (same pattern of expression under a condition) based on parsimony reconstruction, and compared these with the number of steps expected by chance under a null model. Expression characteristics were reshuffled 1000 times across the tips of the phylogeny to obtain confidence intervals and test for significance at p = 0.01. These comparisons were run in Mesquite 1.12 [31].
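The logic of this test can be sketched in Python as follows. This is not the Mesquite implementation: it assumes a rooted binary tree written as nested tuples, uses Fitch parsimony to count steps, and treats "unknown" ("?") as compatible with any state. The tree, tip names and states are hypothetical.

```python
import random


def fitch_steps(tree, states):
    """Minimum number of state changes on a rooted binary tree (Fitch parsimony).

    tree:   nested 2-tuples with tip names (strings) at the leaves.
    states: dict mapping tip name -> state; "?" marks an unknown state.
    """
    known = frozenset(s for s in states.values() if s != "?")
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):
            state = states[node]
            return known if state == "?" else frozenset([state])
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:
            return shared
        steps += 1  # no shared state: one change is required at this node
        return left | right

    visit(tree)
    return steps


def permutation_test(tree, states, n_shuffles=1000, seed=0):
    """Shuffle states across tips and compare the observed step count to the null."""
    rng = random.Random(seed)
    tips = sorted(states)
    values = [states[t] for t in tips]
    observed = fitch_steps(tree, states)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(values)
        null.append(fitch_steps(tree, dict(zip(tips, values))))
    # Proportion of shuffles needing as few or fewer steps than observed.
    p_value = sum(1 for s in null if s <= observed) / n_shuffles
    return observed, null, p_value


# Hypothetical eight-gene tree in which the heat response is perfectly clustered.
tree = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))
clustered = {t: "up" for t in "ABCD"} | {t: "none" for t in "EFGH"}
scattered = {t: ("up" if i % 2 == 0 else "none") for i, t in enumerate("ABCDEFGH")}

observed, null, p_value = permutation_test(tree, clustered)
scattered_steps = fitch_steps(tree, scattered)
```

A character shows phylogenetic signal in this sense when its observed step count is lower than in nearly all shuffles; the clustered example needs a single step, whereas the scattered example needs four.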

Cloning and characterization of GmHsps
The full-length Hsps were obtained by 5' and 3' rapid amplification of cDNA ends from the cDNA library of the whole body of OFM. Fourteen GmHsp genes were sequenced and deposited in GenBank (KP843895-KP843908). They were named according to their predicted molecular weights (for example, GmHsp11.1). The GmHsps were classified into two families, 13 of them being sHsps and one being Hsp10 (also named chaperonin 10, Cpn10) (Figs 1 and 2). A conserved α-crystallin domain comprising about 100 amino acids was detected in all OFM sHsps, containing 9 β-sheet sandwich structures numbered β2 to β10 (Fig 1). The typical 12 putative dimer interfaces were found to have highly conserved sites of the metazoan α-crystallin domain (ACD), while GmHsp22.1 presented only eight putative dimer interfaces. Although GmHsp20.4 was classified as a sHsp, no dimer interface was found, as its conserved domain was ACD sHsp p23-like. Due to the high divergence of the sHsp sequences, genetic distance comparisons were based on the 105 bp conserved ACD characterized by NCBI. GmHsp20.4 showed higher average genetic distances based on the number of amino acid substitutions per site between sequences, ranging from 1.881 substitutions with GmHsp24.8 to 2.708 with GmHsp19.8a. Although the C- and N-termini showed significant differentiation between the full-length genes GmHsp21.4 and GmHsp21.7, the distance between their conserved ACDs was the lowest (0.00). GmHsp11.1, the only Hsp10 in OFM, showed typical Cpn10 superfamily characteristics (NCBI-Conserved domains accession: cl09113), with two roof hairpins, 13 putative oligomerisation interfaces, and one mobile loop (Fig 2).

Phylogenetic analysis and alignment of GmHsps
To analyze the relationships of OFM sHsps to those of other insects, 71 lepidopteran sHsps including the 13 GmsHsps were collected, and the sequences were truncated to the ACD of about 100 bp to construct Bayesian phylogenetic trees (Fig 3A), which were similar to the trees obtained by the other approaches (S1 Fig). The sHsps from lepidopteran species formed two clusters, one with a metazoan ACD and the other with an ACD p23-like domain, though with support values of only 50% and 73% respectively. The high level of variation among metazoan ACDs was also evident among lepidopteran sHsps. Three GmHsps showed a close relationship with those from the related species Cydia pomonella, from which only three sHsps have been identified so far [17]. The phylogenetic tree also showed that not all the sHsps from the same species clustered in the same branch, as in the case of Bombyx mori and Danaus plexippus. Orthologous clusters were prevalent among lepidopteran sHsps, including the GmHsp19.6-like, GmHsp21.3-like, and GmHsp31.8-like clusters. An unrooted neighbor-joining tree of the 13 sHsps was constructed to show the relationships within the sHsp family in this species (Fig 3B). The sHsps with the conserved ACD clustered with high support values, while GmHsp20.4, which contains the p23-like domain, was isolated. This analysis also indicated that sHsps with a similar molecular weight did not follow phylogeny, suggesting that the molecular weights of sHsps depended on the variable N- and C-terminal extensions rather than overall similarity.
Compared with the complex relationships and high level of divergence of OFM sHsps, GmHsp11.1 of the Cpn10 family exhibited a high level of conservation among insects from different orders (Fig 4). Three lepidopteran Cpn10 genes clustered within a branch with 100% support. Within the Diptera, the suborders Nematocera (mosquitoes) and Brachycera (fruit flies) were separated.

Expression in response to thermal treatments and diapause
Expression of most GmHsp genes was significantly upregulated under thermal stress, with average fold changes relative to controls ranging from 5-fold to 270-fold. In response to heat and cold stress, eight genes (GmHsp19.6, 19.8b, 19.9, 21.4, 21.7, 24.8, 31.8, 11.1) were consistently upregulated, particularly GmHsp19.6, whose expression significantly increased with treatment duration (Tables 1 and 2). Ten GmHsp genes were upregulated under heat shock compared with eight genes under the cold treatment. Three genes (GmHsp19.8a, GmHsp20.4 and GmHsp21.3) did not change in expression during the thermal treatments. The hierarchical cluster analysis for all differentially expressed genes revealed changing patterns of expression for most genes in response to heat and cold stress.

Low phylogenetic signal in lepidopteran sHsp expression patterns
The lepidopteran sHsp genes were tested for phylogenetic signal of expression patterns in Mesquite (S3 Table). Expression responses to the three stresses of heat, cold, and diapause showed no evidence of being conserved within phylogenetic lineages (S5 Fig). This was based on the number of steps expected for each character when compared to the number of steps expected from randomly generated trees. A character was considered to possess phylogenetic signal if there were fewer steps than in 99% of the randomly generated trees. There were four evolutionary steps between two traits in the heat response character, falling out of the range of the null model (mean step = 8.4, range 6-9). For expression changes under cold and diapause conditions there were 15 and 14 evolutionary steps respectively, which did not differ significantly from the 18.1 (range 15-21) and 13.1 (range 11-14) steps predicted under a null model. These findings suggest that there was no detectable phylogenetic signature in functional patterns of the sHsps based on the criteria given.

Discussion
In the present study, fourteen new members from the sHsp and Hsp10 families have been identified from OFM. The expression levels of nearly half of these sHsp genes changed significantly in response to thermal stresses. These sHsps are likely to be important Hsp members involved in development and stress tolerance in OFM. However, no phylogenetic signal was found in sHsp responses to stress or diapause, suggesting an inconsistency between similarities in sequence, structure and function of the lepidopteran sHsps. This may reflect rapid evolution of sHsp genes towards new functions under various environments.

High variation in sHsp sequence features
All thirteen sHsps in OFM can be separated into two different branches, ACD and p23-like sHsps (Figs 1 and 3). Among them, 12 sHsps were found to contain the characteristic ACD, the same number as the typical ACD paralogues in the D. melanogaster genome [32]. One orthologous group was reported in Lepidoptera [13], and GmHsp21.3 is the member of this common ortholog in OFM. This is very different from the sHsps of humans, for which all 10 orthologous sHsps can be identified in other mammals [33]. Indeed, most of the sHSPs of Lepidoptera show extensive sequence variation because of an N-terminal arm of divergent sequence and variable length and a C-terminal extension [34,35]. This high variation of sHsp sequences contrasts markedly with the conserved patterns of the other Hsp families, including Hsp70, Hsp90, and Hsp10 [36]. GmHsp11.1 was identified as a member of Hsp10, which is highly conserved among insects. The conservation of Hsp10 is as high as for Hsp90 (41-76% vs. 25-88%) [37,38].

The potential functions of sHsp in temporal and spatial expression
Gene functions are often deduced from known gene families with similar conserved sequence structure [39,40]. Four out of 12 sHsp members of Drosophila have been well-characterized for their expression patterns and functions [41], but this information cannot necessarily be used to deduce function for sHsp genes in other insects. The function of Hsp21.3 also has not yet been established even though it is commonly found in Lepidoptera. We thus inferred the potential functions of sHsps in OFM from temporal and spatial patterns of expression in our study and from previous research suggesting that various sHsps may be involved in insect development as well as stress responses. A developmental function for Hsps has been suggested by tissue-specific and tightly regulated patterns of expression during development in Drosophila [42-44].
The expression profiles of GmHsp11.1 may point to a putative function in reproduction, embryo development and maternal effects, because of its high expression at the adult and egg stages, in contrast to a steep decrease at the late larval stage. Hsp10 is known as a chaperone in mitochondria, where it is involved with cell proliferation and differentiation [45]. The increased expression of sHsps (GmHsp18.9, GmHsp19.8a, and GmHsp31.8) in the muscles/cuticle of the abdomen might be linked to metamorphosis. GmHsp21.3 may also play a role in development given its increased expression from the neo-larval to pupal stages. The 5th instar stage is key for OFM entering pupal metamorphosis or diapause. High expression of three GmHsp genes (GmHsp19.9, 21.3, and 21.7) in the head of 5th instars may point to a possible function in signal reception and transmission. Many genes are potentially involved with diapause [46,47], and in our study GmHsp21.3 was the most markedly upregulated, together with five other GmHsps, during early diapause. Hsp23 has previously been shown to be upregulated in the diapausing flesh fly [7,8], while ArHsp21 was shown to be upregulated in diapausing embryos of Artemia franciscana [48]. The upregulation of sHsp genes during diapause of OFM may point to a protective function in cold environments, contributing to the cold-hardiness of overwintering insects in combination with other Hsps such as Hsp70 [49]. However, there was no common feature of the sHsps that were upregulated in diapausing insects.
Our expression results suggest that sHsps may be important in sub-lethal temperature tolerance in OFM, acting as molecular chaperones under extreme conditions. In OFM, sub-lethal temperatures frequently arise in different seasons. For example, low temperatures at night often occur in early spring or late autumn, when OFM adults might regulate expression of genes such as GmHsp19.6 and GmHsp21.7. Hot days in summer, which are increasing in frequency under climate warming, might induce expression of genes like GmHsp19.9 and GmHsp24.8, particularly in adults, which, unlike larvae inside plant tissue, cannot easily find a refuge.

Lack of phylogenetic signal in sHsp expression and functional divergence
We analyzed sequence and expression data available for sHsps to test for potential functional conservation (heat, cold and diapause) among lepidopteran insects. We found that similarities in sequence were inconsistent with functional conservation in sHsp genes, as assessed by the association between stress responses and the expression of sHsps across a lepidopteran phylogeny covering five families. Our findings do not fit well with the notion that conserved domains can be used to extrapolate gene function from model species to target species under investigation. This may reflect the fact that sHsps with the unified structural ACD do not display conserved functional binding sites affecting expression patterns. Those sHsps with relatively constant expression patterns may have "housekeeping" roles in maintaining molecular structures at certain stages [50] rather than acting as conventional stress chaperones.
The inconsistent sHsp response patterns suggest that sHsp responses are evolutionarily labile. While sHsps are induced at high temperatures in many organisms [51], not all described sHsps show increased expression when organisms are placed under temperature stress [12,14,52]. Given the large number of sHsps in organisms, it is possible that only a subset is required for chaperone functions to maintain cellular viability. This may leave sHsps free to evolve additional functions apart from needing to act as stress chaperones and protective agents [42]. The absence of strong phylogenetic signal points to functional constraints being restricted to closely related groups rather than higher taxonomic levels, even though sHsps might not necessarily always diverge after speciation, as in the case of the ortholog Hsp21.3.
Genome sequence data are continuing to expand knowledge about the structure of sHSPs [34,35], but most sHsp families in insects remain poorly characterized. Thus, more information is needed to further assess the presence/absence of phylogenetic signal and links to functional roles. This involves testing more species with divergent traits (particularly for cold and diapause responses) and also characterizing expression patterns in more detail. Signal might be present but not evident from whole organism comparisons; for instance, lepidopteran species might exhibit temporal and/or spatial variation in expression. In addition, the classification of sHsps by molecular weight and conserved domains should be further characterized, and functional tests also need to focus on manipulations of gene expression, such as through RNAi [53].