Introduction
Invasiveness and diet breadth
Prevailing theories suggest that the majority of non-native species introduced to new habitats fail to establish due to travel stress, climate incompatibility, inadequate or inappropriate food resources, or small population size, among other factors [1]. For species that do successfully establish, their potential to dynamically adapt phenotypic expression to environmental conditions [2] is thought to be correlated with establishment success [3]. If the underlying biological causes behind invasion success were to be identified, global invasive control methods could be more targeted and efficient [4].
Food resource access can mediate invasive species establishment success [1]. Herbivorous invasive insect species are on a continuum with regard to their dietary specialization [5]. In general, species that feed on one or a few closely related plant species are considered to be monophagous “specialist” herbivores, whereas species that feed on more than one plant family are polyphagous “generalist” herbivores. Variation in species phenotype, including diet, can be caused by genetic and/or environmental factors [6]. It has been proposed that a species that consumes varied host plants must account for differentiation between plants and develop an all-purpose phenotype [3], whereas a species that consumes one to a few plants can specifically optimize its usage: this set of evolutionary tradeoffs is often summarized as ‘jack of all trades, master of none’ [5]. Other work has proposed that divergence in diet breadth is a byproduct of nonadaptive evolutionary forces such as drift [7]. Data on gene expression in generalist herbivores support the trade-off idea, showing that generalists have less fine-tuned gene regulation in response to different host plant diets and broader patterns of gene regulation than specialists [8]. In general, herbivorous insect generalists are thought to rely on transcriptional plasticity to respond to dietary variation [9,10].
Invasiveness and asexuality
The general-purpose genotype hypothesis [3] has been applied to invasive asexual organisms to explain their success. This hypothesis (as proposed by [11]) postulates that asexually reproducing species tend to have less strict habitat requirements, which allows wider spatial and environmental ranges compared to related sexual species. Asexually reproducing species require only one individual to start a population, which is advantageous for establishment and invasiveness [11]. The twofold reproductive capacity of asexual organisms is a marked advantage for invasion [12,13].
Within the weevil tribe Naupactini, several flightless species have been found to reproduce parthenogenetically [11]. Reduced flight capacity has been hypothesized to be positively related to parthenogenetic species colonization in heterogeneous landscapes [13]. Furthermore, flightlessness and obligate parthenogenesis have been linked to extreme polyphagy in successfully invasive insects [12].
Naupactus weevils are a taxonomic group of approximately 170 species of medium-sized weevils, covering a native geographic range between Mexico and Argentina [14–16]. The asexual weevil species Naupactus cervinus and N. leucoloma reproduce via apomictic parthenogenesis, in which offspring are produced from unfertilized, diploid egg cells. Parthenogenetic species are thought to establish successfully due to their ability to preserve successful genotypes via clonal reproduction, as beneficial gene relationships are preserved under extreme linkage disequilibrium [17].
Fuller’s rose weevil, Naupactus cervinus, is a highly polyphagous species [12,18]. Native to South America, it has successfully established invasive populations in many countries via commercial trade, including the United States and Australia [19]. The white-fringed weevil, Naupactus leucoloma, is also parthenogenetic, invasive, and highly polyphagous [20]. Native to central and northern Argentina, southern Brazil and Uruguay, this species has successfully established populations in Chile, Peru, Australia, New Zealand, South Africa, and the United States [21]. Most damage to crops and other host plants by both weevil species is caused by larvae feeding on roots, while the damage caused by the leaf feeding adults is usually less significant [22].
Even in the absence of genetic variation, parthenogenetic species can still become successfully established invasive species. In that case, what kinds of genetic and/or transcriptional adaptation and acclimation do these parthenogenetic species employ to acclimate to a new environment?
Differential gene expression in targeted gene categories
Preliminary transcriptome sequencing of N. cervinus and N. leucoloma indicated that host plant accounted for more variation in differentially expressed genes than tissue type did (unpublished work). Thus, a comparative transcriptomic approach was employed to measure phenotypic variation of Naupactus weevils in response to host plant type, specifically focusing on differential expression of genes that mediate functions that may impact invasion success. We also explored changes in gene expression in the offspring, given that transgenerational and maternally influenced epigenetic modifications have been found to impact the expression of fundamental survival traits (e.g., lifespan and age at maturity) [23,24].
One well-documented group of genes important for finding suitable host plant species in herbivorous insects comprises host detection genes associated with olfaction and taste, such as odorant-binding proteins [25–27]. Another function key to herbivore adaptation is the detoxification and neutralization of plant secondary compounds; differential regulation of detoxification genes has been correlated with successfully feeding on new host plants [28]. Detoxification genes may form the short-term first line of defense for herbivorous insects introduced to a new host [29]. Moreover, detoxification of host plant defenses may continue to be a challenge given that a generalist’s longer-term response to a new host has been shown to include three times more differentially expressed genes related to detoxification [30]. Gene families known to be involved in the detoxification response of herbivorous insects include cytochrome P450s, glutathione S-transferases, UDP-glycosyltransferases, carboxylesterases, ABC transporters, and glutathione peroxidases [26,31,32].
Generalist herbivores feeding on a variety of plants are also often exposed to a wider range of pathogens and toxins, which drives a stronger selective pressure on generalists’ immune systems [33]. Host plants of different nutritional qualities and defense capacities could alter herbivore immune defense response in a species-specific manner [33,34], or generalists may have evolved general immune defense upregulation mechanisms that do not vary between hosts [35]. Alternatively, resource investment in immune defense mechanisms could decrease as introduced invertebrates move away from their co-adapted pathogens [4]. According to the enemy release hypothesis [36], individuals in a new environment will reallocate resources associated with immune defense towards growth and reproduction.
Our analysis therefore includes five groups of contrasts as detailed below: two between weevils feeding on hosts from different plant families (Fabaceae, Rutaceae and Asteraceae in contrasts grouped as Legume vs. Other, Legume vs. Citrus); one between weevils feeding on the same host plant species under different cultivation conditions (Conventional vs Organic); one between weevils feeding on host plants within the same host plant family (including members from Rutaceae, Fabaceae, and Asteraceae); and finally, one between weevils feeding continuously on one host plant versus weevils that have been transferred onto a previously un-encountered, but edible host. In those contrasts we will explore differential regulation of genes related to olfaction and chemosensory cues, those related to detoxification of host plant secondary compounds, and those related to immune system response genes. We will explore differences in the number of upregulated genes, the intensity of the increased expression (measured in fold change and other indexes of differential expression), and numbers of uniquely differentially expressed genes in all three gene categories in both immature and adult tissues.
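The summary quantities named above (number of upregulated genes and intensity of expression per gene category) can be illustrated with a minimal sketch. This is not the pipeline used in this study: the record layout, the gene identifiers, the numeric values, and the significance thresholds (adjusted p < 0.05 and |log2 fold change| >= 1) are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical differential expression records for a single contrast:
# (gene_id, category, log2_fold_change, adjusted_p_value). All values are invented.
RESULTS = [
    ("g1", "detoxification", 2.5, 0.001),
    ("g2", "detoxification", -1.8, 0.020),
    ("g3", "host_detection", 3.1, 0.004),
    ("g4", "immune", 0.4, 0.600),
    ("g5", "immune", 1.2, 0.030),
]


def summarize_contrast(results, alpha=0.05, min_abs_lfc=1.0):
    """Per category: count up/downregulated DEGs and average |log2FC| among them.

    The thresholds alpha and min_abs_lfc are illustrative assumptions.
    """
    summary = defaultdict(lambda: {"up": 0, "down": 0, "abs_lfc": []})
    for _gene_id, category, lfc, padj in results:
        # Skip genes that are not called differentially expressed.
        if padj >= alpha or abs(lfc) < min_abs_lfc:
            continue
        entry = summary[category]
        entry["up" if lfc > 0 else "down"] += 1
        entry["abs_lfc"].append(abs(lfc))
    return {
        category: {
            "up": entry["up"],
            "down": entry["down"],
            "mean_abs_lfc": sum(entry["abs_lfc"]) / len(entry["abs_lfc"]),
        }
        for category, entry in summary.items()
    }


SUMMARY = summarize_contrast(RESULTS)
```

Running the same summary separately for adult and immature tissues, and for each of the five contrast groups, would give the per-category counts and intensities that the predictions below compare.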
Predictions for expression patterns in weevils feeding on legume hosts vs. other (non-legume) hosts
As potential host plants, legumes harbor a high diversity of defensive secondary metabolites, including alkaloids, amines, cyanogenic glucosides, and non-nitrogen-based compounds such as phenolics and terpenoids [37]. Cyanogenic glucosides in particular are lethal to most herbivores, as they can disrupt cellular respiration and effectively shut down cellular functionality. Nitrogen-based defensive compounds are fairly unique to Fabaceae due to their association with nitrogen-fixing rhizobia. High levels of nitrogen in host plants are preferred by insect herbivores [38], because insects cannot produce their own nitrogen and must derive nitrogen nutritionally [9,25]. Previous studies indicate that herbivorous insects perform best on plants with high levels of rhizobial interactions [38]. Because these legume-specific chemical defenses are damaging to herbivorous insects, there is a strong evolutionary pressure on legume-feeding species to develop adaptive mechanisms by which they can effectively break down these nitrogen-based defensive compounds [25]. In Naupactus specifically, N. cervinus larvae performed better on a legume host [18], and N. leucoloma has been shown to prefer legume species [39]. We predicted that when comparing differentially expressed genes between N. cervinus and N. leucoloma weevils feeding on legume host plants versus other (non-legume) host plants, there will be more differential regulation of genes in the three targeted categories in weevils feeding on legume host plants in both adult and immature tissues than in weevils feeding on non-legume host plants.
Predictions for expression patterns in weevils feeding on legume hosts vs. citrus hosts
Citrus (family Rutaceae: subfamily Citrinae) also produce a variety of defensive secondary metabolite compounds, such as limonoids, flavonoids, alkaloids, carotenoids, and phenolic acids [40]. As some of these defensive compounds are unique to citrus, successful citrus herbivore species must have some counteracting or defensive mechanisms to allow them to survive. Despite the systemic nature of many citrus species’ defense responses, some of the strongest chemical defenses produced by citrus, such as limonene, occur in the fruit itself, which Naupactus does not consume (e.g., [41]). Because Naupactus larvae feed on root tissue while adults feed on leaf tissue, it is likely that the secondary metabolites produced by legumes will be more deleterious to Naupactus weevils than those produced by citrus. We predicted that when comparing differentially expressed genes between N. cervinus weevils feeding on legume host plants versus citrus host plants, there will be more differential regulation of genes in the three targeted categories in weevils feeding on legume host plants in both adults and immature tissues than in weevils feeding on citrus host plants.
Predictions for expression patterns in weevils feeding on organically grown vs. conventionally grown oranges
There is inconsistent evidence regarding the effects of organic versus conventional farming techniques on agricultural pest burdens. Some research proposes that generalist diets predispose herbivorous insects towards evolving effective insecticide resistance, making feeding on conventional hosts less costly [7]. Regardless of herbivore diet breadth, the assumption is that applying insecticides to host plants will make insect feeding more difficult, and conversely, reducing chemical insecticide usage on plants will increase the pest burden [42]. However, no significant correlation was found between pest damage and farming management approaches for garden tomatoes [42]; it is possible that organically grown plants not exposed to insecticides are capable of synthesizing their own chemical defenses. The addition of insecticides to a conventionally raised plant may interfere with the natural defense response of the plant, and an organically raised plant may be able to upregulate its defensive response in ways that conventionally raised plants cannot. We predicted that when comparing differentially expressed genes between weevils feeding on organically treated host plants versus conventionally treated host plants, there will be more differential regulation of genes in the three targeted categories in weevils feeding on organically cultivated host plants in both adults and immature tissues than in weevils feeding on conventionally cultivated host plants.
Predictions for expression patterns in weevils feeding on different host plants within the same host plant family
If it is true that legume and citrus hosts are more resource-taxing to herbivores compared to other hosts, it could be expected that herbivores that feed on highly chemically defended species will have more species-specific transcriptional responses, and that the weevils consuming these host plants have acclimated to these defenses.
Because of this acclimation, we predicted larger numbers of unique expression patterns between weevils feeding on citrus members (Rutaceae), and between those feeding on legume members (Fabaceae), than between those feeding on members of a non-citrus, non-legume group (Asteraceae), even though the degrees of phylogenetic relatedness between host plants within each family are not equivalent. Furthermore, there will be more differential regulation of genes in the three targeted categories in weevils feeding on legume and citrus host plant family members relative to those from the non-citrus, non-legume host plant family comparisons.
Predictions for expression patterns in weevils feeding on their natal host plant vs. a novel host plant
In polyphagous herbivores that can consume several host plants, a shift from consuming one host plant to a different host plant has been previously associated with high transcriptional responses [8,9,29]. Although patterns of transcriptional response to short-term host plant switching are characterized by highly specific gene responses, these responses occur within a small number of gene families, indicating the potential for common pathways of host plant acclimation and adaptation in generalist arthropods. Some work in oligophagous leaf beetles has reached similar conclusions, demonstrating that only a few gene ontology (GO) terms were differentially expressed upon short-term host switching, such as acyl carrier protein hydrolases [43].
When comparing differentially expressed genes between N. cervinus weevils feeding on their natal host plant versus those feeding on a novel host plant, we predicted that there will be more differential regulation of genes in the three targeted categories in weevils feeding on the novel host in both adults and immature tissues.
Exploration of global expression patterns in all host plants and experimental contrasts
It is entirely possible that important aspects of weevil acclimation and/or adaptation to feeding on resource-taxing host plants, or on novel hosts, may involve differential regulation of genes beyond the three targeted gene categories of detection, detoxification and immune response. For example, a plastic response, as measured by a wider array of upregulated gene sets, was recorded in milkweed aphids feeding on novel host plants [44], and specific gene expression response trajectories were elicited in response to different sugar-mimic alkaloids in silk moths [45]. Other studies on herbivore transcriptional plasticity at the gene set level frequently identify categories associated with metabolic processes, transporter activity, digestion, membrane structure, and reproduction [46].
In insects, developmental gene networks are well-known and have been profiled in several species [47]; thus, it is not implausible to hypothesize that other tightly synchronized gene networks might exist. Metabolic pathways have also been found to be key in herbivore response to host plant defenses [48]. We use a global gene set enrichment approach to elucidate overall patterns of expression by gene family. Together with the observed patterns in the three targeted gene categories, the goal of these analyses is to understand the role of host plant acclimation and adaptation in introduced species.
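To make the idea of gene set enrichment concrete, the sketch below implements a simple over-representation test based on the hypergeometric distribution. This is a deliberately simpler technique than rank-based gene set enrichment analysis and is not presented as the method used in this study; the function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_set, n_degs, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the background (e.g., all expressed genes)
    n_in_set:     background genes annotated to the gene set (e.g., one GO term)
    n_degs:       differentially expressed genes drawn from the background
    n_overlap:    DEGs that are annotated to the gene set
    """
    total = comb(n_background, n_degs)
    upper = min(n_in_set, n_degs)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Invented example: 10 background genes, 3 in the set, 3 DEGs, all 3 in the set.
P_ALL_IN_SET = hypergeom_enrichment_p(10, 3, 3, 3)
```

A small p-value would suggest that the gene set contains more differentially expressed genes than expected by chance; in practice such p-values would also need correction for testing many gene sets at once.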
We sought to profile the transcriptome of successfully invasive, but paradoxically asexual, insects, and determine how life stage, host plant, and environmental conditions affect gene regulation in these species. We have established that the gene expression response of weevils can be specific to particular host plants, and that elements of that response can be maintained in the offspring. We have also gained an understanding of how some host plants are more taxing to weevils, eliciting strong and specific gene expression responses. However, we also found commonalities between the responses to taxing host plants and the responses to other stressful situations, such as host plant cultivation conditions and/or a transition to a novel host.
Discussion
Several arthropod species show transcriptional plasticity in response to different host plant profiles [9,48]. In the same vein, this series of analyses sought to understand the processes of acclimation and adaptation to non-native host plants in two asexual Naupactus weevil species, as evidenced in their gene expression patterns.
Even though our focus here is on the resulting gene expression differences among herbivores feeding on different host plants, if these were sexually reproducing species, it would be entirely possible that some of those herbivore transcriptional differences, interpreted as host plant effects, would be largely due to underlying genetic differences between the herbivore populations themselves [51]. Given that both N. cervinus and N. leucoloma reproduce through apomictic parthenogenesis, and that mitochondrial and nuclear DNA data point to extremely homogeneous populations within the introduced range (Sequeira et al., unpublished data), our work here highlights host plant-based transcriptome differences, assuming that genetic differences are minimized.
Taxing natal and novel host plants require highly specific transcriptional responses from herbivores
Legume and citrus host plants
Because legumes contain nitrogen-fixing rhizobia and generally have diverse repertoires of chemical defenses, there is a strong evolutionary pressure on legume-feeding herbivores to overcome these defenses in order to derive nitrogen for their own nutrition [25]. This can result in a demonstrable preference for legume hosts [18], even though these legume species tend to require more energy-intensive herbivore responses to overcome the host’s defense response. Evidence of the extra cost imposed on legume-feeding weevils appears to be reflected in their gene expression profiles. The numbers of upregulated host detection genes, detoxification genes, and immune genes were significantly higher in legume-feeding weevils in both N. cervinus and N. leucoloma (Figs 1 and 2). This follows the prediction that both species invest more resources in detecting and dealing with secondary compounds of legumes, and that legumes elicit a larger immune response, possibly related to their associated rhizobia.
The identity of the overexpressed transcripts in legume-feeding weevils points to a legume-specific response. When examining both upregulated and downregulated host detection genes, detoxification genes, and immune genes, legume-feeding weevils had the highest number of unique differentially expressed transcripts (Fig 3i and 3ii), suggesting that the weevil’s transcriptional response pattern is highly specific to that host plant group. However, there were also strong overlaps in differentially expressed genes in comparisons between legume and other comparisons (Fig 3), suggesting that there are potentially shared mechanisms of responding to these particular host plants/growing conditions at the gene level. Previous work has hypothesized that host plant response specificity in herbivores may be exacerbated by the microbial communities specific to a host plant species, as ingesting microbes present on the leaf alters insect immunity [52]. Adult Naupactus weevils feed on foliage and can encounter such leaf microbes.
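The distinction between unique and shared differentially expressed transcripts can be illustrated with a short set-based sketch. The contrast names mirror those used in this work, but the gene identifiers and set contents are invented, and the function is a hypothetical helper rather than the code used to produce Fig 3.

```python
def unique_and_shared(degs_by_contrast):
    """Split each contrast's DEGs into those found only in that contrast
    and those also found in at least one other contrast."""
    result = {}
    for name, genes in degs_by_contrast.items():
        genes = set(genes)
        # Union of the DEGs from every other contrast.
        others = set().union(
            *(set(g) for n, g in degs_by_contrast.items() if n != name)
        )
        result[name] = {"unique": genes - others, "shared": genes & others}
    return result


# Invented DEG identifiers for three of the contrast groups.
DEGS = {
    "legume_vs_other": {"a", "b", "c"},
    "conventional_vs_organic": {"b", "d"},
    "switch_vs_maintain": {"c", "e"},
}

OVERLAPS = unique_and_shared(DEGS)
```

In this toy input, the legume contrast has one unique gene and shares two with other contrasts, which is the kind of pattern described above: a host-specific component alongside a component shared across hosts and growing conditions.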
Organically cultivated host plants
While the body of work contrasting transcriptional levels of defense compounds on conventional versus organic crops is not large, there is evidence that the production of some plant defensive compounds increases when plants are treated using organic rather than conventional approaches [53]. Additionally, specific pathways related to RNA regulation and biotic stress have been found to be part of the variation in gene expression due to agricultural practices, with those pathways enhanced in organically fertilized or protected crops [54]. In transcriptomes from weevils feeding on the same host species under different regimes of cultivation, there was a significantly higher quantity of upregulated host detection genes and specific categories of detoxification genes from weevils feeding on organically grown host plants (Fig 1iii and Table 1). The expression intensity for differentially expressed immune genes was high across all three tissue types in both positive and negative directions (Fig 2). Taken as a whole, adult weevils feeding on organically raised hosts tend to elicit more upregulated genes in detoxification and host detection, with a slight trend in immune defense, supporting the hypothesis that organically cultivated host plants are associated with more differential gene regulation.
Even though organically raised host plants appear to challenge herbivores to a larger degree than their conventionally grown counterparts, the observed response in the three targeted gene categories is not unique to organically grown hosts. There was a notable overlap in the number of shared DEGs for host detection, detoxification and immune defense genes between Legume vs. Other and farming method comparisons (Fig 3). A greater degree of transcriptional plasticity and changes in genes associated with the metabolism of secondary compounds has been found as a response to exposure to stress in some aphids and other specialist insects [44,55]. The evolution of a conserved mechanism for both more toxic host plants and exposure to other forms of stress would be the least evolutionarily costly [56], and would be especially beneficial for this polyphagous species.
The pathway-level response to feeding on organically grown host plants included enriched GO terms in oxidation/reduction pathways, potentially linked to oxidative stress responses (S2iiia Fig). Transcripts involved in ribosome construction, translation, and basic structure and metabolic function were also found to be significantly enriched (S2iiia Fig), as previously found in other population-specific host adaptation studies in general [46]. It may be the case that the enrichment of these terms points to an increase in translation of certain transcripts in response to xenobiotic compounds from resource-taxing host plants that require a change in weevil expression in basic metabolic pathways in order to clear these potentially life-threatening substances, as has been shown in Helicoverpa armigera, the polyphagous cotton bollworm [8].
Although the function of acyl carrier proteins in insect cells specifically is largely unknown [57], the uniquely citrus-specific enriched cluster of acyl carrier proteins found in organic-feeding weevils (Fig 4ii and S2iiia and S2ivc Fig) is known to be linked to fatty acid biosynthesis and glycolytic pathways [58], and has been previously identified in oligophagous beetles after short-term host switches [43]. This upregulation may indicate that the host plant defenses of organically treated oranges are more stressful for herbivores than those of conventionally treated oranges. Similar results have been proposed as a clear link between exposure to stress and increased transcriptional plasticity, including regulation of transcription and translation processes [44].
While these are interesting findings, and many of the genes identified here have also been identified in similar host treatment studies, albeit over different time scales [30], we do recognize our limited ability to make powerful inferences due to the inherent limitations of sample size for the comparisons in this group.
Short-term acclimation to a novel host plant
The important contribution of cytochrome P450s to the success of herbivore establishment on novel host plants has been previously documented in spider mites [28]. In our experimental host plant switch, numbers of upregulated ABC transporter, cytochrome P450, and glutathione S-transferase genes were significantly higher in the switch condition (Fig 1v and Table 1).
A possible interpretation of the bidirectional nature of the expression of immune genes (Fig 2i) could be that the new host plant presents a new set of natural enemies, and as an herbivore feeds on a host where new natural enemies or parasites are present, immune genes associated with those pressures are regulated in one direction. Genes specific to the old host plant appear as regulated in the opposite direction, when in fact they may be simply maintained in weevils feeding on the old host relative to downregulation in weevils feeding on the new host. The idea that herbivore detoxification and immune challenges are larger in newly colonized host plants is supported by the elevated herbivore diversity and load on native hosts relative to non-native hosts found in forty-seven different woody plant species [36].
While host detection and immune defense genes were entirely shared with other comparisons, a suite of 17 detoxification DEGs was unique to the Switch vs. Maintain contrasts (Fig 3). The length of host plant attenuation may explain the results described here; a host plant-specific set of detoxification genes may form the first line of short-term defense for a weevil introduced to a new host, as identified in the transcriptomic response of spider mites challenged with a novel host without prior exposure [59]. On the other hand, long-term attenuation to a host plant may occur through host plant detection and immune pathways over longer timescales without exhibiting or requiring short-term specificity. It is possible that the investment needed to differentially regulate immune and host detection genes may come later as a long-term adjustment, whereas detoxification genes are differentially regulated early on to ensure survival on that new host. This is supported by other work indicating that a generalist’s short-term transcriptional response to a new host is detoxification-based, with the longer-term response including three times more differentially expressed genes across the genome [30].
Our results also present a set of 11 GO terms enriched exclusively in the switched weevils, providing a window into other pathways potentially involved in early acclimation to a new host plant. Some of these GO terms have been shown in other species to be highly variable and involved in stress responses to new environmental conditions [60]. Other terms are implicated in the post-transcriptional regulation of mRNA maturation and export from the nucleus [61]. This suggests that there are some upregulated GO terms related to responding to immediate environmental stress and the rapid adjustment of regulatory mechanisms that are enriched after a host plant transition. For parthenogenetic weevil species and other species with low genetic variation, an immediate response modulated by gene expression and epigenetic modification would be a useful way of acclimating quickly to new environmental conditions [62,63]. More generally, other arthropod studies that have examined new and old host plant adaptations in polyphagous insects have reported distinct transcriptional plasticity patterns during acclimation to such hosts [9,29].
Different modes of gene expression response: Narrowly targeted vs. widespread
Even though our focus species have the potential to be polyphagous [64], individual weevil populations produce larvae that drop from the foliage to burrow into the soil to feed on the same host plant roots, which may result in the extension of a specific host plant preference as a maternal effect, regardless of polyphagous ability. Because of this dichotomy between potential and actual diet breadths, the expression of host-related genes in these weevils could take divergent modalities. Their patterns of gene expression may manifest as a widespread regulation of several common genes, as expected in a generalist species, or as a specific and targeted regulation of a few highly host-specific genes, as expected in a specialist species [8].
Citrus hosts appeared to elicit a narrow, targeted expression response of host detection genes in weevils feeding on different species of citrus hosts (Fig 2ii). One explanation for the targeted expression of host detection genes for citrus is the phylogenetic closeness of the citrus hosts examined here. Further research that corrects for this potentially confounding variable would help identify the source of this effect more concretely. However, this trend in highly specific, targeted expression for citrus hosts is replicated in other comparisons, and this pattern may be the result of acclimation to the unique chemical defenses of the host clade as well. Some research has found that transfer to a new, closely related host and transfer to a new, distantly related host involve similar pathways [29]. Following this idea of specialist, targeted expression, weevils feeding on a novel host plant increased expression intensity, but not number, of host detection and immune genes; such a targeted response was only observed in head and abdominal tissue, and may constitute the first signs of acclimation to the new host (Fig 2i).
It is important to note that the within-family comparisons involved two weevil species, with legume-feeding N. leucoloma compared against aster-feeding and citrus-feeding N. cervinus. Thus, for this particular set of comparisons, differences may be due to species biology rather than host plant attenuation. However, it is then interesting that host plant detection DEGs overlap entirely between two species feeding on legume and citrus host plants (Fig 3); either these species are very alike and the other results included in the host plant family analysis are credible, or the genes associated with host plant detection are highly conserved between species while the differentially expressed detoxification and immune defense genes have diverged. Host detection genes such as odorant-binding proteins are generally highly divergent between insect clades [65], suggesting that our results are probably due to genuine alterations in gene regulation patterns.
The contrasts between legume host plants show a pattern that appears to follow what would be expected for a generalist insect, with a high-intensity response involving large numbers of upregulated genes. This is particularly noticeable in detoxification genes, where the quantity of upregulated detoxification genes is significantly higher between weevils feeding on different legumes than in contrasts in the other two host plant groups (Fig 1iv). Even though the modality of expression involving larger numbers of genes may appear like that of a generalist, the identity of the transcripts that are differentially expressed shows a large degree of specificity. Legume-feeding weevils had more total differentially expressed unique detoxification genes than either the citrus-feeding weevils or the aster-feeding (non-citrus, non-legume) weevils (Fig 3). These data support the idea that observed differences in gene expression are highly dependent on the chemical characteristics of a specific host or plant family, or in this case, differences between members of the same host plant family. Legumes are not unique in eliciting specific defensive responses from herbivores; studies show that Coleoptera and Lepidoptera feeding on Brassicaceae also respond specifically to the chemical defense profile of that host clade [66].
Resource allocation and maternal effects on gene expression
Transgenerational and maternal, host quality-dependent effects have been observed in insect herbivores before, as parental modulation of offspring phenotype can better adapt that progeny to different host plant qualities [67].
The intensities of gene expression for detoxification and immune defense genes were particularly interesting in comparing transcriptomes between immatures derived from legume-feeding versus citrus-feeding parents. In this case, the intensity of expression was strong in both upregulated and downregulated immune and detoxification genes (Fig 2i), suggesting that there are different sets of immune and detoxification genes that are differentially expressed between the offspring of legume-feeding weevils and citrus-feeding weevils. Overexpression of cytochrome P450s in larval stages has been reported in other citrus-feeding arthropods, such as the citrus red mite, but the role of this major detoxification enzyme has been linked to resistance to insecticides rather than to citrus-specific defenses [68]. Legumes have the unique potential for rhizobia-mediated augmentation of host plant defenses [38,69], and because of this, differences between immune gene regulation in legumes versus citrus were expected. However, this effect was only identifiable in immature tissue, which suggests that this pattern is potentially specific to this life stage.
From the GSEA results of weevils feeding on legumes versus non-legumes, it appears that GO terms associated with “ribosome assembly” and “nucleosomes” are enriched solely in adult tissues (Fig 4 and S2ia, S2ib, S2iia and S2iib Fig). The immature comparison yielded primarily downregulated gene sets, which may be the effect of resource allocation towards adult survival rather than host-priming of offspring (Fig 4). If the adult stage must dedicate its energy to surviving on a difficult host plant, previous research has suggested that this triggers the diversion of energetic resources away from reproduction and towards survival [70,71], so that gene sets are less modulated in immatures from these adults relative to immatures from adults feeding on less well-defended host plants. In our experimental set-up, where immatures were processed before they were able to feed and therefore not yet exposed to the challenges presented by their host plants, the decreased parental investment in offspring priming would be more prominent. This would follow the above findings of generally higher numbers and combined expression indices of upregulated immune, detoxification, and host detection genes in adult weevils from both species that fed on legumes.
We see high expression intensity in immatures from parents feeding on organically raised host plants across all three gene groups, despite no significant difference in number of DEGs across these three categories, with the exception of HD genes (Fig 2). Because an organic host is not as difficult as a legume host, feeding on organic hosts allows for the parent to maintain any investment in reproduction and offspring host priming, rather than reallocating that energetic resource to immediate survival. The very low number of enriched gene sets in the immature comparison (S2iii Fig) from parents feeding on organically raised host plants could indicate that host plant cultivation may have less of an effect on regulation at the gene pathway level than host plant groups, as more enriched gene sets were observed in host plant group comparisons. Previous work has shown a highly specific gene response but common gene family response during an herbivore’s long-term acclimation to a particular host plant [29], and the low number of enriched pathways but significant difference of DEG number and expression intensity in this set of comparisons may support this finding. As noted in Results, this particular group of comparisons was limited by sample size; however, with such interesting preliminary results, the possibility for further exploration in this direction with larger replicate sizes looks promising.
A set of 11 enriched GO terms exclusive to those weevils that have fed on a new host plant are found only in adult tissues. It appears that maternal effects on offspring expression at the pathway level are not generalized, although at the gene level, a small effect was observed for detoxification and immune genes. This is surprising, because it would be reasonable to assume that any transmission of the parent’s acclimated phenotype specific to a new host plant through maternal effects could help the offspring be better poised to face those same conditions. Transgenerational and maternal effects of environmental conditions have been recorded in asexual collembolan species and sexually reproducing grass moths [24,67]. However, it is also possible that a multi-pathway enrichment through maternal effects may not be immediately needed, and that the more specific priming in the form of increased expression of specific detoxification and immune genes is enough of an advantage for offspring to survive.
Closing remarks
Our results have shown that the gene expression response of some Naupactus weevils can be specific to particular host plants, and that elements of that response can be maintained in the offspring. Moreover, some host plant groups, such as legumes, appear to be more taxing to weevils as they elicit a complex gene expression response which is both strong in intensity and specific in identity. However, the weevil response to the secondary metabolites of taxing host plants shares many attributes (i.e., identity of upregulated transcripts and enriched GO terms) with other stressful situations such as host plant cultivation conditions and/or a transition to a novel host, leading us to believe that there is an evolutionarily favorable core shared gene expression regime for responding to different types of stressful situations. Modulating gene expression in the absence of other avenues for phenotypic adaptation may be an important mechanism for successful host plant colonization for these introduced asexual insects.
Experimental procedures
Weevil collection and rearing
Field collection was facilitated and authorized by personnel at NFREC-Quincy, FL; the Lindcove Research and Extension Center, University of California; the USDA-ARS Appalachian Fruit Research Station, West Virginia; the USDA Southeastern Fruit and Tree Nut Research Laboratory in Georgia; the IFAS Extension, University of Florida in Homestead, Florida; and the Auburn University Gulf Coast and Chilton Research and Extension Centers in Alabama. Weevils were collected from Argentina in Buenos Aires and Entre Rios Provinces (7 localities) and within the United States in Georgia, Florida, Alabama (6 localities), and California (4 localities) (S1 Table). Permissions to collect were obtained in each area: samples Quin71C and Fair74L were collected in fields belonging to research stations, where we worked with station managers and directors to obtain permission to collect in the station’s fields (NFREC-Quincy, FL, USDA Southeastern Fruit and Tree Nut Research Laboratory, Georgia and Auburn University Gulf Coast and Chilton Research and Extension Centers in Alabama); samples Ker_OneC, Ker_twoC, Tul_OneC, Tul_twoC and Tul_threeC were collected in orchards in partnership with the Lindcove Research and Extension Center, University of California, where station personnel applied to the owners for permission to collect; samples For67C, Post70C,L, Olear72C, Ros77C,L, Nan79C, Eli80C, Tala81C and Sol82L were collected in privately owned lots where we obtained permission directly from the owners; Per76C and Otta78C were collected in Argentine natural areas and/or reserves where we obtained permits from Argentine authorities (permit “Confalonieri RNO” from CRCE (Coordinación Centro Este de la Administración de Parques Nacionales) and permit “Lanteri” provided by OPDS (Organismo Provincial para el Desarrollo Sostenible) de la Provincia de Buenos Aires).
Adult weevils were maintained in temperature-controlled environmental rooms with 12:12 dark/light cycles at 24–28°C and 50% humidity (after [72]) for a three-week acclimation period. Each set of weevils was fed their natal host plant obtained at its original location. Weevil rearing boxes were checked daily for eggs, and juvenile specimens were separated, allowed to develop for 7–10 days, and frozen before active feeding on plant matter began. Adults were processed three weeks after the acclimation period. For a set of experimental host switch trials, individual adults were randomly assigned to continue consuming their natal host or to switch to a novel host plant (grown in greenhouse conditions) after the three-week acclimation period and processed after an additional three weeks.
Sample preparation, RNA extraction, and quality control
While a given tissue from a specimen pool representing a given locality was sequenced only once in this format, the differential expression analysis consists of comparisons of several of these pooled RNA samples.
RNA sequencing, transcriptome assembly, and initial GSEA analyses were completed by SeqMatic (Fremont, CA) for each of the 52 samples. The RNA-Seq libraries were sequenced as paired-end reads on the Illumina HiSeq 2500 platform (Illumina, San Diego, CA) and transcripts were assembled de novo using the R/Bioconductor (http://bioconductor.org) package Trinity [73,74] (data available in NCBI GEO, series accession number GSE173980). Each sample transcriptome was aligned against an initial, arbitrary Trinity transcriptome assembly using the bowtie package [75], and the RSEM package [76] was used to calculate transcript and gene expression levels without the need for a reference genome. The Trinity-provided script align_and_estimate_abundance.pl executes this alignment and expression estimation [73,74], using script parameters for Trinity [73,74], bowtie [75], and RSEM [76] described in S4 Table.
In total, the reference transcriptome for N. cervinus yielded 39,539,401 assembled bases; fragment assembly into 124,344 transcripts; a median contig length of 305 base pairs per transcript; and assignment to 79,798 Trinity ‘genes’. The reference transcriptome for N. leucoloma yielded similar values, with 30,813,684 assembled bases; fragment assembly into 120,344 transcripts; a median contig length of 298 base pairs per transcript; and assignment to 73,953 Trinity ‘genes’. Transcripts were mapped to identified/putative protein sequences in the UniProt database (http://uniprot.org), with the best hit used for transcript annotation and the assignment of gene ontology (GO) terms. This resulted in identified/putative protein annotations for 51,213 transcripts in N. cervinus (69.25% of Trinity-identified ‘genes’) and 53,498 transcripts for N. leucoloma (72.34% of Trinity-identified ‘genes’). As this search did not filter protein hits by taxonomic group of origin, there are undoubtedly a fraction of incorrectly annotated transcripts within these sets, but this permissive threshold allows for all possible protein identities to be considered for this non-model organism.
Read-level quality control measures were performed using FastQC (Illumina, San Diego, CA). A second round of quality control at the gene level was also performed manually, removing genes with fewer than two counts in any sample. Samples that contained very few genes with 10 or more counts were also excluded as potential outliers. Only gene transcripts that had transcript counts of ≥10 in at least one sample were included for differential gene expression quantification. Furthermore, samples where the top 100 genes with the highest read counts accounted for more than 35% of all reads were also marked as outliers on the basis of PCR bias during amplification and/or library bottlenecking issues originating from other sources. Comparisons that included samples flagged during quality control analysis were excluded (n = 4), with the exception of two larval samples (Tul_onetwoC1I1 and Tul_threeC3I1), which were retained because no replacements were available and immature samples were comparatively scarce. From 79,798 transcripts in N. cervinus samples, 54,366 genes were retained (68%); from 73,953 gene transcripts in N. leucoloma samples, 37,982 genes were retained (51%). For DEG analysis, these genes were further filtered by expression, retaining only genes that had ≥1 count per million in at least two samples. This further filtering step produced 26,046 genes passing QC for N. cervinus (32.6% of the initial 79,798 transcripts), and 17,474 genes passing QC for N. leucoloma (23.6% of the initial 73,953 transcripts). This subset was normalized and scaled to reduce confounding variables in RNA composition using TMM normalization [77], which is designed for samples with differences in RNA transcript expression distribution such as might be expected here.
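The gene-level and sample-level filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the nested-dictionary layout of the counts (sample name to gene name to raw count), and the toy values are all hypothetical, and only the thresholds (≥10 counts in at least one sample, ≥1 count per million in at least two samples, top 100 genes exceeding 35% of reads) are taken from the text.

```python
def counts_per_million(sample_counts):
    """Scale the raw counts of one sample to counts per million (CPM)."""
    total = sum(sample_counts.values())
    if total == 0:
        return {gene: 0.0 for gene in sample_counts}
    return {gene: 1e6 * c / total for gene, c in sample_counts.items()}


def top_gene_fraction(sample_counts, top_n=100):
    """Fraction of a sample's reads contributed by its top_n most abundant genes."""
    total = sum(sample_counts.values())
    if total == 0:
        return 0.0
    top = sorted(sample_counts.values(), reverse=True)[:top_n]
    return sum(top) / total


def flag_outlier_samples(counts, top_n=100, max_fraction=0.35):
    """Names of samples whose top_n genes account for more than max_fraction of reads."""
    return [name for name, sc in counts.items()
            if top_gene_fraction(sc, top_n) > max_fraction]


def genes_passing_filters(counts, min_count=10, min_cpm=1.0, min_cpm_samples=2):
    """Genes with >= min_count reads in at least one sample AND
    >= min_cpm CPM in at least min_cpm_samples samples."""
    cpm = {name: counts_per_million(sc) for name, sc in counts.items()}
    genes = set().union(*(sc.keys() for sc in counts.values()))
    kept = []
    for gene in sorted(genes):
        has_count = any(sc.get(gene, 0) >= min_count for sc in counts.values())
        n_cpm = sum(1 for c in cpm.values() if c.get(gene, 0.0) >= min_cpm)
        if has_count and n_cpm >= min_cpm_samples:
            kept.append(gene)
    return kept


# Hypothetical toy data: two samples, three genes.
toy_counts = {
    "sample1": {"geneA": 500, "geneB": 12, "geneC": 0},
    "sample2": {"geneA": 480, "geneB": 0, "geneC": 1},
}
kept_genes = genes_passing_filters(toy_counts)
```

In this toy example only geneA passes both filters: geneB reaches 10 counts but is expressed in a single sample, and geneC never reaches 10 counts.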
Data processing and visualization
Gene expression levels were then assessed using FPKM and log2FC values. The fragments per kilobase of exon per million reads mapped (FPKM) is a normalized count value of the number of transcript fragments mapped onto a particular gene, corrected for the length of that gene and the sequencing depth. The adjusted log2-fold change in expression levels between the two groups of samples compared (adjusted log2FC) gives a relative measure of over- or under-expression for the sample groups being compared while adjusting based on the median expression level of each sample. For this analysis, a mapped gene was considered a differentially expressed gene (DEG) if the ΔFPKM was > 1 and the adjusted log2FC value was ≥ 1, indicating upregulation, or ≤ −1, indicating downregulation. Adjusted p-values were available in contrasts with multiple biological replicates in each host plant/tissue combination. In every case where the adjusted p-values were available, the number of DEGs identified did not differ from the number computed with the two measures used for the one-on-one contrasts (ΔFPKM and adjusted log2FC).
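As an illustration of the two measures, the following Python sketch computes FPKM with the standard formula and applies the DEG thresholds. Several points are assumptions rather than details given in the text: ΔFPKM is taken to be the absolute difference in FPKM between the two groups, the downregulation cutoff is taken to be the symmetric counterpart of the upregulation cutoff (adjusted log2FC ≤ −1), and the adjusted log2FC is supplied as an input because its median-based adjustment is not specified here. The function names are hypothetical.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragments / (gene_length_bp / 1e3) / (total_mapped_fragments / 1e6)
    """
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def classify_deg(fpkm_group_a, fpkm_group_b, adjusted_log2fc,
                 min_delta_fpkm=1.0, min_abs_log2fc=1.0):
    """Label a gene as 'up', 'down', or 'not DE' for one contrast.

    Assumes delta-FPKM is the absolute FPKM difference between groups and
    that the adjusted log2FC has already been computed upstream.
    """
    delta_fpkm = abs(fpkm_group_a - fpkm_group_b)
    if delta_fpkm <= min_delta_fpkm:
        return "not DE"
    if adjusted_log2fc >= min_abs_log2fc:
        return "up"
    if adjusted_log2fc <= -min_abs_log2fc:
        return "down"
    return "not DE"


# Hypothetical example: 500 fragments on a 2,000 bp gene, 10 million mapped fragments.
example_fpkm = fpkm(500, 2000, 10_000_000)  # 25.0
example_label = classify_deg(25.0, 6.0, 2.1)  # 'up'
```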
All graphing was performed in R v. 3.6.1 via RStudio [78,79] using tidyverse [80], magrittr [81], reshape [82], Hmisc [83], and data.table [84] for data manipulation, RColorBrewer [85] for colorscales, VennDiagram [86] for Venn Diagram generation, ggbeeswarm and ggplot2 (from tidyverse, [80]) for violin plot generation, and gridExtra [87] and ggpubr [88] for graph formatting.
Differential gene expression comparisons
Fifty-two individual samples were included in 48 pairwise comparisons (S2 Table). Although we did not sequence each sample multiple times, we obtained replicates by analyzing samples from similar tissues and host plants together, albeit from different localities. Both N. cervinus and N. leucoloma samples originating from native and introduced ranges were included when available but analyzed separately. The DEG levels between groups of samples were compared in a pairwise fashion (1–5 samples per group). Contrasts that fell into similar contrast categories (e.g. host plant, plant family, plant farming method, or those maintained on the natal host plant or switched to a novel host plant) were visualized together in contrast groups of varying sizes (2–15 pairwise contrasts per group).
Assessing upregulation in three targeted gene categories
To examine the role of IM, DTX, and HD gene regulation among host plant types and other conditions, composite violin/beeswarm DEG plots were constructed to visualize the number of differentially upregulated genes in categories of host detection genes (odorant binding proteins, chemosensory proteins, gustatory proteins), detoxification genes (cytochrome P450s, glutathione S-transferases, glutathione peroxidases, ABC transporters, carboxylesterases, UDP-glycosyltransferases) and immune defense genes (serine proteases/proteinases and serpins modulating the immune defense cascade, general immune response-related gene identities) in the pairwise comparisons used in each set. These were grouped by host plant, host plant family, plant farming method, or host switch condition (see Results), and broken into functional gene groups as defined above, as well as by tissue type. To analyze differential gene expression of a transcript with a certain functional annotation (e.g. an odorant-binding protein, a host detection gene), the transcript must be present in both groups in a comparison. In cases where a transcript is annotated for a function of interest in one group (e.g. legume-feeding weevils in a Legume vs. Other comparison) but is not identified in the counterpart group (e.g. weevils feeding on other hosts in a Legume vs. Other comparison), the differential expression of that gene product cannot be determined and is therefore excluded from the resulting violin plot, generating a different number of data points for each functional annotation within a plot. This does not exclude that comparison pair from being plotted for differential expression of other host detection genes (e.g. chemosensory proteins). The detoxification gene group was analyzed both as aggregate data, by summing the total number of upregulated genes in each condition, and separately by identity, by producing a violin plot that retained the gene’s functional identity information.
To examine the potential interactions of sampled tissue type and functional gene group on the weevils’ expression response to different host plants, a rank-based nonparametric pairwise ANOVA was performed using the R package Rfit [89] for each comparison group, which uses a reduction in dispersion algorithm described in [49]. This approach tested each interaction between tissue type, host plant, and gene family identity separately. If the interaction between a pair of variables was significant at α = 0.05, the effect of the interaction was considered an influence on the distribution of the number of overexpressed genes in each comparison.
Weighted expression heatmaps considering intensity of expression and number of differentially upregulated genes
Heatmaps were constructed using the R package gplots [90] to compare the weighted median intensity of expression (using log2FC values) in either direction for the three gene groups of interest. For each set of comparisons, DEGs falling into HD, DTX, or IM groups were separated, and log2FC ranges calculated separately for both positive and negative expression levels for each of those three gene groups. Comparisons that returned only one significantly upregulated or downregulated transcript in a gene group were excluded. These six expression range values (positive HD, negative HD, positive DTX, negative DTX, positive IM, and negative IM) were each split into five equal bins based on the range of expression values. This allowed the calculation of a median expression intensity, weighted by the number of genes in each bin, for both positive and negative expression in each of the three gene groups for each comparison. These weighted median expression intensities were then assembled into individual heatmaps separated by tissue type for each comparison group (S1 Fig) and synthesized into two global heatmaps (Fig 2).
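One possible reading of the binning and weighting procedure is sketched below in Python. The details are assumptions: the five bins are taken to be equal-width intervals spanning the observed log2FC range for one gene group and one direction, each bin is represented by its midpoint, and the weighted median is the first midpoint at which the cumulative gene count reaches half of the total. The function name is hypothetical, and the original analysis may have defined the weighted median differently.

```python
def weighted_median_intensity(log2fc_values, n_bins=5):
    """Weighted median expression intensity for one gene group and direction.

    log2fc_values: log2FC values of DEGs in a single direction (all positive
    or all negative). Returns None when fewer than two DEGs are present,
    mirroring the exclusion of comparisons with a single transcript.
    """
    values = list(log2fc_values)
    if len(values) < 2:
        return None
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # all values identical, no range to bin
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value falls into the last bin rather than a sixth one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    midpoints = [lo + (i + 0.5) * width for i in range(n_bins)]
    half = sum(counts) / 2
    cumulative = 0
    for midpoint, count in zip(midpoints, counts):
        cumulative += count
        if cumulative >= half:
            return midpoint
    return midpoints[-1]


# Hypothetical upregulated detoxification DEGs in one comparison.
example_intensity = weighted_median_intensity([1.0, 1.2, 1.4, 2.0, 6.0])
```

With these toy values, three of the five genes fall in the lowest bin (1.0 to 2.0), so the weighted median is that bin's midpoint, 1.5, and the single strongly expressed gene at 6.0 does not dominate the summary.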
Venn diagrams constructed using the R package VennDiagram [86] were employed to explore the number of shared or uniquely differentially expressed gene identities between comparisons. The same dataset used to build the plots for numbers of upregulated genes was used, separating by transcript identity between comparisons and retaining genes that are differentially expressed in either direction for HD, DTX, and IM genes. DEG specificity was visualized in a three-way or four-way Venn diagram, according to the comparison groups being tested.
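The shared and unique gene identities summarized by a Venn diagram correspond to exclusive set regions, which can be sketched in Python as follows. This is only an illustration of the underlying set logic, not the VennDiagram package itself; the function name, comparison labels, and gene identifiers are hypothetical.

```python
from itertools import combinations


def venn_regions(sets):
    """Map each combination of set names to the items found in exactly those sets.

    sets: dict of name -> set of DEG identities. Empty regions are omitted.
    """
    names = list(sets)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            inside = set.intersection(*(sets[n] for n in combo))
            outside = set().union(*(sets[n] for n in names if n not in combo))
            exclusive = inside - outside
            if exclusive:
                regions[combo] = exclusive
    return regions


# Hypothetical DEG identities for a three-way comparison.
deg_sets = {
    "legume": {"CYP6", "GST1", "OBP2"},
    "citrus": {"CYP6", "OBP2", "ABC1"},
    "other": {"OBP2", "UGT3"},
}
regions = venn_regions(deg_sets)
```

Here OBP2 lands in the central region shared by all three groups, CYP6 in the legume and citrus overlap, and the remaining identities are unique to one group each.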
Exploration of global expression changes specific to particular host plants or experimental conditions
As N. cervinus and N. leucoloma are not model organisms, a preliminary investigation of global expression patterns associated with host plant use was also performed using Gene Set Enrichment Analysis (GSEA) [91]. GSEA identifies functionally enriched pathways and/or families of genes for each comparison, producing a gene ontology (GO) term associated with each of these gene families/sets. Each of these sets is assigned an enrichment score, which indicates the degree to which the component genes of a gene set are overrepresented in that sample. This score is normalized to ameliorate differences in gene set size, as some gene families are bigger or more researched than others, as well as differences in expression depth. Finally, a false discovery rate (FDR) is calculated to control for multiple testing and false positive errors.
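The idea behind the enrichment score can be illustrated with a simplified, unweighted running-sum statistic in Python. This sketch is an assumption-laden teaching example, not the GSEA implementation used here: it omits the correlation weighting, the permutation-based normalization, and the FDR step, and the function name and gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    ranked_genes: genes ordered from most upregulated to most downregulated.
    The running sum rises by 1/n_hit at each gene in the set and falls by
    1/n_miss otherwise; the score is the largest deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list of six genes.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})     # set concentrated at the top
es_bottom = enrichment_score(ranked, {"g5", "g6"})  # set concentrated at the bottom
```

A gene set clustered at the top of the ranking yields a positive score and one clustered at the bottom a negative score, which corresponds to the positive and negative enrichment directions referred to in this analysis.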
Network-based visualization of gene set enrichment patterns across all gene categories
To explore the relationships between these upregulated or downregulated enriched gene sets, a hierarchical clustering analysis of gene ontology terms was performed using the Cytoscape module EnrichmentMap (Cytoscape, v. 3.7.2) [92,93]. Seventeen comparisons with the largest available sample sizes for each tissue class were selected to assemble EnrichmentMaps. Only gene sets with a false discovery rate (FDR) < 0.05 and a log2FC > 1 were included to evaluate expression differences [94,95]. Cytoscape parameters were set so that q = 0.05, and the default connectivity level was employed. The gene set list files compiled from the initial transcriptome assembly for each species were used as references. This method of visualization allows for interpretation of overlaps between different GO terms/gene sets for a gene network-oriented analysis of regulation patterns at a global level, with stringent selective criteria. To examine differences and similarities in gene set enrichment between hosts and hypotheses, Venn diagrams were constructed, in this case separating GO terms by enrichment direction (positive or negative).