Reference gene selection for quantitative gene expression analysis in black soldier fly (Hermetia illucens)

Hermetia illucens is an important resource insect for the conversion of organic waste. Quantitative PCR (qPCR) is the primary tool of gene expression analysis and a core technology of molecular biology research. Reference genes are essential for qPCR analysis; however, a stability analysis of H. illucens reference genes has not yet been carried out. To find suitable reference genes for normalizing gene expression data, the stability of eight housekeeping genes (including ATP6V1A, RPL8, EF1, Tubulin, TBP, GAPDH, Actin and RP49) was investigated under both biotic (developmental stages, tissues and sex) and abiotic (heavy metals, food, antibiotics) conditions. Gene expression data were analysed by geNorm, NormFinder, BestKeeper, and ΔCt programs. A set of specific reference genes was recommended for each experimental condition using the results of RefFinder synthesis analysis. This study offers a solid foundation for further studies of the molecular biology of H. illucens.


Introduction
The black soldier fly (BSF) Hermetia illucens is distributed in tropical, subtropical and temperate regions of the world. BSF adults do not have functional mouthparts and therefore do not bite and transmit insect-borne diseases [1,2]. On the contrary, BSF larvae can be used in forensic entomology to estimate the post-mortem interval [3,4]. BSF larvae are voracious consumers and generalist saprophagous insects [5]; therefore, the larvae have been used to bioconvert a wide range of organic waste, including food processing residue, food waste, crop straw and animal manure [6-9]. The harvested larvae are exploitable for the isolation of bioactive substances (e.g., hydrolase [10, 11], antimicrobial peptides [12] and chitin [13]). Moreover, the larvae of BSF are rich in lipids and proteins, which can be used as feed or as ingredients for poultry, livestock and aquaculture [5, 14,15]. Due to the vast economic potential of BSF, the breeding industry has developed rapidly, and research related to BSF is rapidly increasing [16,17]. Most studies have focused on the optimization of rearing conditions and substrates for improving the nutrition and biomass of BSF larvae [18]. The molecular biology of BSF, however, is poorly researched. Generally, real-time quantitative PCR (qPCR) is a popular and accurate technique for measuring target gene expression levels [19]. qPCR analysis is highly sensitive, and various factors, including the intrinsic variability of RNA, reverse transcription efficiency, amplification efficiency and diversity in RNA extraction methods, can affect its accuracy. Therefore, normalization is the key to achieve accurate target gene results [20,21]. Traditionally, housekeeping genes such as beta-actin (Actin), translation elongation factor 1-alpha (EF1), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), alpha-tubulin (Tubulin) and TATA-binding box (TBP) have been used as reference genes for the normalization of qPCR data [22,23]. However, the expression of these reference genes fluctuates substantially under different experimental conditions [24]. A systematic study is needed to determine the appropriate reference genes for each experimental species [25].
In this study, the goal was to identify candidate reference genes in H. illucens. The stability of eight housekeeping genes under selected experimental conditions was evaluated. The candidate genes include V-ATPase subunit A (ATP6V1A), ribosomal protein L8 (RPL8), EF1, Tubulin, TBP, GAPDH, Actin, and ribosomal protein 49 (RP49). The selected environmental conditions ranged from biotic to abiotic treatments. Biotic conditions included developmental stage, tissue and sex, which were commonly used in the study of reference genes stability [23,26,27]. Abiotic conditions included food, heavy metals and antibiotics. These conditions will affect the conversion of BSF on waste, so many studies have focused on these conditions [8,9,28,29]. As a result, a series of reference genes are recommended for gene expression studies in H. illucens.

Insects
H. illucens samples were obtained from a colony maintained year-round in the laboratory of the Hubei International Scientific and Technological Cooperation Base for Waste Conversion by Insects, Huazhong Agricultural University (30˚29 0 N, 114˚22 0 E). The H. illucens colony was maintained at 27 ± 2˚C, 13 L: 11 D photoperiod and 60 ± 5% relative humidity. The adults were reared in mesh cages and fed on water. Three to five blocks of three-layer corrugated cardboard were tied together with rubber bands and placed in cages to collect eggs. Eggs were held in cylindrical plastic containers (12 cm diameter, 6 cm high) with ambient humidity until eclosion. The neonatal larvae were given approximately 100 g of food (every 10 g of wheat bran was mixed with 20 ml of water), and every 48 h, (or as needed) the food was replaced. Six-day-old larvae were used in different experimental conditions. In developmental stages and during sex and tissue bioassays, the larvae continued to feed on wheat bran. In the food bioassay, 6-day-old larvae were fed wheat bran, food waste and faeces. In heavy metal and antibiotic bioassays, the 6-day-old larvae were fed wheat bran containing different concentrations of sulphonamides and cadmium.

Experimental conditions
Developmental stages. The developmental stages included 0.2 g eggs, thirty 3-day-old larvae, five 8-day-old larvae, three 13-day-old larvae, three prepupae, three pupae and four adults (half male and half female).
Sexes. Three 1-day-old virgin male and female adults were separated and sampled. Tissues. 13-day-old larvae were dissected (epidermis, fat, gut, malpighian tube, haemolymph) by sterile anatomical forceps in liquid nitrogen. Tissues from ten larvae were collected and pooled as one sample.
Food. Thirty 6-day-old larvae were pooled in cylindrical plastic containers (12 cm diameter, 6 cm high) with 100 g of wheat bran, food waste and faeces. The diet was changed every 2 days. When 60% of the larvae became prepupae, three prepupae were taken from each sample. Death was not detected during the different food treatments.
Heavy metal. The diet for the larvae was wheat bran mixed with deionized water or CdCl 2 �2.5H 2 O solution. Every 10 g of wheat bran was mixed with 20 ml of solution. Finally, the concentrations of Cd were 0, 50 and 100 mg/kg dry wheat bran. Thirty 6-day-old larvae were pooled in cylindrical plastic containers (12 cm diameter, 6 cm high) with 100 g of prepared diet. The food was replaced every 2 days. When 60% of the larvae became prepupae, three prepupae were taken from each sample. Death was not detected during the heavy metal treatment.
Antibiotic. The concentrations of sulphonamides were 0, 50 and 500 mg/kg dry wheat bran. Thirty 6-day-old larvae were pooled in cylindrical plastic containers (12 cm diameter, 6 cm high) with a 100 g food supply. The food was replaced every 2 days. When 60% of the larvae became prepupae, three prepupae were taken from each sample. Death was not detected during the antibiotic treatment.
Sampling amount in each experimental conditions was for one biological replicate, and three biological replicates of all treatments were prepared. The experimental conditions are briefly summarized in Table 1.

Total RNA isolation and cDNA synthesis
Total RNA was extracted using TRIzol reagent (Ambion, Life Technologies, USA) following the manufacturer's instructions. A NanoDrop2000 (Thermo Scientific, USA) was used to

Reference gene selection and primer design
The involved reference gene sequences were downloaded from NCBI (http://www.ncbi.nlm. nih.gov/). The candidate primers were designed by the online primer design tool for real-time PCR (http://www.idtdna.com/site) with default settings. The primers were synthesized by Tsingke Biotech (Wuhan, China). The length and identity of PCR products were assessed by gel electrophoresis and sequence analysis. Then, the sensitivity, specificity, and capacity of the correct primers were tested by the melting curve and standard curve with the Bio-Rad CFX96/384 Real-Time System. Finally, the remaining qualified primers were used for further qPCR evaluation. A standard curve was achieved for each gene by five-fold serial dilution of the template.

Quantitative real-time PCR (qPCR)
The qPCR reaction system (10 μl) contained 2.6 μl ddH 2 O, 5.0 μl TB Green MasterMix (NEW-BIOTech., Canada), 0.4 μl of forward and reverse primers (10 μM) and 100 ng first-strand cDNA. The PCR program for all the genes included an initial denaturation step for 30 s at 95˚C, followed by 40 cycles at 95˚C for 5 s and 60˚C for 30 s. Finally, a melting curve analysis from 65˚C to 95˚C was performed to confirm the specificity of the PCR products. Three technical replicates were analysed for each biological replicate.

Stability of gene expression
The stability of the eight candidate reference genes was comprehensively evaluated using RefFinder (https://www.heartcure.com.au/reffinder). RefFinder is a web-based analysis tool that integrates all four major computational programs, including geNorm [30], NormFinder [31], BestKeeper [32], and the ΔCt method [33]. All of these methods can recommend the most stable reference genes; geNorm can also calculate the pairwise variation (V) of one gene by comparing values with other genes. Finally, geNorm suggests a minimum number of reference genes by V value for normalization. If Vn/n+1 is less than 0.15, no more reference genes are needed for normalization [30].

Reference gene validation
Researches showed that the expression of heat shock protein 90 (Hsp90) in organisms will increase under heavy metal stress [34][35][36], so Hsp90 was selected as a target gene to evaluate the stability of candidate reference genes under different heavy metals. The normalization factors (NFs) were computed based on the geometric mean of genes with the lowest Geomean values as determined by RefFinder and a single reference gene with the lowest or highest Geomean value. The relative expression level of Hsp90 under heavy metal exposure was calculated using the 2 −ΔΔCt method. Data were statistically analyzed through IBM SPSS Statistics 20.0 software. Differences among the experimental results were analyzed by one-way ANOVA (Tukey's test), p < 0.05 was considered significant.

Expression profiles of reference genes
A single amplicon was designed for each candidate reference gene, and the specificity was evaluated by agarose gel electrophoresis analysis and melting curve analysis. Qualified primers used for qPCR, their PCR efficiency and regression coefficient are shown in Table 2. The efficiency values of all primers ranged between 93.2 and 108.6%, with regression coefficient values from 0.978 to 0.999, which confirmed the standard [37]. Expression profiles of the eight candidate reference genes were calculated within the selected experimental conditions (Fig 1).

Stability of reference genes under biotic conditions
For different developmental stages, NormFinder ranked stability from high to low as follows: Actin, RPL8, Tubulin, GAPDH, EF1, ATP6V1A, RP49, and TBP; BestKeeper provided the ranking as RP49, Tubulin, RPL8, GAPDH, Actin, TBP, ATP6V1A, and EF1; geNorm calculated the ranking sequence as ATP6V1A = GAPDH, Actin, EF1, Tubulin, RPL8, RP49, and TBP; and the ΔCt ranking was Actin, Tubulin, RPL8, GAPDH, EF1, ATP6V1A, RP49, and TBP (Table 3). By synthesizing the results of four programs, RefFinder recommended Actin, Tubulin and GAPDH as the top three candidate reference genes in different developmental stages. Actin was the most stable reference gene, while TBP was the least stable reference gene (Fig 2A). RefFinder showed that EF1, Actin and Tubulin were the three foremost reference genes in different tissues. Furthermore, EF1 was categorized as the most stable reference gene, whereas ATP6V1A was the least stable reference gene (Fig 2B). RefFinder ranked three different reference genes, including Tubulin, RPL8 and RP49, in different sexes. Among these genes, Tubulin was the most stable reference gene. In contrast, the reference gene EF1 was found to be the least stable reference gene (Fig 2C). Based on the above three biotic conditions, RefFinder provided a general ranking of Actin, Tubulin, EF1, RPL8, RP49, GAPDH, ATP6V1A, and TBP ( Fig  2D). According to the results of the four programs (NormFinder, BestKeeper, geNorm and Reference gene selection for quantitative gene expression analysis in black soldier fly (Hermetia illucens) ΔCt), the stability of eight candidate reference genes in response to biotic conditions is shown in Table 3.

Stability of reference genes under abiotic conditions
For different heavy metals, GAPDH, RPL8 and Actin were the top three reference genes according to the comprehensive ranking from RefFinder. Specifically, GAPDH and Tubulin were the most and the least stable reference genes, respectively (Fig 2E). For different food sources, ATP6V1A, Actin and Tubulin were the top three reference genes according to the comprehensive ranking from RefFinder. ATP6V1A and RP49 were the most and the least stable reference genes, respectively (Fig 2F). For different antibiotics, ATP6V1A, RPL8 and TBP were the top three reference genes according to the comprehensive ranking from RefFinder. ATP6V1A and GAPDH were the most and the least stable reference genes, respectively ( Fig  2G). Based on these three abiotic conditions, RefFinder offered a general ranking of TBP, Actin, ATP6V1A, RP49, EF1, RPL8, GAPDH, and Tubulin ( Fig 2H). According to the results of four programs (NormFinder, BestKeeper, geNorm and ΔCt), the stability of eight candidate reference genes in response to abiotic conditions is shown in Table 4.

Optimal number of reference genes
The optimal number of reference genes was determined using geNorm by the paired variation (Vn/n+1) value. If Vn/n+1 was less than 0.15, it was unnecessary to apply an additional reference gene, i.e., top-ranking n reference genes were sufficient to use as control genes. Based on the comprehensive ranking (Fig 2) and paired variation value (Fig 3), reference genes under each experimental condition were determined. The results of the geNorm analysis showed that the value of V2/3 was < 0.15 under different developmental stages. Therefore, Actin and Tubulin were the most appropriate reference genes. For different tissues, the recommended reference genes were EF1, Tubulin, Actin, RP49, TBP and RPL8. In different sexes, Tubulin and RPL8 were recommended. GAPDH and RPL8 were considered the most appropriate reference genes for different heavy metals. The analysis of geNorm predicted that the values of V2/3

Reference genes validation
Hsp90 was selected as a target gene to validate the accuracy of the identified reference gene under different heavy metal conditions. The expression of Hsp90 was calculated by using the most stable reference gene GAPDH (NF1), the top two stable reference genes GAPDH and RPL8 (NF1-2), and the least stable gene Tubulin (NF8) for normalization (Fig 4). The expression profile showed a similar trend in both NF1 and NF1-2. When using NF1 and NF1-2 as reference genes, the expression of Hsp90 was not significantly different in 0 mg/kg of cadmium (HM0) and 50 mg/kg of cadmium (HM50), and the expression in HM0 was significantly lower than that in 100 mg/kg of cadmium (HM100). In comparison, when NF8 was used as a reference gene, Hsp90 expression was no significant difference between HM0 and HM100 (Fig 4).

Discussion
Because of its high sensitivity, specificity, rapidity and reliability, qPCR is considered the most suitable method for the quantification of gene expression [38]. Reliable reference genes are particularly important in eliminating heterogeneity in various datasets and ensuring the accuracy of quantitative results [39]. A gene that maintains a constant expression level under different experimental conditions is considered an ideal reference gene. However, not all reference genes have this characteristic in different species. According to Bustin et al. [38], the stability Reference gene selection for quantitative gene expression analysis in black soldier fly (Hermetia illucens) of reference genes has been studied in many insect species [23,26,27]. As a potential biomaterial for the conversion of waste, the stability of reference genes of H. illucens plays an important role in understanding molecular mechanisms. To the best of our knowledge, this is the first report on the identification of appropriate reference genes for qPCR analysis in H. illucens.
In this study, we evaluated the stability of eight candidate reference genes for H. illucens gene expression analysis under six experimental conditions (developmental stages, tissues, sex, heavy metal, food, and antibiotics). The first three biotic conditions have been used to evaluate the stability of reference genes in many insects [26,27,40]. The latter three abiotic conditions were selected according to the high frequency of their occurrence as experimental treatments in the study of H. illucens. First, as a saprophytic insect, H. illucens can transform a wide range of organic wastes, most of which are faeces and waste food [41][42][43]. As a control diet [44], wheat bran was also analysed in this study. Second, due to exposure to heavy metals and antibiotics in the environment (especially in pig manure), there have been many studies on heavy metals and antibiotics in H. illucens [45][46][47].
The stability ranking of reference genes was different between the different experimental conditions. For example, Actin was the most stable reference gene in the developmental stages, while it was the third most unstable gene in the different sexes. TBP was the most stable reference gene in all abiotic conditions and the least stable in biotic conditions (Fig 2). Despite some differences in individual rankings, Actin was consistently maintained at a higher level of stability than the rest of the reference genes (Fig 2). Previous results showed Actin to be an optimized reference gene in Drosophila melanogaster, Bemisia tabaci, Chilo suppressalis and Bombyx Mori [48][49][50][51]. In summary, some genes can be used as reference genes under many experimental conditions, but the best way to find suitable reference genes is to evaluate their stability under specific experimental conditions.
Recently, a single gene has been replaced by multiple genes to normalize the expression level of target genes in qPCR analyses [52]. Traditionally, the optimal number of reference genes is customarily determined by geNorm [30]. In this study, our results not only ranked the stability of genes under different experimental conditions but also provided the optimal Reference gene selection for quantitative gene expression analysis in black soldier fly (Hermetia illucens) number of reference genes to improve the accuracy of qPCR analysis (Fig 2 and Fig 3). Two reference genes were recommended for different developmental stages (Actin and Tubulin), sex (Tubulin and RPL8), heavy metals (GAPDH and RPL8), food (ATP6V1A and Actin) and antibiotics (ATP6V1A and RPL8), while six reference genes were required for reliable normalization in tissues (EF1, Tubulin, Actin, RP49, TBP and RPL8). However, our recommended reference genes were selected from eight candidate reference genes, which does not mean that other candidate genes cannot be used. It would be more appropriate for potential audiences to verify these results flexibly than to apply them directly.
In conclusion, this study screened a series of reference genes for future work on gene expression in H. illucens and provides a solid foundation for further molecular biology research to understand how this important resource insect copes with harsh environmental conditions. Supporting information S1