Validation of Reference Genes for Real-Time Quantitative PCR (qPCR) Analysis of Avibacterium paragallinarum

Real-time quantitative reverse transcription PCR (qRT-PCR) offers a robust method for measurement of gene expression levels. Selection of reliable reference gene(s) for gene expression study is conducive to reduce variations derived from different amounts of RNA and cDNA, the efficiency of the reverse transcriptase or polymerase enzymes. Until now reference genes identified for other members of the family Pasteurellaceae have not been validated for Avibacterium paragallinarum. The aim of this study was to validate nine reference genes of serovars A, B, and C strains of A. paragallinarum in different growth phase by qRT-PCR. Three of the most widely used statistical algorithms, geNorm, NormFinder and ΔCT method were used to evaluate the expression stability of reference genes. Data analyzed by overall rankings showed that in exponential and stationary phase of serovar A, the most stable reference genes were gyrA and atpD respectively; in exponential and stationary phase of serovar B, the most stable reference genes were atpD and recN respectively; in exponential and stationary phase of serovar C, the most stable reference genes were rpoB and recN respectively. This study provides recommendations for stable endogenous control genes for use in further studies involving measurement of gene expression levels.


Introduction
Infectious coryza is an acute upper respiratory tract disease of chickens. This disease is of worldwide economic significance and affects both broiler and layer flocks, manifested primarily as a drop in egg production (10±40%) in layer flocks and retardation of growth due to decreased feed and water consumption in breeder and broiler flocks. The most common clinical signs are nasal discharge, conjunctivitis, facial oedema, lacrimation, anorexia, and diarrhea [1].
The causative agent of infectious coryza is Avibacterium paragallinarum (A. paragallinarum) [2], and A. paragallinarum is classified into three serovars: A, B, and C according to the Page schemes [3]. It is widely accepted that the three Page serovars represent distinct "immunovars," since inactivated vaccines based on any one Page serovar provide no protection against the other two Page serovars [4]. However, little is known about the differences among these three serovars, either the genes defining each serovar or the expression of these genes.
Real-time quantitative reverse transcription PCR (qRT-PCR) is a robust and sensitive method for measurement of gene expression and characteraization of gene regulation. In most qPCR studies, internal reference genes are used to eliminate sample-to-sample variations that may arise due to test variation including differences in cell numbers and efficiency of RNA isolation and reverse transcription [5]. Since "housekeeping" metabolism of prokaryotes is highly variable depending on experimental procedures [6], selection of reference genes is crucial for the accuracy of a qRT-PCR test. Once the reference genes are selected, any changes in target gene expression can be expressed in relation to those of the reference genes. A single gene is often selected as the reference gene but Vandesompele et al [7] suggested that multiple carefully selected housekeeping genes were recommendable and more suitable for accurate normalization.
Development of an effective qPCR for defining gene expression in serovars A, B, and C of A. paragallinarum is urgently needed. However, information concerning reference genes as candidates for a qPCR against A. paragallinarum is very limited, primarily due to a lack of understanding the genome organization of serovars A, B, and C of A. paragallinarum. In this study, nine candidate reference genes encoding 16S ribosomal subunit (16S rRNA), the DNA gyrase subunit A (gyrA), theβ-subunit of RNA polymerase (rpoB), the glucose-6-phosphate isomerase (pgi), the DNA repair protein (recN), the translation initiation factor 2 (infB), the DNA gyrase subunit B (gyrB), the β-subunitof the ATP synthase (atpD) and the Mn-dependent superoxide dismutase (sodA), respectively, were chosen for validating the reference genes for qPCR of A. paragallinarum. The majority of these genes were recognized as housekeeping genes in the family of Pasteurellaceae and used in phylogenetic analysis [8][9][10][11][12][13]. Moreover, five of the nine genes (including 16S rRNA, gyrA, rpoB, atpD and gyrB) were also used for qRT-PCR normalization in Actinobacillus suis and Haemophilus ducreyi [14,15].
To date, no study has systematically investigated reference genes for A. paragallinarum. In order to verify the stable expression genes and determine whether they are suitable for normalization of qPCR data for A. paragallinarum, we performed molecular biological analysis of their expression stability. The objective of this work was to validate internal reference genes for a qRT-PCR of serovar A, B, and C strains of A. paragallinarum. The expression of nine reference genes was examined during different growth phase. The results of this study will be helpful for gene expression normalization of qPCR in serovars A, B, and C of A. paragallinarum.

Bacterial cultures and sample processing
The three reference strains 221, 0222, and Modesto of the A. paragallinarum serovar A, B, and C respectively were kindly provided by Dr. Pat Blackall, University of Queensland, Australia. Tryptic Soya Broth (TSB) and Tryptic Soya Broth Agar (TSA), supplemented with 10% (v/v) chicken serum and 0.0025% (w/v) reduced Nicotinamied adenine dinucleotide (NAD) were used for propagation and maintenance of these three strains. The cultures were grown in a shaking incubator at 37˚C. Broth Cultures of serovar A, B and C were monitored by OD600 measurement every 40 min, and samples were harvested from the exponential and stationary phase by centrifugation, followed immediately resuspended in RNAlater (Ambion, Carlsbad, CA). Samples were then stored at 4˚C until the test.

Primer design and validation
Primers were designed from GenBank sequences with the aid of primer analysis software Primer3plus (http://www.primer3plus.com/cgi-bin/dev/primer3plus.cgi; Version 2.4.0) [16,17]. Then confirmed through DNAMAN and NCBI/Primer-BLAST. Characteristics of the primers are listed in Table 1. The length of the amplicons were kept between 100-250 bp as much as possible for optimal amplification efficiency. The effectiveness of the primers was confirmed by conventional PCR and product size observed by electrophoresis on 1.5% agarose gels. All nine primers produced single amplification products as expected (data not show).

RNA extraction and cDNA synthesis
Total RNA was extracted from 0.5 ml aliquot of bacterial samples collected at different growth phase using Trizol RT extraction system (Invitrogen, carlsbad, CA) following the manufacturer's instruction. The extracted RNA was re-suspended in DEPC-treated water (Life Technologies) and the concentration and purity of RNA were determined by NanoDrop10001 spectrophotometer (NanoDrop Technologies Inc, Wilmington, DE, USA). RNA samples with 260nm/280nm ratio between 1.9 and 2.1 were prepared in equimolar aliquots for further tests. cDNA was synthesized from 200 ng of each RNA sample [18] using a Reverse Transcription Kit (Tiangen, China). Prior to cDNA synthesis, genomic DNA (gDNA) in the RNA samples were removed by incubation with a gDNA buffer at 42˚C for 3 min as described in RNA reverse transcription kit (Tiangen, China). Reverse transcription reactions were performed in a MasterCycler 1 Gradient Thermal Cycler under the following conditions: 42˚C for 15 min, 95˚C for 3 min. The cDNA samples were placed immediately on ice at the end of the reactions and then stored at -20˚C for later use.

Real-time quantitative PCR
Real-time quantitative PCR (qRT-PCR) was performed with Bio-rad IQ5 (Bio-Rad, Hercules, CA) using 2X SYBR Green iTaq mixture (Tiangen, China) in a total reaction volume of 12.5 ul. The reaction mixture consisted of: 6.25 ul of 2X SYBR Green iTaqmixture, 0.25 ul forward/ reverse primer mix with an initial concentration of 10 uM, 1 ul of cDNA (1:2 dilution) and DEPC-treated water added to12.5 ul. Each sample was tested in triplicate. The cycling condition was as follows: 3 min denaturation at 95˚C, followed by 40 cycles at 94˚C for 40 s, 56/58˚C for 40 s and 72˚C for 40 s.

Data and statistical analysis
Cycle threshold (CT) values, also known as Cq recommended by Bustin [19], were recorded for all qPCR reactions. Two of the most widely used statistical algorithms, geNorm v3.5 and NormFinder [20,21] were used to evaluate the expression stability of reference genes. The comparative ΔCT method [22] was used to rank candidate reference genes, with the lowest standard deviation considered to signify the highest stability. The geNorm algorithm determines the most stable combination of reference targets based on the geometric mean of the most stable control genes to generate a stability value (M). The goal of the alanysis was to choose two or more reference genes to obtain more reliable quantitative results according to the pairwise variance analysis of normalization factor (V n/n+1 ). NormFinder Excel applet, as a similar calculation method as geNorm and also based on relative expression levels, was used to assess reference gene stability based on both intra-and inter-group variations. NormFinder was used to identify genes with the lowest standard deviation (SV) as an indication for highest stability. The comparative ΔCT method was subsequently used to further evaluate gene expression stability. This method compares the relative expression of pairs of genes within each treatment and selects the most stable reference gene by assessing expression stability based on standard deviation derived from CT values. If the ΔCT values between pairs of genes remain constant for all samples tested, it means these two genes are either stably expressed or co-regulated. However, if the ΔCT values vary logarithmically, as reflected by higher standard deviations, it indicates that one of these two reference genes is variably transcribed. In such event, ΔCT analysis was performed to compare each gene with all other genes and standard deviation of each gene was obtained. The reference gene with the lowest standard deviation was then selected as the most stable reference gene. Overall ranking of reference genes was confirmed by using the geometric mean of the rankings generated from the individual algorithms [23].

Analysis of gene expression using different reference genes for normalization
In order to evaluate the impact of using different reference gene for normalization on the expression levels measured by qRT-PCR, the two best-ranked, the two middle-ranked reference genes and a least-ranked reference genes were used to calculate the expression levels of hypothetical gene of interest along with the growth time for different serovars of A. paragallinarum. Expression levels of the hypothetical gene of interest were generated with normalization using the most stable reference genes, the stable reference genes and the least stable reference gene. The effect of reference genes with different ranks of stability was assessed by variation tendency of expression level for the hypothetical gene of interest along with the growth time for strains of A. paragallinarum.

Primer amplification efficiency
The efficiency, linear dynamic range and specificity of nine pairs of primers were evaluated in qPCR with a series of five-fold dilution starting at 1:5 for a total of 5 dilutions. The efficiencies (E) of all primer pairs ranged from 88.2 to 112%, and correlation coefficients (R 2 ) were all higher than 0.98 (Table 1), both being considered acceptable [6,24]. Primer specificity was verified by the presence of a single-peak in the melting curve analysis in qRT-PCR (data not shown).

Expression profiles of candidate reference genes
In this study, certain variations in the expression levels of the nine candidate reference genes were observed in serovars A, serovars B and serovars C as shown in Fig 1A, Fig 1B and

GeNorm analysis
The expression stabilities of the nine candidate reference genes were analyzed using geNorm algorithms. High gene expression variability results in high M values and indicates low expression stability. For overall comparison, samples from three serovars at each growth phase (exponential and stationary) were calculated. In cultures of serovar A, gyrA (M = 0.468 and 0.428) was found to be most stably expressed gene while sodA (M = 1.781 and 1.257) was the least stably expressed both in exponential and stationary growth phase ( Table 2). In cultures of serovar B, the most stable genes were found to be atpD (M = 0.438) and recN (M = 0.355) in exponential and stationary growth phase respectively, and the least stable gene was sodA (M = 1.046 and 0.686) ( Table 3). In cultures of serovar C, rpoB (M = 0.347) and recN (M = 0.436) were found to be the most stably expressed genes in the exponential and stationary growth phase respectively while sodA (M = 1.154 and 1.026) was the least stably expressed both in the exponential and the stationary growth phase (Table 4). When combining exponential and the stationary phase, the most stable reference genes were 16S rRNA, recN and gyrA.
In addition to the ranking of the candidate reference genes, geNorm also recommends using optimal number of required reference genes and provides calculations of the impact of adding additional reference genes on normalization (V n / n+1 ). If a pairwise value (V n / n+1 ) is no more than 0.15, then there is no need to choose n+1 reference genes. In our study, the best combination of reference genes assessed by geNorm analysis was gyrA and recN, atpD and gyrA, recN and rpoB in exponential growth phase for serovars A, B, and C respectively (with V 2/3 = 0.053, 0.12, and 0.053, respectively). In stationary growth phase of serovars A, B, and C, however, the stability ranking-order seemed to vary with serovars. The optimal numbers and best combinations of reference genes in this phase were gyrA and rpoB, gyrA and pgi, rpoB and atpD for serovars A, B, and C respectively (with V2/3 = 0.052, 0.101, and 0.085, respectively). When combining the exponential and stationary growth phase, the optimal numbers and best combinations of reference genes were recN and 16S rRNA, recN and 16S rRNA, recN, rpoB and gyrA for serovars A, B, and C respecitively (with V2/3 = 0.086, V2/3 = 0.110, and V3/4 = 0.096, respectively).

NormFinder analysis
Stabilities of the expression of the nine reference genes were evaluated using NormFinder analysis. A lower stability value indicates a more stably expressed reference gene. NormFinder also suggests the best combination of two reference genes for quantitative real-time PCR normalization. For serovar A cultures, NormFinder identified gyrA, atpD, and gyrA as the most stable reference genes in exponential, stationary, and combined both growth phase, respectively (Table 5). For serovar B cultures, the most stable genes were found to be atpD, recN, and gyrA in the three growth phase respectively (Table 6). For serovar C cultures, the most stable genes were found to be gyrA, gyrB, and gyrB in the three growth phase respectively (Table 7).

ΔCT analysis
The comparative ΔCT method was used to assess the best reference gene by comparing standard deviation of a particular gene with all other genes. The results of the comparative ΔCT method analysis were similar to those from geNorm and Normfinder. However, both major and minor differences still exist in the tested samples from the other two analysis methods. A summary of the full results can be seen in Table 8.

Impact of reference gene selection on gene expression studies
To determine the effect of a poorly ranked reference gene on a gene expression study, we performed a 50S ribosomal protein L33 expression analysis using data from the cultures of serovar B of A. paragallinarum in stationary growth phase. Expression of 50S ribosomal protein L33 was assessed using two highly ranked genes: recN, 16S rRNA, two middle-ranked reference genes: gyrA and atpD and a poorly ranked gene-sodA. The results revealed a difference in tendency of expression level for 50S ribosomal protein L33 along with the growth time. There was a same tendency when normalised to the two most stable reference genes and two stable reference genes, as opposed to inconsistent result when normalised to the least stable reference gene (Fig 2). Therefore, by using the least stable reference gene sodA for normalization significantly changed the calculated expression level of 50S ribosomal protein L33 which could lead to large error alterations in study results.

Discussion
Real-Time quantitative PCR is among the most powerful tools for detection of expression levels of target genes. Endogenous reference genes are widely used in qPCR assays for normalization because their stability in different experimental conditions and biological treatment system [18,25].Housekeeping genes, such as 18S and 28S rRNA, GAPD (3-GAPDH), ACTB (actin), and TUBLIN (tubulin), etc., are often used as reference genes in relative quantification as the proteins encoded s are essential for maintaining cellular activities and are stably expressed in different tissues and organs [25,26]. However, the expression levels of reference genes are only conditionally stable and are subject to different species, experimental conditions, growth phase, biotic and abiotic stimulations. Each particular experimental condition, therefore, has its suitable stable expression reference genes [27][28][29] and none of the commonly used reference gene is universal [30][31][32]. For example, four genes, rpoB, atpD, gyrA and gyrB, were found to be most stable candidate reference genes; whereas the expression of 16S rRNA, a commonly used reference gene in many of studies, has been found to be unstable [33]. Furthermore, the expression of some genes varies depending on different growth conditions and stimulations. One study found that pyk and rpoB performed most stably when comparing aerobic and epinephrine cultures during growth phase; whereas when analyzing exponential and stationary growth phase together, only pyk remained in the top three rankings and the 16S rRNA has been demonstrated unstable under certain study conditions [14]. Choosing a suitable reference gene for gene expression research based on experimental conditions is critical for valid analysis. In addition, the atpD gene we identified is a highly conservative and stable gene in Pasteurellaceae, and is commonly used as a molecular biomarker [9] for bacterial categorization and widely used as an reference gene in regenerating phylogenetic tree for Pasteurellaceae in recent studies [34]. However, current study is the first to demonstrate its existence in A. paragallinarum. To provide reference for future studies, we sequenced A. paragallinarum atpD gene of serovar A, B and C strains and deposited those into GenBank (KXO78457, KXO78458 and KXO78459). In this study, no universal reference gene was found for all environments and phase. This was probably contributed by variations among different serovars, i.e. the stable reference genes in one serovar might not necessarily be stable in another serovar. Then, the expression level of the same reference gene might vary at different growth phase, such as exponential and stationary phase. Therefore, when choosing reference gene for normalization of expression levels, the timing and serovars need to be taken into consideration for most accurate estimation of expression levels.
The gyrA and rpoB, which have been evaluated in Actinobacillus [14], Staphylococcus aureus [35], and Xanthomona s [33], and demonstrated most stable expression characteristics, also exhibited stability in our study. 16S rRNA, a widely used reference gene for species classification, had a low favorably ranking despite its high-abundance with mRNA transcription. Interestingly, compared to other reference genes, sodA, which was usually used for strain identification and species classification [10,35], had a high expression level in serovar B, but not in A and C (with CT = 35), and amazingly kept CT values unchanged regardless of assay conditions. This meant that expression levels of sodA was very low in serovar A and C beyond the limit of detection for the quantitative PCR, sodA was therefore excluded from the ΔCT analysis. In addition, the stabililty ranking of sodA was at the bottom for serovar B, which further confirmed that sodA was not suitable for normalization in qPCR assays of A. paragallinarum. The stable ranking of reference genes analyzed by geNorm was generally consistent with that by NormFinder, with minor differences in individual reference genes. For example, the top four stable reference genes analyzed by geNorm were gyrA, atpD, rpoB, and recN; while NormFinder recommended the use of atpD、gyrB、gyrA, and rpoB for normalization. The variations may be explained by different parameter settings, assumptions and algorithms in geNorm and NormFinder when calculating the gene expression stability of the reference genes. Thus, a combination of two or more software was needed for the stability ranking of candidate genes when selecting reliable reference genes [36].
In conclusion, suitable reference gene candidates were selected for use in serovars A, B, and C of A. paragallinarum in different growth phase. For serovar A, gyrA and atpD were the most stably expressed in exponential and stationary phase respectively; for serovar B, atpD and recN were the most stably expressed in exponential and stationary phase respectively; for serovar C, rpoB and recN were the most stably expressed in exponential and stationary phase respectively.