Selection of internal references for transcriptomics and RT-qPCR assays in Neurofibromatosis type 1 (NF1) related Schwann cell lines

Transcriptomics has been widely applied in uncovering disease mechanisms and screening potential biomarkers. Internal reference selection determines the accuracy and reproducibility of data analyses. The aim of this study was to identify the most qualified reference genes for the relative quantitation analysis of transcriptomics and real-time quantitative reverse-transcription PCR in fourteen NF1 related cell lines, including non-tumor, benign and malignant Schwann cell lines. The expression characteristics of eleven candidate reference genes (RPS18, ACTB, B2M, GAPDH, PPIA, HPRT1, TBP, UBC, RPLP0, TFRC and RPL32) were screened and analyzed by four software programs: geNorm, NormFinder, BestKeeper and RefFinder. Results showed that GAPDH, the most used internal reference gene, was significantly unstable. The combinational use of two reference genes (PPIA and TBP) was optimal in malignant Schwann cell lines and the use of single reference genes (PPIA or PRLP0) alone or in combination was optimal in benign Schwann cell lines. Our recommended internal reference genes may improve the accuracy and reproducibility of the results of transcriptomics and real-time quantitative reverse-transcription PCR in further gene expression analyses of NF1 related tumors.


43
Neurofibromatosis type 1 (NF1) is an autosomal dominantly inherited tumor 44 predisposition syndrome that affects multiple organ systems and has a wide range of 45 variable clinical manifestations, such as pigmentary lesions, skeletal abnormalities[1-46 3]. The major defining features in NF1 patients are peripheral nerve sheath tumors, 47 including plexiform neurofibromas (pNF) and malignant peripheral nerve sheath 48 tumors (MPNST) [4]. Therefore, the majority of NF1 related tumor researches could be 49 categorized into benign tumor studies or malignant tumor studies.

50
Currently, Omics studies play an increasingly important role in uncovering disease 51 mechanisms and screening potential therapeutic biomarkers. Among different omics 52 studies, transcriptomics is the bridge between genomics and proteomics. Key methods, 53 including microarray and RNA-seq, could provide deep and precise measurements of 54 levels of transcripts and different isoforms. Several studies are conducted to figure out 55 the molecular mechanisms of NF1 and to advance the development of targeted drugs[5-56 7]. These high-throughput results need further validation by real-time quantitative 57 reverse-transcription PCR (RT-qPCR), which is considered as the gold standard for 58 gene expression studies. However, the accuracy and reproducibility of these results may 59 be affected by several factors at multiple stages. In data analysis, the stability of 60 reference genes is critical for appropriate standardization and obtaining accurate gene 61 expression data. The Minimum Information for Publication of Quantitative Real-Time 62 PCR experiments (MIQE) guidelines highlight the importance of experimental 63 validation of reference genes for particular tissues or cell types [8][9][10][11]. However, to date, 64 there is no previous research on the validation of suitable reference genes for the 65 relative quantification analysis of target gene expression in NF1 related cell lines.

73
We investigated the stability of these eleven reference genes in fourteen different 74 NF1 related cell lines, including two non-tumor NF1 +/-Schwann cell lines, five benign 75 pNF cell lines and seven malignant MPNST cell lines. Based on study design, we 76 analyzed the data obtained as two separate groups: (1) benign NF1 tumor study 77 including Schwann cell (SC) + pNF cell lines and (2)  Total RNA was extracted from all NF1 related cell lines using AxyPrep Multisource 103 RNA Miniprep Kit (Axygen, USA) according to the manufacturer's instructions. The 104 quality and quantity of RNA were measured using a NanoDrop 2000 105 Spectrophotometer (Thermo Scientific, Waltham, MA, USA) through OD260/280 and 106 OD260/230 ratios. 500 ng of total RNA was used for the cDNA synthesis reaction using 107 PrimeScript TM RT Master Mix kit (TaKaRa RR036A) according to the manufacturer's 108 instructions. After the reaction, the cDNA was diluted to 20 ng/ml. All RT-qPCR 109 experiments were performed with the same batch of cDNA.

RT-qPCR 111
Reference gene primers were designed and synthesized by Sangon Biotech 112 (Shanghai, China). The efficiency, dynamic range and specificity of all primers were 113 tested by Sangon Biotech (Shanghai, China). RT-qPCR was performed using TB 114 Green ® Premix Ex Taq™ Kit (TaKaRa RR420A) on a Applied Biosystems 7500 Real-115 Time PCR System, as described previously [18]. Each sample was measured in three 116 technical replicates. The relative quantification of RT-qPCR data were calculated using 117 the 2 −ΔΔC T method, as described previously [8] .
118 Statistical analysis 119 The stability of the eleven candidate reference genes was examined by four 120 frequently used software programs, geNorm [ 126 The threshold cycle (C t ) value was used to assess the expression characteristics of 127 the internal reference gene candidates. Higher C t values indicate lower expression 128 levels. The distribution of the C t values of eleven reference genes in fourteen samples, 129 including two NF1 +/-Schwann cell lines, five plexiform neurofibroma cell lines and 130 seven MPNST cell lines was displayed in In order to identify the best reference genes for benign and malignant NF1 tumor 138 study, we applied four statistical approaches including geNorm, NormFinder, 139 BestKeeper and ΔC t method. The M value of eleven reference genes were calculated 140 and ranked by genorm. The genorm system identifies those genes with lowest M values 141 as the most stably expressed genes and an M value under 0.5 represents relatively stable 142 gene expression. GeNorm analysis showed that PPIA and TBP, sharing an M value of 143 0.549, were the most stable reference genes in SC + pNF + MPNST cell lines (Fig 2). 144 However, when only considering SC + pNF cell lines, the most stable genes were PPIA 145 and PRLP, with an M value of 0.225, followed by TBP and S18 (M value, 0.295 and 146 0.454, respectively). The optimal number of reference genes to obtain a stable 147 normalization index was demonstrated by V-values. When SC + pNF + MPNST cell 148 lines were considered, the V8/9 value was 0.127, indicating that eight reference genes 149 combination is optimal for RT-qPCR normalization. In SC + pNF cell lines, the V2/3 150 value was 0.104, suggesting that merely two reference genes were necessary. And the 151 V8/9 value was lower than 0.110 in this group, of which the V values kept stable in 152 different subsets. Eleven reference genes underwent normalization factor calculation and were ranked 157 by NormFinder according to their minimal combined SC, pNF and MPNST gene 158 expression variation. According to the result, TBP was the optimal reference gene 159 (stability value, 0.394), followed by PPIA, ACTB and GAPDH (stability values, 0.403, 160 0.427 and 0559 respectively) (Fig 3). The best combination of two genes are GAPDH 161 and S18 with a combination stability value of 0.150. For SC + pNF cell lines, PPIA, 162 RPLP and TBP were the most stably expressed genes (stability values, 0.142, 0.185 163 and 0.186 respectively), while the rest ones were extremely unstable amongst those cell 164 lines. In best keeper algorithm, the C t values, SD (standard deviation) and CV (coefficient 169 of variance) of each gene were calculated and analyzed to identify stable reference gene 170 candidates. Under general conditions, genes with a SD greater than 1.0 are determined 171 to be unstable. Among SC + pNF + MPNST cell lines, TBP turned out to be the best 172 choice with a standard deviation [+/-CP] (which is the stability value in best keeper 173 algorithm) of 0.48 (Fig 4). The stability value of TBP kept stable (0.44) when it came 174 to SC + pNF cell lines. However, RPLP and PPIA were esteemed as more stable genes, 175 sharing a stability value of 0.42. 176 177 ΔC t analysis 178 179 In ΔC t method, the differential expression of "gene pairs" were analyzed to 180 determine the optimal reference genes. According to the results, PPIA (1.24) and TBP 181 (1.26) were the most stably expressed reference genes in SC + pNF + MPNST cell lines 182 (Fig 5). While RPLP (0.78) was far more stable in SC + pNF cell lines. 183 184 Comprehensive ranking order 185 186 Using RefFinder, we integrated four analysis approaches mentioned above to 187 comprehensively evaluate the expression stability of these candidate genes. PPIA 188 turned out to be the optimal reference gene in SC + pNF + MPNST cell lines with a 189 ranking value of 1.19, followed by TBP and S18 (ranking value, 1.41 and 3.66, 190 respectively) (Fig 6). And PRLP (1.19) was the most stably expressed in SC + pNF cell 191 lines.

193
High-throughput transcriptome sequencing identifies the key pathogenic genes that 194 play crucial roles in the pathogenesis, development and malignant transformation of 195 NF1 related neurofibroma. RT-qPCR has been considered as golden standard for gene 196 expression analysis because of its accuracy and sensitivity [27]. It is a robust and 197 specific method for the validation of the identity of these pathogenic genes that 198 aberrantly expressed in neurofibroma.

199
The ease to generate RT-qPCR data is in sharp contrast with the challenges to 200 guarantee that the results obtained are reliable. In the absence of proper reference genes, 201 data obtained are possibly inaccurate and unreproducible, as it has been shown that the 202 use of a single reference gene without validation results in a significant bias (ranging 203 from 3-fold in 25% of the results up to 6-fold in 10% of the results) [19].
204 Prior to this study, no validated reference gene has been identified for NF1 related cell 205 lines. In previous studies, GAPDH and ACTB have been most frequently used in gene 206 expression analysis of NF1 without solid validation [28][29][30]. However, the expression 207 characteristics of reference genes vary remarkably under different experimental 208 conditions and with different samples. The use of GAPDH and ACTB as an internal 209 control has been proven to be unsuitable in other samples, such as lymphoblastoid cell 210 lines and human mesenchymal stem cells [31][32][33]. Therefore, it is crucial to confirm 211 reliable and qualified reference genes for particular tissues or cell types and specific 212 experimental designs.

213
In our study, in order to find the internal reference genes stably expressed in different 214 NF1 samples, including normal peripheral nerve, benign and malignant tumor tissues, 215 we investigated fourteen different NF1 related cell lines, derived respectively from the 216 tissues mentioned above, which include two non-tumor NF1 +/-Schwann cell lines, five 217 pNF cell lines and seven MPNST cell lines. The expression stability of eleven 218 frequently used reference genes, including RPS18, ACTB, B2M, GAPDH, PPIA, 219 HPRT1, TBP, UBC, RPLP0, TFRC, and RPL32 were investigated and analyzed 220 separately in these cell lines.

221
Four different mathematical and statistical models were utilized to analyze the data 222 obtained, including geNorm, NormFinder, BestKeeper and ΔC t . Each model uses 223 different algorithms to estimate both the intra-and the intergroup expression variations 224 and rank candidate genes based on the instability score. GeNorm and NormFinder use 225 the stability (actually instability) value, ΔC t uses the average of S.D. and BestKeeper 226 uses the S.D. of the crossing points [19][20][21]. In addition, A web-based tool RefFinder 227 was finally used to evaluate the stability of candidate reference genes and identify the 228 most stable gene by calculating the geometric mean of ranking values obtained from 229 the above-mentioned four methods [22].

230
Due to the fact that the software applied are based on different mathematical models, 231 the ranking orders of reference gene stabilities varied slightly between these tools. 232 However, the top two positions of reference genes in SC + pNF group and SC + pNF + 233 MPNST group determined by geNorm were identical to those determined by 234 NormFinder, BestKeeper and ΔC t . Unanimously, according to the results of RefFinder, 235 the top two reference genes in two groups are also the same and they share similar 236 ranking values which are significantly lower than others. Therefore, conclusion can be 237 drawn that PPIA and PRLP0 are the best reference genes for normalization in benign 238 NF1 tumor study, while PPIA and TBP are as well in malignant NF1 tumor study.

239
PPIA is a gene encoding for a cyclosporin binding-protein, TBP is a gene encoding 240 for a TATA-binding protein, and PRLP0 encodes a ribosomal protein that is a 241 component of the 60S subunit. In NF1 related tumors, they were stably expressed 242 irrespective of tumor pathology and severity. However, the most frequently used 243 reference gene GAPDH was proved to be an inappropriate option in two groups (M 244 values > 0.5). As a multifunctional gene, the use of GAPDH as a reference gene has 245 been also questioned and challenged in other cancers including lung cancer [

249
In addition, considering the difference of opinion on the minimal number of 250 reference genes required for RT-qPCR, we explored the necessity of selecting multiple 251 reference genes for data normalization in our samples. In previous studies, some 252 investigators showed that the combination of more than one reference gene improved 253 the accuracy of results[14, 19, 39-41] while other investigators proved that 254 normalization with a single gene is sufficient for most research applications[12, 36, 42, 255 43]. In present study, according to the analyses of geNorm, combination of eight 256 reference genes is the most precise plan for normalization in SC + pNF + MPNST group. 257 However, considering the stability and precision of using two reference genes in 258 combination is sufficient for data normalization, we suggest using a normalization 259 factor calculated by the geometrical mean of the most stable reference genes (PPIA and 260 TBP) for normalization of target gene expression in SC, pNF and MPNST cells [44]. 261 What's more, it is worth noting that the use of single reference gene, with a M-value 262 over 0.5, should be avoided in these samples. In SC + pNF group, although the 263 combination of two reference genes significantly improved precision over 264 normalization with PPIA or PRLP0 alone, the use of single reference genes (PPIA or 265 PRLP0), sharing an M value of 0.225, is acceptable for data normalization.

267
In this study, we systematically explored the suitability of fourteen candidate 268 reference genes for normalization of gene expression in different NF1 related cell 269 lines, including non-tumor NF1 +/-Schwann cell lines, pNF cell lines and MPNST cell 270 lines. According to the results, we recommend using two reference genes (PPIA and 271 TBP) in combination for gene expression analyses in MPNST related researches and 272 using single reference genes (PPIA or PRLP0) alone or in combination in pNF related 273 studies.