Identification of transcriptional biomarkers by RNA-sequencing for improved detection of β2-agonists abuse in goat skeletal muscle

In this paper, high-throughput RNA-sequencing (RNA-seq) was used to search for transcriptional biomarkers for β2-agonists. In combination with drug mechanisms, a smaller group of genes with higher detection accuracy was screened out. Unknown samples were first predicted by this group of genes, and liquid chromatograph tandem mass spectrometer (LC-MS/MS) was applied to positive samples to validate the biomarkers. The results of principal component analysis (PCA), hierarchical cluster analysis (HCA) and discriminant analysis (DA) indicated that the eight genes screened by high-throughput RNA-seq were able to distinguish samples in the experimental group and control group. Compared with the nine genes selected from an earlier literature, 17 genes including these nine genes were proven to have a more satisfactory effect, which validated the accuracy of gene selection by RNA-seq. Then, six key genes were selected from the 17 genes according to the variable importance in projection (VIP) value of greater than 1. The test results using the six genes and 17 genes were similar, revealing that the six genes were critical genes. By using the six genes, three positive samples possibly treated with drugs were screened out from 25 unknown samples through DA and partial least squares discriminant analysis (PLS-DA). Then, the three samples were verified by a standard method, and mapenterol was detected in a sample. Therefore, the six genes can be used as biomarkers to detect β2-agonists. Compared with the previous study, accurate detection of β2-agonists abuse using six key genes is an improvement method, which show great significance in the monitoring of β2-agonists abuse in animal husbandry.


Introduction
Abuse of β 2 -agonists in animal husbandry has become a common practice in recent years. Among mainstream methods of detecting drug residues [1,2], instrumental analysis, immunoassays, sensor-and chip-based analysis have received tremendous attention, and substantial improvement has been made on them. These methods, however, are all based on known β 2 -a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 agonist drugs. New drugs and unregulated drugs are overlooked by regulatory departments, which continuously impose threats on the safety of consumers. Therefore, it is necessary to develop a new screening technique for effective large-scale monitoring of homogeneous drugs, especially new drugs.
With rapid development of various -omics technologies, information can be extracted from -omics data using statistical tools on the same platform, making it possible to find out highly effective and sensitive biomarkers. Previous studies have shown that β 2 -agonists affect gene expression in different tissues and organs, providing a basis for biomarkers search at the transcriptional level. In organisms, β 2 -agonists combined with β 2 -adrenergic receptors trigger the coupling between receptors and guanine nucleotide-binding protein (G protein). In this way, adenylate cyclase (AC) can be activated, which converts adenosine triphosphate (ATP) into 3,5-cyclic adenosine monophosphate (cAMP) and causes a series of in vivo reactions [3][4][5]. β 2agonists also help regulate the concentrations of Na + and Ca ++ in cells [6,7], treat inflammation [8,9] and relax smooth muscle [10]. Ractopamine reportedly induced the change in the expression amount of β 2 -adrenergic receptor (β 2 -AR) in porcine smooth muscle [11], as well as the expression level of fatty acid synthase (FAS) in adipose tissue of fattening pigs [12]. Zilpaterol may trigger the variation in the expression of β 2 -AR in cattle muscle tissues [13]. The use of clenbuterol alter the expression of apolipoprotein R (apoR) in porcine fat cells and bovine smooth muscle cells [14], in addition to the abundances in the insulin-like growth factor-1 (IGF-1) in gracilis cells of mice [15], β 2 -AR in muscle cells of the left ventricles of mice [16], SOCS box protein 15 (ASB15) in longissimus muscle on the cattle back [17] and muscle growth inhibitor (mRNA) in breast muscle cells of chickens [18]. After ractopamine was fed to pigs, IGF-1 expression in smooth muscle cells was changed [19]. By using different β 2 -agonists to treat animal subjects, the preceding researches aim to verify the genes that cause obvious differences in the expression. However, using these genes as biomarkers to monitor β 2 -agonists abuse has not yet been reported.
The aim of this study is to find biomarkers for β 2 -agonists at the transcription level, which are expected to monitor misuse of β 2 -agonists in livestock husbandry to compensate for regulatory flaws. In the earlier research [20], nine genes selected from literatures including β 2adrenergic receptor (ADRB2), cAMP-dependent protein kinase (PRKACB), adenylate cyclase 3 (ADCY3), Na + /K + -ATPase (ATP1A3), Ca ++ -ATPase (ATP2A3), parathyroid hormone (PTH), myosin light chain kinase (MYLK), tumor necrosis factor-α (TNF-α) and interleukin-1β (IL1B) were used to differentiate experimental group samples from control group samples. The relative expression fold and changing trend of each gene were consistent with the mechanism of β 2 -agonist drugs. The correlation analysis results showed that the predicted values using the regression formula established based on the nine genes and actual sample residual values were highly correlated with the actual residual values of the samples. The results indicated that only ractopamine caused changes in the expressions of the nine genes, which could be used as biomarkers to indicate the use of ractopamine. Because only ractopamine was used in animal experiments to treat goats, further investigation is required to determine whether this group of genes is effective in monitoring other β 2 -agonist drugs.
In this study, RNA-sequencing was used as a non-targeted-omics technique to optimize the group of genes selected through a targeted -omics method [20]. Compared with qRT-PCR, RNA-seq has higher detection throughput and a wider detection range, which provides a possibility of finding more biomarkers. To find more accurate biomarkers at the transcriptional level for β 2 -agonist drug misuse monitoring, this study improved the previous study [20] with a high-throughput RNA-seq technology. This approach is expected to further screen target genes to obtain more stable, sensitive and efficient biomarkers for higher detection precision. Biostatistical methods like principal component analysis (PCA), hierarchical clustering analysis(HCA) and discriminant analysis(DA) were applied to test if the selected genes could get a clear distinction between 27 treatment and 27 control samples. This study also used VIP (Variable Influence on the Projection) values to screen potential biomarkers to determine key factors among the selected genes, and genes with VIP>1 are considered to have critical influence on the model according SIMCA-p, which were expected to reduce the monitoring cost and simplify operations. The six key factors were then employed to predict whether drugs were used in 25 unknown samples through DA and partial least squares discriminant analysis (PLS-DA). Subsequently, a standard method (Detection of β-receptor agonist residues in animal-source food by LC-MS/MS, Notice #1025- , Ministry of Agriculture, China) was used to verify the positive samples and determine the drug types and names, which would prove the effectiveness of the potential biomarkers.

Animal experiments and sample collection
Fifty-four healthy male goats weighing 30±2.5 kg were selected and evenly divided into control and experimental groups. Before the experiment, blank feedstuff was given to the goats for one-week observation. Ractopamine was orally given to goats every morning at 1 mg/kg (body weight) in the experimental group before feeding for 28 consecutive days (dosing days). Then, the additive was canceled for 21 consecutive days (withdrawal days) until the end of the experiment. Three animals were sacrificed under local anaesthesia separately on doing days 7, 14 and 21 and on withdrawal days 0, 1, 3, 7, 14 and 21 after drugs were no longer fed (days). About 5g meat cut from the longissimus muscle on the back, washed with sterile phosphate buffer to remove fat tissues and connective tissues, and immediately moved to liquid nitrogen. Then, they were stored under -80˚C until analysis. To verify the effectiveness of the selected target genes, muscle tissue samples of 25 healthy male goats were randomly obtained from slaughter lines in Dongsheng Slaughter House (Bayannaoer, Inner Mongolia, China). After the animals were slaughtered, 100 mg of longissimus muscle tissue on their backs was immediately placed into RNAlater (Tiangen Biotech, Beijing, China) and then stored under -80˚C until analysis.

Gene expression level analysis
TRIzol 1 reagents (Invitrogen, CA, USA) were used for RNA extraction following specified instructions. The total RNA concentration was measured by NanoDrop 2000c UV spectrophotometer (Thermo Scientific, Massachusetts, USA) and the OD260/280 value was recorded to determine the RNA purity. Agilent 2100 Bioanalyzer (Agilent, CA, US) was used to assess RNA integrity, with the RNA integrity number (RIN) used as an RNA quality parameter.
The first chain of cDNA was synthesized using SuperScript VILO cDNA synthetic reagent kit (Invitrogen, CA, US) according to the manufacturer's instructions to acquire the final total RNA concentration. In this manner, the total RNA addition volume was determined as about 1 μg. In the reaction system, 4 μL of 5X VILO™ reaction mix and 2 μL of 10X SuperScript 1 enzyme mix were added before DEPC-treated to obtain 20 μL solution. The reaction was performed under 25˚C for 10 min, 42˚C for 60 min, 85˚C for 5 min, then cDNA was stored at -20˚C until use.
To select target genes, on the 21 st dosing day and third and 21 st withdrawal days, highthroughput RNA-seq was performed on the experimental group samples and corresponding control group samples with Illumina HiSeq™ 2500. The criteria for selecting target genes were: high statistical significance (p < 0.01); relative expression fold (x-fold, experimental group/ control group) was lower than 0.6 or higher than 3 when compared with the control group; xfold values were not significantly different on the 21st feeding day, 3rd withdrawal day, and 21st withdrawal day, which guaranteed long-term validity and stability of target genes. In addition, the gene signal pathways were considered to ensure the specificity of genes to the largest extent [21][22][23].
Real-time quantitative PCR 7500 system (Applied Biosystems, CA, US) was used to measure the relative expression amounts of candidate genes. SYBR 1 Premix Ex Taq™II (Tli RNa-seH Plus; TAKARA, Dalian, China) was applied in the reaction according to the instructions of the reagent kit. The materials in the system were 10 μL of SYBR 1 Premix Ex Taq II, 0.8 μL of primers (10 μM), 0.8 μL of reverse primers (10 μM), 6 μL of double-distilled water (ddH 2 O) and 2 μL of cDNA. The reaction was performed in an 8-strip tube (Axygen, Silicon Valley, USA) with a pipette gun epMotion5075 (Eppendorf).
The following real-time PCR cycling protocol was employed in two steps. First, pre-denaturation was performed at 95˚C for 30s. Then, denaturation was performed at 95˚C for 5s and reaction was allowed at the annealing temperature for 34s as described in Table 1 (40 cycles). The dissolving curve step was finally added on the PCR instrument. The threshold cycle (Ct) and melting curve were both obtained through the SDS software of real-time quantitative PCR 7500 instrument (Applied Biosystems, USA). Subsequent data analysis was performed with the melting curve characterized with only one peak, whereas data with irregular melting peaks were excluded from the quantification procedure. The relative quantitative method was used to process raw data using the following formulas: 4Ct ¼ Ctðtarget geneÞ À Ctðreference geneÞ 4 4 Ct ¼ 4Ctðexperimental groupÞ À Average 4 Ctðcontrol groupÞ

Data processing
The expression ratio of the treatment group to the control group is expected as 2 -ΔΔCt and represents the x-fold regulation with a value of 1.00, indicating no expression change after the treatment. The measurement was conducted using the WPS software (Kingsoft, Beijing, China). SPSS 22.0 (IBM, CA, US) was used to perform data analysis aiming at determining genes with significant differences, which could be used as candidate biomarkers to identify drug dosing. In the Student-t test results, p 0.05 indicates a significant difference, and p 0.01 indicates a highly significant difference.
Biostatistical methods like PCA, HCA and DA were applied to test if the selected genes could get a clear distinction between 27 treatment and 27 control samples. PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components (PC). The normalized expression values of all responding genes were taken as the initial variables and reduced to two principal components only, thus facilitating resolution of treatment clusters in the scatter plot. By this, the data was reduced to a small number of dimensions (two dimensions here, PC1 and PC2) that can be plotted in a scatter plot. This transformation is defined in such a way that the first principal component (PC1) has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component accounts for as much of the remaining variability as possible. Figures of PCA provide an intuitive way to show if there is a clear distinction between control and treatment groups. DA is a pattern recognition and machine learning method to find a combination of features that characterize or separate two or more classes of objects or events. Results of cross-validation demonstrate the effectiveness and reliability of the model. PCA was carried out by SPSS 22.0 to verify the validity of the preceding candidate biomarkers. In addition, HCA and DA were used to verify the PCA results with different mathematical principles and presentation of results. Figures of HCA would display which sample go to the wrong tree. To screen key genes from the selected genes, SIMCA-P (Umetrics AB, Umea, Sweden) was used to obtain variable importance in projection (VIP) values to reduce the monitoring cost and simplify operations. According to the user guide of SIMCA-p, VIP provides the influence of every item in the matrix X on all the Y's. Terms with VIP>1 have critical influence on the model.
After key genes were determined, DA was in coupled with PLS-DA by SIMCA-P to predict 25 unknown samples, which helped determine positive samples.
MS conditions: electrospray power supply; positive ion scan; multiple reaction monitoring (MRM); ionization voltage 3.2 kV; source temperature 110˚C; atomizing temperature 350˚C. All standards (> 98.0% purity) were purchased from Dr. Ehrenstofer GmbH Reagents Company (Germany). Monitoring ion pairs, collision energy, and declustering voltages were provided in S1 Table. 3. Results 3.1. Searching for accurate biomarkers by drug mechanisms coupled to RNA-sequencing 3.1.1. RNA integrity. Agilent 2100 Bioanalyzer (Agilent, CA, US) was used to determine RNA integrity. The results showed that the RIN values of samples for RNA-seq were between 8.0 and 8.5, indicating that the RNA was complete and met the RNA-sequencing requirements; RIN values of the other samples were between 7.5 and 8.5, revealing that the RNA integrity met qRT-PCR requirements.
3.1.2. RNA-sequencing. The total number of genes measured in the 21 st dosing day group is 686,092, and genes with significant expression differences from control is 39,256, accounting for 5.72%. The total number of genes measured in the third withdrawal day group is 632,279, and significantly different genes have a number of 11,275, accounting for 1.78%. The total number genes measured is 582,047 of the 21 st withdrawal day group, and genes with significant expression differences compared with control is 30,808, with a 5.29% proportion. The number of identical genes with significant expression differences from control are 11,269 for the first and second groups, 30,794 for the first and third groups, 11,181 for the second and third groups, and 3,808 for all the three groups. According to the selection criteria, the following genes were selected for qRT-PCR verification: phosphodiesterase 4C (PDE4C), hormonesensitive lipase (HSL), fatty acid binding protein 5 (FABP5), immunoglobulin lambda peptide (LOC102185492), myosin heavy chain 9 (MYH9), nebulin (NEB), cAMP response element modulation (CREM), and forkhead box O1 (FOXO1).
3.1.3. qRT-PCR for the screened results based on RNA-sequencing. qRT-PCR was performed on the 54 samples in the experimental and control groups to verify the eight genes screened by RNA-seq. The eight genes of the test group showed significant expression level differences from the control group, which validated the RNA-sequencing results. The qRT-PCR results indicated that two genes were obviously down-regulated while six genes were obviously up-regulated, which were consistent with the RNA-seq results. The p values and relative expression folds (x-fold) of all genes in the two groups verified by high-throughput RNA-seq and qRT-PCR were shown in Table 2.
To determine whether the selected eight genes showing can be used as potential biomarkers to differentiate treatment and control group, PCA was performed on the qRT-PCR results for all the samples selected on the 7 th , 14 th , and 21 st dosing days and 0, 1 st , 3 rd , 7 th , 14 th , and 21 st withdrawal days. The results were shown in Fig 1, where PC1 and PC2 were the obtained two principal components (PCs). Solid points represent samples in the control group, whereas hollow points represent samples in the test group. PC1 ranged from -1.88 to 0.46 for the test Table 2

Gene
X-fold value X-fold value P value X-fold value X-fold value P value X-fold value X-fold value P value samples, and ranged from -0.23 to 2.48 for the control samples, indicating that the genes selected by RNA-seq were able to differentiate samples in the two groups.

RNA-seq qRT-PCR qRT-PCR RNA-seq qRT-PCR qRT-PCR RNA-seq qRT-PCR qRT-PCR
To validate the PCA results, hierarchicalcluster analysis (HCA) was conducted for the same data. The results were shown in Fig 2, where "treatment" indicated test samples and "control" indicated control samples. All test samples were classified into a cluster and all control samples were classified into another, validating the PCA results.
The PCA and HCA results indicated that the group of genes selected by RNA-seq could be used to differentiate samples in the test and control groups and displayed an improvement on the results of nine genes selected by literatures. However, these genes in this study may not be used as potential biomarkers for monitoring drug treatment since genes are selected based on RNA-seq and biostatistical methods, further verification is required in other issues.

qRT-PCR for the screened results by sequencing in combination with drug mechanisms.
PCA was performed based on the eight genes selected by RNA-seq and nine genes selected from literatures. The results were shown in Fig 3, in which hollow points indicate experimental group samples, and solid points display control group samples. PC1 and PC2 are the obtained PCs. PC1 ranges from -0.51 to 2.68 for the control samples and -1.96 to 0.52 for the test samples. As shown in the figure, the two groups of samples are classified into separate clusters despite a few samples in the border areas. This demonstrated that the selected 17 genes could differentiate samples in the two groups, which slightly improved the test result using only nine genes selected from literatures.
The HCA results were shown in Fig 4. As shown in Fig 4, all samples in the control group are in the first cluster; most samples in the treatment group are in the second cluster although a few ones are deviated from this cluster. Compared with the nine genes selected from literatures, the results using 17 genes witnessed an obvious improvement.
The DA results were shown in Table 3, in which 96.3% of the samples were correctly classified. The value was higher than nine genes, which had an accuracy of 96.3% and 92.6% for the initial classification procedure and subsequent cross-verification procedure, respectively.
In summary, the PCA results of 17 genes were slightly improved over those of 9 genes. Based on the same data, HCA and DA results using 17 genes indicated a significant improvement when compared with 9 genes. Therefore, the results of 17 genes selected by a combination of targeted and non-targeted methods as biomarkers to identify animals treated with ractopamine were more satisfactory. This study revealed that the method established in the earlier literature [20] could be inhanced by adding eight genes selected by RNA-seq, which has a high throughput potential.

Screening of more concise genes from 17 genes
Addition of eight genes selected by RNA-seq could optimize the earlier method, but more biomarkers mean heavier workload and consumption of more money and materials, which is a drawback to the application and implementation of this method. To further optimize the method and search for best-performance biomarkers, we used VIP to screen modulating genes that have critical influences on the model.
To screen key genes from the selected genes, SIMCA-P was used to conduct VIP. The VIP value greater than 1 is considered greatly influential on the model, and the impact of genes with VIP<1 on the model is negligible. As shown in Fig 5, six genes were selected, including FOX, FABP5, NEB, ATP2A3, PDE, and ADRB2, all of which have a VIP value of greater than 1.
The PCA results for all the samples in the test and control groups using the preceding six genes were shown in Fig 6. PC1 and PC2 are the two PCs, and solid and hollow points represent the control and test group samples, respectively. As shown in the figure, PC1 ranged from -1.78 to 0.34 for the test group and -0.23 to 2.46 for the control group, which could be used to obviously differentiate the two groups, in concordant with the results in Fig 3. The PCA results indicated that the test results using the screened six genes were consistent with those based on the 17 genes, revealing that the six genes were the key factors and could be used to predict the model. To validate the PCA results, HCA was carried out on the same data, and the results were shown in Fig 7. In this figure, "treatment" and "control" represent the samples in the test and control groups, respectively. According to the figure, control samples are classified into a cluster, and test group samples are classified into another cluster except one outlier included in the control cluster. This result was in agreement with that in Fig 4. This result also corresponded to the PCA results, displaying a great potential of the VIP method in screening key biomarkers.
DA was further carried out on the same data, and the results were shown in Table 4. By using the screened six genes, 96.3% of the samples were correctly classified during the initial classification procedure as well as the subsequent cross-verification procedure. This result completely agreed with Table 3.
DA results were given as the same as the data obtained by using 17 genes, presenting the criticality of these six genes. It should be noted that 100% accuracy in discriminating the samples could not be realized even though both targeted and non-targeted methods were used. Therefore, this group of screened genes should be further optimized to find out reliable and sensitive biomarkers to solve the regulation problem in drug misuse.    Table 5. According to this table, 22 samples were classified into the control group with the prediction of non-drug-treated, and three     . With the addition concentration of 0.5 μg/kg to 2 μg/kg, the recovery rates were between 70% and 120%. Three positive samples were measured, and mapenterol, a type of β 2 -agonist, was found in the sample labeled b-24, with the concentration of 0.772 μg/kg (Y = 1309.9X + 480.18, R 2 = 0.99868).
In this part, unsupervised method DA and supervised method PLS-DA were used to predict drug treatment of 25 undetermined samples. Both methods determined the same three positive samples, which were labeled b-18, b-20 and b-24. The β 2 -agonist residue of was detected only in the sample labeled b-24, which might be caused by instrument sensitivity deficiency. Maybe the drug was completely decomposed but it still had some effects. There is another assumption that b-18 and b-20 samples do not contain β 2 -agonists but contain drugs that have similar physiological effects, which could present false-positive results.
Mapenterol was detected from a suspected sample. β 2 -agonists are structurally divided into aniline and phenol types. In this study, ractopamine is a phenol type, while mapenterol is an aniline type, indicating that this group of genes can be used to monitor β 2 -agonist drugs.
The instrument-aided verification result indicated that prediction of unknown drug-treated samples using the six genes screened by literatures, RNA-seq, and VIP process was correct. This group of genes can be used as biomarkers to monitor the abuse of β 2 -agonists, which is practicable and effective. However, to determine more stable and sensitive biomarkers at the transcriptional level, this group of genes needs further optimization. For example, animal samples treated with a longer period of drug withdrawal are needed, and samples of different tissues, species, and genders are required for the study. These will facilitate monitoring of β 2agonists misuse in animal husbandry.
In this study, six key genes were selected to monitor β 2 -receptor agonist misuse in animal husbandry as biomarkers. PCA, HCA, and DA results indicated that eight genes selected by high-throughput RNA-seq technique plus nine genes selected in earlier literatures collaboratively achieved better performance, demonstrating the accuracy of the selected genes by RNAseq. The results using six key genes selected from the 17 genes were consistent with the results of 17 genes, revealing that the six genes were the critical ones. The instrumental verification showed the correctness of the results predicted on the undetermined samples by the six genes, validating the precision and reliability of the six key genes. With both RNA-seq and earlier literatures, detection accuracy can be improved by finding key genes. Then effectiveness of the genes was verified, which imposed a great importance on effective monitoring of β 2 -agonist drugs abuse in animal husbandry.

Discussion
Transcriptomics is extensively studied for monitoring β 2 -agonists misuse in current animal husbandry, making it possible to determine the physiological functional impacts of the drugs at the molecular level [24]. Generally, gene expression quantity changes are measured by qRT-PCR, a highly sensitive method with a wide quantification range. Compared with chip techniques, this method is highly repetitive with a low cost, in addition to easy operations and analysis of multiple biological species of samples in one cycle. Therefore, qRT-PCR is widely used for the study of biomarkers in transcriptomics [25]. Compared with qRT-PCR, microarray technique allow analysis of a set of complete genes in an array [26,27]. However, the latest high-throughput RNA-seq method is more sensitive than the microarray approach, with a higher probability of finding more biomarkers.
In a previous research, a targeted -omics method, that is, screening according to the literatures, was used to screen target genes. After verification by qRT-PCR, statistical results showed that selected genes could be used as biomarkers to differentiate ractopamine-treated and untreated animals. To further optimize the method, this study used a non-targeted -omics technique, RNA-seq, to screen possible biomarkers in combination with earlier study by exploiting great potential of RNA-seq in searching for transcriptional biomarkers. This aim to find out optimal biomarkers and apply them in actual monitoring. As a result, the screened 17 genes brought in an improvement compared with previous nine genes, demonstrating the superiority of RNA-seq. The combination of targeted and non-targeted -omics techniques is a better method for searching for optimal biomarkers.
Biostatistical methods such as PCA, HCA and DA have a strong potential and are extremely important in screening potential biomarkers to differentiate experimental samples from control samples. Now, these methods have been widely used in screening biomarkers at various levels, and biomarkers selected based on VIP have also been extensively used in -omics studies [28][29][30][31].
Among the six genes, ATP2A3, PDE and ADRB2 are key factors for the β 2 -agonists signal pathway, and FOX, FABP5 and NEB are also associated with the physiological functions of β 2agonists, indicating that the changes in gene expression levels are specific to the treatment of β 2 -agonists. In the earlier research, candidate genes were selected based on the mechanisms of β 2 -agonists, but the animals in the test group were not treated with other β 2 -agonist drugs, and therefore the selected genes maynot be used as biomarkers to monitor β 2 -agonists except ractopamine. In this study, another type of β 2 -agonist mapenterol was detected in a potentially drug-treated sample after prediction of the six genes, demonstrating this group of genes could realize the monitoring of all β 2 -agonist drugs.
To determine more stable and sensitive biomarkers at the transcriptional level in the long run, this group of genes needs to be optimized. For example, animal samples treated with a longer period of drug withdrawal are needed, and samples of different tissues, species, and genders are needed. These will facilitate monitoring of β 2 -agonists misuse in animal husbandry.

Conclusion
In this study, RNA-seq was used as a non-targeted-omics technique to optimize the group of genes selected through a targeted -omics method. There were 17 genes with more information combining a targeted -omics method and a non-targeted -omics technique. PCA, HCA and DA results indicated that eight genes selected by high-throughput RNA-seq plus nine genes obtained from literatures collaboratively achieved better performance, demonstrating the accuracy of the RNA-seq-selected genes. Since 17 genes possessed better results than nine genes, more genes mean more work in actual monitoring. To further optimize the method, six genes carrying key information in the 17 genes with VIP>1 were extracted upon a certain algorithm (VIP value). The test results using the six key genes were consistent with the results of 17 genes, revealing that the six genes were the critical genes. The six key factors were then employed to predict whether drugs were used in 25 unknown samples through DA and PLS-DA. LC-MS/MS applied to verify the positive samples showed that the results predicted on the positive samples by the six genes were accurate and reliable. The selected six key genes possess equal accuracy and information with 17 genes mean less amount of operation in actual monitoring, imposing a great importance in effective and efficient monitoring of β 2 -agonist drugs abuse in animal husbandry.
Supporting information S1