Response of bitter and sweet Chenopodium quinoa varieties to cucumber mosaic virus: Transcriptome and small RNASeq perspective

Saponins are secondary metabolites with antiviral properties. Low saponin (sweet) varieties of quinoa (Chenopodium quinoa) have been developed because seeds high in saponins taste bitter. The aim of this study was to elucidate the role of saponin in resistance of quinoa to Cucumber mosaic virus (CMV). Differential gene expression was studied in time-series study of CMV infection. High-throughput transcriptome sequence data were obtained from 36 samples (3 varieties × +/- CMV × 1 or 4 days after inoculation × 3 replicates). Translation, lipid, nitrogen, amino acid metabolism, and mono- and sesquiterpenoid biosynthesis genes were upregulated in CMV infections. In ‘Red Head’ (bitter), CMV-induced systemic symptoms were concurrent with downregulation of a key saponin biosynthesis gene, TSARL1, four days after inoculation. In local lesion responses (sweet and semi-sweet), TSARL1 levels remained up-regulated. Known microRNAs (miRNA) (81) from 11 miR families and 876 predicted novel miRNAs were identified. Differentially expressed miRNA and short interfering RNA clusters (24nt) induced by CMV infection are predicted to target genomic and intergenic regions enriched in repetitive elements. This is the first report of integrated RNASeq and sRNASeq data in quinoa-virus interactions and provides comprehensive understanding of involved genes, non-coding regions, and biological pathways in virus resistance.


Introduction
Quinoa (Chenopodium quinoa Willd.) is a pseudocereal crop with high nutritional value [1,2]. Quinoa varieties are classified as bitter or sweet depending upon saponin content of the seed. Although saponins in quinoa are highly accumulated in the seed pericarp, they are also present in lower concentrations in leaves. Saponin biosynthesis in quinoa is regulated by the triterpene saponin biosynthesis activating regulator-like1 (TSARL1) gene [3][4][5]. Mechanical abrasion or water rinsing is typically used to remove saponin from the seed of bitter varieties [6,7]. Because these methods are costly and reduce nutritional value of the seeds [6,8], quinoa cultivars with low seed saponin content (sweet cultivars) are desired [9]. Saponins from quinoa are active against several plant pathogens including viruses and are pre-existing defense factors in other pathosystems [10][11][12][13][14]. analysis was incorporated to complement the transcriptome analysis by characterization of host components involved in viral gene silencing, virus-derived siRNAs (vsiRNAs), and validated known/novel miRNAs. These findings are of particular interest for molecular breeding of quinoa varieties to enhance resistance against virus.

Plant materials and virus inoculation
Seed from three Chenopodium quinoa varieties (Table 1 and  Madole was used as the propagation host. Leaves of seven-week-old quinoa plants were mechanically inoculated with CMV-infected tobacco leaves or mock-inoculated with phosphate buffer (pH:7.0). Inoculated leaves were harvested at 1 or4 days post infection (dpi). The experiment was designed as a 3×2×2×3 factorial [three varieties; two treatments (virus or mock); two sample times (1 or 4 dpi); three biological replicates]. Samples times were chosen because in preliminary studies, no symptoms were observed at 24 h and symptom development typically began by Day 4 (data not shown). Excised leaf samples were immediately frozen in liquid nitrogen and kept at -80˚C until RNA extraction. To verify successful inoculation, virus-and mock-inoculated tissues were tested by CMV AgriStrips (Bioreba, Reinach, Switzerland) according to manufacturer's instruction.

Total RNA extraction, library preparation, RNA and small RNA sequencing
Total RNA was extracted from individual samples by Direct-zol RNA MiniPrep Plus kit (Zymoresearch, Irvine, CA, USA) according to manufacturer's protocol. Concentration of DNA-digested RNAs was measured by NanoDrop™ One Microvolume UV-Vis Spectrophotometer (Thermo Scientific, Waltham, MA, USA). The cDNA library preparation for RNASeq

Bioinformatic pipelines
The pipeline used to analyze RNASeq data was followed as described previously [45]. In summary, the quality of obtained raw data was checked by FastQC v0.11.5 [46]. Adapters, low quality reads, and sequences shorter than 30 nucleotides were trimmed by Trimmomatic v0.38 [47]. High quality clean reads were mapped to C. quinoa reference transcripts by Salmon v0.12.0 [48] to obtain normalized transcripts per kilobase mapped million reads (TPM) values. Normalized transcript-level read counts were transformed to gene-level abundance in order to use in differential gene expression R package, DESeq2 [49]. The DEGs were determined in two modes: 1) with a full design model (time×variety×treatment) where DEGs were determined across quinoa varieties ('Jessie' as reference level) in time-series inoculation (1 dpi as reference level) between CMV-inoculated and mock-inoculated samples, and 2) individual variety with a time-specific treatment design (time+treatment+time:treatment), where DEGs were identified in each variety in time-series inoculation (4 dpi vs 1 dpi) between CMV-inoculated and mock-inoculated samples. The DEGs with adjusted p-value � 0.05 were considered significant.
To identify Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of DEGs, KOBAS v3.0 [50] was used. Significant enriched GO terms and KEGG pathways were chosen based on the corrected p-value � 0.05 obtained through Benjamini and Hochberg's method for controlling false discovery rate (FDR). The pipeline used to analyze sRNASeq data was followed as described previously [51]. Briefly, after inspecting raw reads quality by FastQC, adapters and low-quality bases were trimmed by Trimmomatic. Afterwards, tRNA, rRNA, snRNA, snoRNA retrieved from NCBI, Rfam, and RepBase databases were removed to obtain clean reads. In miRNA analysis, clean reads were mapped to C. quinoa reference genome by Bowtie2 [52] to recover quinoa sequences. Then, miRDeep2 [53] was employed to detect known miRNAs based on miRBase v22 database, and predict novel miRNAs based on the randfold with a p-value < 0.05. Read counts of known and novel miRNAs obtained by "quantifier" module of miRDeep2 were used in DESeq2 for detection of differential expressed miRNAs (DEmiRNA) based on the same design formula as DEG analysis. The DEmiRNA with adjusted p-value � 0.05 were considered significant and further analyzed by TargetFinder [54] to predict their target genes. Significant GO terms and KEGG pathways of target genes were obtained by KOBAS as described above. Characterization of endogenous siRNAs and vsiRNAs was carried out by alignment of miRNA-free clean reads to C. quinoa reference genome and three RNAs of CMV, respectively, using Bowtie2 with alignment options of very sensitive, zero mismatch, and 18 nucleotides as the minimum sequence size. The 5' base enrichment and nucleotide size distribution analyses were carried out by "reformat" module of BBMap [55] software. To prepare the read counts for differential expression of siRNA, miRNA-free clean reads were used as input in ShortStack 3.7 [56]; this was used to determine siRNA clusters accumulating in genomic loci. The parameters used in ShortStack were nohp mode, mismatch 0, dicermin 18, dicermax 30, DicerCall cut-off of 80%, and mincov 0.5. Differential expression of siRNAs was determined in two modes as the same as DEG analysis of RNASeq. DE-siRNA clusters from full design and variety-specific models were considered significant based on Benjamini Hochberg adjusted pvalue � 0.05. To predict putative genes, transposable elements (TEs), and transcription factor binding sites (TFBSs) as potential targets of DE-siRNA clusters, a window frame of 2 kb upstream of the gene was inspected. Targeted sequences were retrieved and searched using CENSOR [57] algorithm to find candidate TE from RepBase database. Also, the retrieved sequences were subjected to PlantRegMap [58] database against Arabidopsis thaliana, Beta vulgaris and Spinacia oleracea to perform an exhaustive search on transcription factor binding sites with a p-value cut-off of 1e-7. Finally, to annotate the biological function of the targeted gene, KOBAS was used to perform GO and KEGG pathway enrichment analyses. All coding used in the study is available upon request.

Validation of DEGs by RT-qPCR
To validate RNASeq results, nine significant DEGs were randomly selected to conduct quantitative RT-PCR on total RNA samples used for RNASeq experiment. To target each gene, two random samples were selected as template: one from mock-inoculated and one from virusinoculated sample. For each sample, two technical replicates were used. Primers were designed by Primer3 software (S1 Table). Total RNA was primed by random hexamer and reverse transcribed using SuperScript™ III Reverse Transcriptase (Invitrogen, Carlsbad, CA, USA) according to manufacturer's instruction. The cDNA synthesis was carried out on Eppendorf Mastercycler Nexus Thermal Cycler (Hamburg, Germany). The PCR reaction was conducted in a final volume of 10μl [1 μl cDNA, 0.4 μl of each primer (10 μM), 3.2 μl RNase/DNase-free water, and 5 μl PowerUp™ SYBR™ Green Master Mix (Applied Biosystems, USA)]. The amplification conditions were 2min at 50˚C for UDG activation, initial denaturation of 2min at 95˚C followed by 40 cycles of 95˚C for 15 s and 60˚C for 1min. The qPCR was performed by Quant-Studio™ 6 Flex Real-Time PCR machine (Applied Biosystems, Foster City, CA, USA). Elongation Factor 1α (CqEF1α)(XM_021888526.1) was considered as the reference gene for normalization of gene expression. Relative gene expression was calculated using the 2 −ΔΔCT method [59].

Relative expression of TSARL1 gene during virus infection
The expression of TSARL1 was relatively quantified upon time-course CMV infection among quinoa varieties. Three biological and two technical replicates was used in RT-qPCR. The primer design (S1 Table), cDNA synthesis and qPCR methods were performed as described above. The relative expression pattern of the gene was measured using 2 −ΔΔCT method based on the comparison of mock-and CMV-inoculated samples.

Validation of known and novel miRNA by stem-loop RT-PCR
To validate results of detected miRNAs, six known and nine novel miRNAs were randomly selected. Stem-loop primers for cDNA synthesis were designed by miRNA Primer Design Tool [60]. Stem-loop pulsed reverse transcription of each miRNA was according to Varkonyi-Gasic et al. [61] with few modifications. Briefly, 1-2 μg total RNA was added to 0.5 μl dNTP (10mM), 1 μl stem-loop primer (10μM) (S1 Table) and desired volume of RNase/DNase-free water to a total of 13.8 μl. The mixture was incubated for 5min at 65˚C followed by 3 min ice incubation. Subsequently, 4 μl of 5X first-strand reverse transcriptase (RT) buffer mixed with 2 μl DTT and 0.2 μl SuperScript™ III Reverse Transcriptase kit (Invitrogen, USA) (200U/μL) were gently added to yield the final volume of 20 μl. Tubes were loaded to thermal cycler for pulsed reverse transcription with initial incubation at 16˚C for 30 min, followed by 60 cycles of 30˚C for 30 s, 42˚C for 30 s and 50˚C for 1 s. Then, the samples were incubated at 85˚C for 5 min to inactivate RT enzyme. To prepare PCR reactions, 1 μl of cDNA was added to 0.4 μl of each primer (10 μM), 8.2 μl water, and 10 μl of Platinum™ II Hot-Start Green PCR Master Mix (2X). Loaded samples in thermocycler were initially denatured at 94˚C followed by 35-40 cycles of 94˚C for 15s, and 60˚C for 30sec. PCR products were resolved electrophoretically on ethidium bromide stained-2% agarose gel immersed in 1× TAE and visualized with Biorad Gel Doc XR instrument (Biorad, USA).

Symptomatology and verification of CMV infection of quinoa
A total of 36 quinoa samples comprised of three varieties (Table 1), two harvest time-points, and three biological replicates were either inoculated with CMV or inoculation buffer. By 4 days after inoculation with CMV, chlorotic local lesions developed across all replicates of 'QQ74' and 'Jessie', whereas all replicates of the bitter variety 'Red Head' had local systemic chlorosis that was not spread throughout the plant (Fig 1A, S2 Fig). Infection was confirmed in vivo by immunoassay and in silico, with all CMV genomic RNAs detected in RNASeq and small RNASeq datasets (Fig 1B). Control samples that underwent mock-inoculation did not have symptoms or signs of infection.

Transcriptome and Small RNASeq profile of quinoa
To determine molecular interactions and stress responsive mechanisms in quinoa during CMV infection, transcriptome and small RNA sequencing were performed. In transcriptome sequencing, a total of 1.182 billion raw reads was generated. After removing low quality reads and adapters, trimmed sequences (1.131 B) were mapped to C. quinoa reference transcripts [4] with an average successful alignment percentage rate of 90.17 (S2 Table). To eliminate the effect of sequence length and coverage depth of samples, mapped reads were normalized to TPM values for use in downstream analyses.
The small RNA sequencing of 36 samples resulted in 364.31 million raw reads. After the trimming step, the surviving reads (327.84 M) were mapped to C. quinoa reference genome with an average successful alignment percentage rate of 97.93 (S2 Table). Exclusion of transfer (tRNA), ribosomal (rRNA), small nuclear (snRNA), and small nucleolar (snoRNA) RNAs from mapped reads resulted in the remaining reads (198.11 M) that were further used for miRNA or siRNA analyses. All raw data are available from NCBI SRA database under Bioproject accessions PRJNA541979 and PRJNA541982.

DEG Identification and GO and KEGG pathway enrichment analyses
TPM-normalized gene counts were imported in DESeq2 package, and DEGs were determined between CMV-and mock-inoculated samples with an adjusted p-value cut-off � 0.05. In the first mode of analysis (full design) that encompassed the effect of variety, time and treatment and their interactions, a total of 250 genes were differentially expressed ( Table 2 and S3 Table). Principal component analysis (PCA) was used to compare gene expression profiles of the samples; samples with similar pattern were plotted close to each other while samples with different profiles scattered far from each other in the scores plot of the PCA (S3 Fig). PC1 explained 53% of variation and PC2 explained 12%. In the variety-specific analysis (individual variety design), DEGs were determined between CMV-inoculated and mock-inoculated samples by including time, treatment and time × treatment. In 'Jessie' a total of 332 genes, in 'QQ74' 85 genes, and in 'Red Head' 140 genes were differentially expressed ( Table 2).
To identify biological function of DEGs, GO term analysis was performed on DEGs using KOBAS with corrected p-value cut-off � 0.05 (Table 2). In the full design, significant up-regulated (UR) GO term candidates were endopeptidase regulator and oxidoreductase activities, and down-regulated (DR) GO terms were cell aging and hydrolase activity. In individual variety design, for 'Jessie' significant UR GO terms were response to oxygen-containing  compound and flavonoid biosynthetic process, whereas photoperiodism and response to stimulus were among candidate DR GO terms. In 'QQ74', cellulose catabolic process and telomere maintenance were significant UR GO terms, and telomere organization and beta-glucan catabolic process were DR GO terms. In 'Red Head', UR GO terms were chloroplast accumulation movement and response to stress, and DR GO terms were response to chemical and chlorophyll metabolic process (S4 Table). To identify the role of DEGs in primary and secondary biological pathways, KEGG enrichment analysis was performed with KOBAS (Table 2). With all the experimental factors in the statistical design, glycerophospholipid metabolism was among significant UR KEGG pathways; and pyrimidine metabolism was the only enriched DR KEGG pathway. Within each variety design, in 'Jessie', flavonoid biosynthesis was the most enriched UR KEGG pathways, and circadian rhythm was a significant DR KEGG pathway candidate. In 'QQ74', top enriched UR KEGG pathway was sulfur metabolism, whereas the most significant DR pathway was biosynthesis of secondary metabolites. In 'Red Head', top enriched UR KEGG pathway was circadian rhythm, and the most significant DR pathway was glycerolipid metabolism (S4 Table). Overall, among the enriched GO terms and KEGG pathways, modulation of hormonal signaling, plant-pathogen interaction (PPI), photosynthesis, carbon metabolism, and amino acid biosyntheses were shared at least once between varieties.

Differential rxpression of TSARL1 gene in quinoa upon CMV infection
To inspect the effect of local and systemic infection of CMV on expression of TSARL1 gene in sweet/semi-sweet and bitter quinoa varieties, mock-and virus-inoculated samples were compared by RT-qPCR. In sweet varieties, TSARL1 expression was consistently upregulated at 1 (early stage of infection) and 4 dpi (at symptom development). However, in the bitter variety gene expression was upregulated at 1 dpi but suppressed drastically at 4 dpi (Fig 2).

Detection, differential expression, and target gene analysis of known and novel miRNAs
To further characterize miRNAs from trimmed small RNASeq reads, sequences mapped to C. quinoa were separated from tRNA, rRNA, snRNA, and snoRNA. The resulting clean reads were used as input for miRDeep2 software. Known miRNAs were detected based on the mature and precursor sequences of miRBase v22. The total of 81 known miRNAs belonging to 11 miR families (miR 156, 160, 162, 166, 171, 172, 319, 393, 395, 398, and 399) were detected. The miR166 was the most abundant detected known mature miRNA among the quinoa samples. In addition, novel miRNAs were identified based on the randfold (p < 0.05) and miR-Deep2 score � 0. Prediction of novel miRNAs resulted in 876 significant novel mature miRNAs, among which novel miRNA "NW_018742204.1_181" was the most prevalent novel miRNA among quinoa samples (S5 Table). Determination of RISC factors involved in miRNA biogenesis is possible through nucleotide size distribution and 5' base enrichment analyses of miRNAs. The dominant size and the enriched 5' terminal base of known miRNAs were cleaved sequences with the size of 21 nucleotides and 5' Uracil (U) (S5 Table). To characterize differential expressed miRNA (DEmiRNA), read counts of known and novel miRNAs were used in DESeq2, and DEmiRNAs were determined between CMV-inoculated and mock-inoculated samples based on the threshold p � 0.05. In the full model that included the variety interaction as well, one DEmiRNA was UR and one was DR. Within the varieties, in 'Jessie' two UR-and four DR-miRNAs (including miR166, miR399) were detected, in 'QQ74' five DEmiRNAs (5 DR) were identified, and in 'Red Head' no DEmiRNA was detected. TargetFinder analysis of miRNAs detected a total of 126 candidate target genes that four (3 UR, 1 DR) belonged to the fullmodel design and 122 (19 UR, 103 DR) belonged to individual-variety design ( Table 3, S6  Table). To provide comprehensive function of DEmiRNAs, enrichment analysis of targeted genes was performed by KOBAS (adjusted p � 0.05). Overall enriched biological pathways were involved in cell recognition and water homeostasis. In 'Jessie', targeted genes were enriched in ether lipid/glycerophospholipid metabolism, ubiquitin-mediated proteolysis and endocytosis functions. In 'QQ74' and 'Red Head', no biological pathways were enriched (S6 Table).
To identify the complexes involved in formation of endogenous siRNA, nucleotide size distribution and 5' end base abundance of detected siRNAs are needed. Since siRNAs completely match their target sequences, mapping clean small RNA reads to C. quinoa reference genome allowing zero mismatch resulted in siRNA targets. These target sequences were used to identify dominant cleaved nucleotide size and 5' terminal base of endogenous siRNAs. The average alignment rate was 99.75%. The sequences with 24 nt were more abundant than other nucleotide sizes (Fig 3). Also, Adenine (A) was the predominant base at 5' end (S4 Fig). To enable differential expression of siRNAs, the ShortStack package was used to determine the counts of siRNA clusters varying in size (18-30 nt) over quinoa genome. A total of 160,533 siRNA clusters was retrieved per individual sample (S7 Table). The abundance of 24 nt siRNA sizes was higher (Fig 3); however, within this size number,clusters detected in 'Red Head', bitter variety, were significantly (p = 0.01) higher than in two sweet/semi-sweet varieties, 'Jessie' and 'QQ74' (S7 Table). Using DESeq2 on cluster counts of the samples by comparing CMV-inoculated and mock-inoculated tissues under two models, overall analysis and variety-specific analysis, revealed that the size of 24 nt was the most abundant differentially expressed Dicer cleaved siRNA. Among the total 377 differentially expressed siRNA clusters, 235 belonged to overall analysis, and 142 belonged to variety analysis ('Jessie': 100; 'QQ74': 39; 'Red Head': 3) (S8 Table). These 24nt hc-siRNA clusters were used for further analyses. Since hc-siRNAs target genes and intergenic regions such as TEs, gene promoter and TFBSs, multiple methods were exploited to predict candidate targets spanning 2kb of the targeted genes. Based on the overall or variety-specific analyses, targeted repetitive regions and biological functions of the targeted genes were annotated separately (S9 Table).

Characterization of virus-derived siRNAs
Complexes involved in RNAi leading to antiviral defense could be identified through mapping clean reads to CMV genome with the perfect match option. Therefore, the same alignment option as described above was followed to retrieve vsiRNA sequences. These sequences provide information about nucleotide size distribution and 5' end base enrichment of vsiRNA that have been cleaved by quinoa RNA silencing factors. Forty five percent of vsiRNA had a length of 21 nt, and 30% were 22 nt (Fig 4A). There was no consistent 5' terminal base among varieties and time points (Fig 4B, S5 Fig). Obtained vsiRNA sequences also provided information about three RNA segments of CMV from which they were derived. Sequences derived from viral RNA3 were more prevalent than signatures of RNA1 and RNA2 (Fig 5).

RT-qPCR validation of DEGs
RT-qPCR with specific primers was used to validate DEG patterns detected in silico from RNASeq for nine genes involved in RNA transport, membrane structure, lipid or protein   biosynthesis. The resultant fold changes between CMV-and mock-inoculated samples were compared to those of detected DEGs from DESeq2 output. All in vitro tested genes had the same expression patterns as in silico detected DEGs (Fig 6), which validates reliability of DEG analysis utilized in this study.

Validation of known and novel miRNA by stem-loop RT-PCR
To confirm identified known and novel miRNA in vitro, 6 randomly selected known miRNAs including miRs 156, 166b, 166m, 393, 395, 399, plus the 9 novel miRNAs were reverse transcribed using stem-loop primers, followed by PCR and gel visualization. All the tested known and predicted miRNAs were amplified as expected (Fig 7), indicating reliability of the obtained sRNASeq results.

Discussion
Transcriptional responses of different quinoa ecotypes during abiotic stresses [62,63] and transcriptome changes of herbaceous host, C. amaranticolor, in response to different viruses Several metabolic pathways including purine and pyrimidine metabolism, nitrogen processes, and terpenoid backbone were downregulated in virus-infected plants (1 dpi and 4 dpi combined).Purine catabolism is a vital housekeeping function of the plants for growth, development, nitrogen remobilization, and induction of defense-related hormone signaling [64,65]. Allantoin is a nitrogen-rich compound and an essential intermediate in purine catabolism that typically accumulates in stressed plants [65]. Allantoin induces jasmonic acid (JA) [64, 65], the defense signaling phytohormone that results in systemic resistance against viruses [66][67][68][69]. In this study, downregulation of allantoin and nitrogen compounds could have resulted from suppression of purine catabolic processes. These successive suppressions may have decreased JA signaling resulting in plant susceptibility to virus infection [66,69]. Based on retrieved vsiRNAs from sRNA analysis in our study, there were few sequences of 3' end of RNA2 encoding 2b protein of CMV. Because viral 2b protein inhibits cleavage activity of Argonaute [70], increased levels of this protein allow the virus to counter both plant defense through JA signaling repression and facilitates long distance movement of the virus. Purine and pyrimidine together as part of nucleotide metabolism of host are essential for virus replication. Downregulation of these two compounds could be an antiviral mechanism that deprives virus of nucleotides required for virus replication machinery [71]. Terpenoids are secondary metabolites with the anti-herbivory and defensive roles in plants [13]. Downregulation of DEG involved in terpenoid backbone biosynthesis was related to a chloroplastic terpenoid pathway (MEP) ( Table 4); this restricts biosynthesis of downstream terpenoid compounds and impairs the terpenoid-related defensive roles [72,73].
Upregulated DEGs were involved in lipid and nitrogen metabolism, translation, hormone signaling, protein and terpenoid biosynthesis. Nitrogen is one of the primary limiting factors for plant growth, and its assimilation is required for biosynthesis of amino acids glutamate, glutamine, asparagine, and aspartate [74,75]. CMV infection in tobacco leaves led to overexpression of two enzymes implicated in primary assimilation of nitrogen, glutamate dehydrogenase (GDH) and cytosolic glutamine synthetase (GS1) [76]. In our study, accumulation of nitrogen compounds likely led to induction of amino acid biosynthesis, thus resulting in Table 4 elevated levels of translation process and protein formation; the finding is also consistent with the study on Arabidopsis thaliana that reported impact of elevated level of nitrogen on increasing protein content [74]. Endoplasmic reticulum (ER), the replication site for viruses in the Bromoviridae family, is the site for formation of viral factories, vesicles that are required for virus replication [77,78]. In quinoa, following induction of DEGs in nitrogen and protein biosynthesis, amino acid transport to ER was also increased. Because amino acid transport is correlated to abundance of nitrogen in plant cells [75], elevated processes of protein retention in ER and translation might be due to viral demand for the protein biosynthesis required for its replication and assembly. Also, increased level of lipid metabolism might be the result of recruiting host cellular factors to synthesize glycerophospholipids, which is the main component of membraneous viral factories [79]. Phytohormones have several regulatory roles in plant development, growth, host-pathogen interaction, and plant defense mechanism [80]. Salicylic acid (SA) is one of the major defense-related hormones and plays a positive regulatory role against plant viruses. The SA interaction with brassinosteroid (BR) or ethylene (ET), as two other phytohormones, results in either synergistic or antagonistic immune responses. Because BR induces ET biosynthesis [81] and ET precedes the SA signaling pathway [82], the upregulation of both BR and ET DEGs in this study, that was concurrent with overexpression of SA-related DEGs, could have been a synergistic defense response against virus. Enzymatic reactions of synthases, reductases, kinases, etc. in the terpenoid backbone biosynthesis concatenates isoprene units to produce various classes of terpenoids. Monoterpenoids are biosynthesized through MEP pathway, whereas sesquiterpenes are produced in a cytosolic pathway (mevalonate [MVA]) [83]. Overexpression of DEGs pertaining to sesquiterpenes (Table 4) in quinoa might have resulted from increased activity of MVA pathway, which probably compensates the suppressed sources of chloroplastic-derived isoprene units. Although, MEP pathway was downregulated, induction of monoterpenoid DEGs might be from elevated isoprene blocks originated from MVA pathway. The cross-talk of MEP and MVA pathways have been reported in northern red oak as a response to ozone [84].

Full Design
In quinoa, expression of TSARL1 gene is correlated with saponin content [4]. In this study there was a concomitant decrease in expression of TSARL1 and the systemic spread of CMV in the bitter variety, but not in the two sweet/semi-sweet varieties. This negative correlation between virus spread and relative saponin content of quinoa seed is consistent with a previous study [72]. In that study, tomato yellow leaf curl China virus (TYLCCNV)-infected plants had decreased terpenoid expression, and this led to higher fitness of Bemisia tabaci, a whitefly vector of TYLCCNV. On the other hand, virus-free plants showed higher expression of terpenoids, which resulted in low level of insect infestation.
Analysis on the effect of CMV infection over time on separate quinoa varieties resulted in identification of DEGs that modulated variety-specific biological pathways, such as plant-pathogen interaction (PPI), DNA replication, mismatch and nucleotide excision repair, homologous recombination, and hormone signaling. The DEGs for PPI include defense-related mechanisms that involve reactive oxygen species (ROS) and result in hypersensitive response, cell programmed death, and induction of genes that are related to defense against viruses. All these responses modulate levels of defense hormones such as SA, JA, and ET [85]. In 'Jessie', a variety with CMV-induced local lesions, downregulation of PPI was concomitant with suppression of DEGs of ROS and defense hormones (SA, JA, ET), which implies that these defense responses were suppressed over time (upregulated at 1 dpi during the active defense response and then downregulated at 4 dpi). Although the fact that a local lesion variety has downregulated defense system seems counter-intuitive, plant reaction to viral inoculation occurs rapidly and processes that require upregulation can be completed in early stages of infection [86]. In 'Red Head', the variety with CMV-induced systemic symptoms, both PPI and ROS were upregulated, which implies elevation of these two stress responsive mechanisms over time. Therefore, in contrast to 'Red Head' (susceptible host) that had a consistent induced defense response, 'Jessie' (resistant host) had induction of defensive pathways at early stage of infection and then suppression of those defensive pathways at late stage of infection. Replication and repair processes of host genome is a critical and substantial antiviral mechanism [85]. In tobacco infected with TMV, upregulation of plant DNA homologous recombination resulted in persistent induction of DNA rearrangement and leucine-rich repeat (LRR) gene expression, which led to persistent and broad-spectrum resistance to TMV infection in tobacco [87,88]. In 'QQ74', a variety with few CMV-induced local lesions, PPI over time was overexpressed. This finding could be due to upregulation of DNA replication, mismatch and nucleotide excision repair, and homologous recombination that provided prolonged and persistent defense responses such as PPI even at 4 dpi. The DEGs involved in hormone signaling were altered in 'Jessie' and 'Red Head' but not 'QQ74'. Reprogramming of auxin biosynthesis and/or signaling through altering its function and subcellular localization by several plant viruses including CMV has been shown to be beneficial for virus movement and replication, as well as enhanced infection and disease symptoms [80,89]. Elevated levels of auxin repressed the hypersensitive response (HR) resulting from SA signaling pathway [80]. In this study, 'Jessie' and 'Red Head' displayed induction of genes of auxin signaling over time in CMV infection, suggesting suppression of systemic resistance in both varieties via disruption of SA-mediated defense responses. It has been reported that in cucumber infected with CMV, ET is a determinant for viral-induced symptoms [90]. In 'Red Head', overexpression of ET-related genes at 4 dpi could be responsible for appearance of CMV-induced systemic symptoms. Abscisic acid (ABA), a phytohormone involved in developmental processes, such as seed germination, fruit ripening, and responses to abiotic stresses [80], is antagonist to SA/JA-dependent defense pathway, ROS production and HR, and therefore represses systemic resistance against viruses [80,91]. Furthermore, ABA induces callus deposition in plasmodesmata, which limits virus intercellular movement, and also elevates AGO1 activity implicated in gene silencing [80,91]. In this study, in 'Red Head', induction of ABA responding genes at 4 dpi may have played a dual role against CMV infection: 1) utilization of ABA by virus to negate host systemic resistance, and 2) induction of ABA by host to repress virus through limiting viral movement or silencing viral genes via RNAi machinery. In tobacco, cytokinin (CK) accumulation led to enhanced viral resistance through SA-mediated defense responses [92]. In 'Jessie' and 'Red Head', downregulation of CK-related genes over time was likely related to suppression of systemic resistance through SA signaling pathway. In 'Jessie' this might be a host strategy to bring hormonal levels to preinfection levels, because the virus has already been deactivated in the adjacent cells due to appeared local lesions [93]. Conversely, in 'Red Head', suppression of CK-related genes was probably governed by CMV, because of the systemic infection of the virus that led to continuous modulation of defense hormone signaling over time.
The primary plant-derived small RNAs (miRNA and siRNA) are necessary regulatory factors for modification of endogenous or exogenous gene expression through either mRNA cleavage or translation inhibition. RISC factors such as DCL and AGO have crucial roles in miRNA and siRNA biogenesis. Since, in other plants DCL and AGO factors have been identified based on cleaved nucleotide size and 5' terminal base of sRNAs [24,27,28], in this study we conclude that miRNA sizes of 21 nucleotides were cleaved by DCL1, and the single stranded RNAs with 5'U are loaded on AGO1 to guide the RISC to regulate the target genes. In Arabidopsis, 24nt siRNAs were cleaved by DCL3 and loaded on AGO4/6/9 to inactivate genomic and intergenic regions through sequence methylation and histone modification. RDR2 is also recruited in the process to generate dsRNA to enhance siRNA biogenesis [24,30,31]. Therefore, in quinoa, the dominant nucleotide size of 24nt, which are known as hc-siRNAs and had the 5'A bias, were likely generated by association of DCL3, AGO4/6/9, and RDR2. Also, in our analysis presence of endogenous siRNAs with different sizes (21-27 nt) could be attributed to the function of RDR2.
Ten out of eleven known miRNA families identified in this study were also reported at least once elsewhere from C. quinoa [4,94]; miR395 was reported only from this study. This indicated conserved structure of the detected miRNAs between this study and the others, and reliability of our in silico miRNA analyses. Identification of known and novel miRNAs in quinoa [4,94] and prediction of their target genes has already been reported [94]; however, screening of DEmiRNAs in quinoa between CMV-and mock-inoculated samples had not yet been investigated until this study. DE analysis detected a total of 13 DEmiRNAs from quinoa varieties, which overall targeted 126 candidate genes ( Table 3, S6 Table). As miRNA targets 3' UTR of the target genes for degradation by RISC [95], there are several lines of evidence confirming that there is a strong negative correlation between miRNA expression and regulation of their targeted mRNA [96][97][98][99]. In 'Jessie', downregulation of novel miRNA "NW_018743850.1_23708" derepressed expression of ABA receptor genes, which are known to modulate ABA levels in plants [100]. Increased ABA may result in diverse defense responses: 1) elevated callose deposition in plasmodesmata to confine and eliminate the virus, and/or 2) a viral strategy to inhibit host systemic resistance via disruption of SA/JA signaling pathway. The role of miR399 in relationship with lipid metabolism have been reported in maize [101]. In 'Jessie', suppression of miR399 and miR166, and induction of novel miRNA "NW_018744092.1_28071", might lead to upregulation of target genes involved in lipid biosynthesis, and one may infer that the virus regulated three miRNAs to modulate host lipid biosynthesis for its own multiplication. Another role of miR399 is to regulate ubiquitinconjugating E2 enzyme that target proteins for degradation [102]. In 'Jessie', downregulation of miR399 may induce: 1) expression of ubiquitin enzyme that increases degradation of viral proteins, and 2) endocytosis, which is a host mechanism to remove ligands, nutrients, lipids and proteins from the cell.
The complexity in siRNA classification during host-virus interaction at genome-wide levels needs development of particular analysis. Using ShortStack facilitated identification of siRNAs based upon 1) software development according to plant genomes, and 2) prediction of de novo regions over the genome with accumulated siRNAs named "clusters". In the Arabidopsis-root knot nematode interaction, abundance of 23 and 24 siRNAs is higher in galled samples than in controls and was attributed as a gall characteristic [29,34]. In this study, however, significant higher accumulation of 24nt siRNA clusters among varieties ('Red Head', bitter variety had more siRNA than other two varieties), may suggest that varied abundance of siRNA clusters in quinoa was a variety-specific incidence. Since hc-siRNA were the most abundant cleaved size among quinoa siRNA clusters and are associated with silencing of gene and intergenic repeats through DNA methylation and histone modification, differential expression analysis focused only on hc-siRNAs. Similar to miRNAs that have negative correlation with expression of their targets, siRNA expression also influences their targets inversely [29,103]. Thus, differentially expressed clusters of hc-siRNAs in quinoa either in overall or variety-specific modes impact their putative genomic regions including genes, TFBSs, and TEs in a negative manner. Based on target analyses of differentially expressed hc-siRNA in quinoa, intergenic areas and repetitive elements were more regulated than coding sequences; this is consistent with the fact that hc-siRNA mostly target promoter regions [29]. The differentially expressed hc-siRNAs identified in this study may be good resistance candidates during virus infection, however, further functional studies should be conducted to verify their expression via RNA-directed DNA methylation, and their potential role in quinoa-virus interaction.
During CMV infection in Arabidopsis, DCL4 produced 21nt vsiRNA that has been implicated in viral gene silencing. DCL4 mutants had a higher abundance of 22nt vsiRNA, indicating recruitment of DCL2 to compensate DCL4 impairment [25,28]. Both DCL4 and DCL2 may also be involved in vsiRNA biogenesis in quinoa since 21 and 22 nt are the predominant sizes (Fig 4; S5 Fig). More vsiRNAs mapped to RNA3 than RNA1 or RNA2 (Fig 5), perhaps due to the fact that RNA3 is the most abundant RNA in infected cells. RNA3 also has shorter sequence length and higher replication rate than RNA1 and RNA2 [104,105]. Since DCL2 and DCL4 preferentially cleave GC-rich template regions [106] and GC content of CMV RNAs is similar (RNA1: 46.5%; RNA2: 45.9%; and RNA3: 46.9%), this eliminated GC content as the reason for uneven distribution of vsiRNAs across CMV genome. In CMV-infected Arabidopsis and Nicotiana, RDR1 has the primary role in vsiRNA formation with a 5' selection bias for three viral genomic RNAs. In RDR1 mutants there was an increased production of vsiRNAs from the 3' half of the genomic RNAs, particularly in RNA3, which appeared to be RDR6-dependent [105,107]. Thus, in our study, variability of 5' bases ( Fig 4B) and higher number of vsiRNAs mapped to RNA3 (Fig 5) may have been due to suppressed activity of RDR1 and coexpression of RDR6 during vsiRNA formation.

Conclusions
Altogether, we concluded that CMV infection of sweet/semi-sweet quinoa varieties with different resulted in variable virus-induced local or systemic symptoms. Suppression of TSARL1 gene in the bitter variety was concurrent with systemic movement of the virus in the leaves. Integration of high-throughput transcriptome and sRNASeq of quinoa varieties during timecourse CMV infection provided the knowledge on 1) general and individual variety responses such as perturbation of translation, lipid and nitrogen metabolism as well as plant-pathogen interaction, hormonal signaling, DNA and mismatch repair processes, and 2) differentially expressed miRNAs and siRNAs and their putative targets. Our findings will enhance knowledge on 1) genes and biological pathways involved in tolerance of quinoa varieties to viral infection, and 2) molecular characteristics of regulatory miRNAs, endogenous and exogenous siRNAs. However, functional studies should be conducted to confirm the impact of regulatory and endogenous sRNA in quinoa as well as their role in quinoa-virus interaction. Taken together, information provided in this study will be helpful in development of resistant quinoa varieties against virus infection through molecular breeding.