Full genome characterization of 12 citrus tatter leaf virus isolates for the development of a detection assay

Citrus tatter leaf virus (CTLV) threatens citrus production worldwide because it induces bud-union crease on the commercially important Citrange (Poncirus trifoliata × Citrus sinensis) rootstocks. However, little is known about its genomic diversity and how such diversity may influence virus detection. In this study, full-length genome sequences of 12 CTLV isolates from different geographical areas, intercepted and maintained for the past 60 years at the Citrus Clonal Protection Program (CCPP), University of California, Riverside, were characterized using next generation sequencing. Genome structure and sequence for all CTLV isolates were similar to Apple stem grooving virus (ASGV), the type species of Capillovirus genus of the Betaflexiviridae family. Phylogenetic analysis highlighted CTLV’s point of origin in Asia, the virus spillover to different plant species and the bottleneck event of its introduction in the United States of America (USA). A reverse transcription quantitative polymerase chain reaction assay was designed at the most conserved genome area between the coat protein and the 3’-untranslated region (UTR), as identified by the full genome analysis. The assay was validated with different parameters (e.g. specificity, sensitivity, transferability and robustness) using multiple CTLV isolates from various citrus growing regions and it was compared with other published assays. This study proposes that in the era of powerful affordable sequencing platforms the presented approach of systematic full-genome sequence analysis of multiple virus isolates, and not only a small genome area of a small number of isolates, becomes a guideline for the design and validation of molecular virus detection assays, especially for use in high value germplasm programs.

Although CTLV was first discovered in citrus, it has been reported to infect a wide range of herbaceous hosts, many of which remain symptomless [13]. Most CTLV infected commercial citrus varieties also remain asymptomatic except when CTLV infected budwood is propagated onto trifoliate orange (P. trifoliata (L.) Raf.) or trifoliate hybrid citrange (P. trifoliata × C. sinensis) rootstocks [2,20]. The resulting citrus trees are stunted, display chlorotic leaves, and show bud union incompatibility, leading to the ultimate decline of the tree [10,21]. This poses a serious problem because trifoliate and trifoliate hybrid rootstocks are widely used in all citrus producing areas of the world for their tolerance to citrus tristeza virus and Phytophthora species in addition to many other desirable horticultural characteristics (e.g. freeze tolerance, good yield and fruit quality) [22][23][24].
The numerous asymptomatic citrus and non-citrus hosts, in combination with the destructive potential of the virus for trees propagated on commercially important rootstocks, make CTLV a serious threat to the citrus industry [17,20,21,25]. Reliable pathogen detection assays for the production, maintenance, and distribution of pathogen-tested propagative materials by citrus germplasm and certification programs are the basis for any successful mitigation effort against viral threats, including CTLV [26][27][28][29][30][31]. Bioindicators for CTLV indexing, such as Citrus excelsa and Rusk citrange, which display symptoms of deformed young leaves under controlled greenhouse conditions, provide a reliable diagnostic tool [6]. ASGV antiserum was used both in enzyme-linked immunosorbent assay and in immunocapture RT-PCR for CTLV detection [32]. A series of conventional reverse-transcription polymerase chain reaction (RT-PCR) based methods have been developed for CTLV, including two-step multiplex assays [33,34] and a one-step RT-PCR assay with a semi-nested variation [28]. More recently, reverse transcription quantitative PCR (RT-qPCR) assays were developed for CTLV detection using SYBR® Green [35] and fluorescent probe platforms [25].
At the time that Liu et al. (2011) published their assay, only four full-genome CTLV sequences had been deposited in GenBank [35]. Cowell et al. (2017) reported that their RT-qPCR assay was designed based on the seven full-genome sequences available in GenBank at the time [25]. Today, a total of 12 full-genome sequences are available in GenBank [2,36,37]. Due to the limited number of CTLV full-genome sequences, very little is known about the phylogenetic relationships and genomic diversity of the virus and how such diversity may influence its detection. Next generation sequencing (NGS) technologies combined with bioinformatics have proven to be powerful tools for the assembly of full-genome virus sequences [38][39][40], and the guidelines for the design and validation of real-time qPCR methods are well established [41,42]. The purpose of this study was to characterize CTLV and to develop a robust CTLV RT-qPCR detection assay based on the systematic analysis of newly generated full-length genome data from multiple virus isolates maintained for the past 60 years at the CCPP.

Virus isolates and RNA extraction for full-length genome sequencing
Twelve CTLV isolates from various citrus variety introductions, originating from different geographical locations, were intercepted and maintained in planta under quarantine at the CCPP disease collection between 1958 and 2014 (Table 1). Sweet orange (C. sinensis (L.) Osbeck) seedlings were graft-inoculated with the different CTLV isolates and total RNA was extracted from phloem-rich bark tissues of the last matured vegetative flush (i.e. one-year-old budwood) using TRIzol® reagent (Invitrogen, Carlsbad, California, USA) per manufacturer's instructions. The purity and concentration of the RNA were tested using a Nanodrop spectrophotometer and Agilent 2100 Bioanalyzer per manufacturer's instructions.

NGS library preparation and bioinformatics
CTLV RNA libraries were constructed using 4 μg of total RNA with TruSeq Stranded mRNA Library Prep Kit (Illumina, San Diego, California, USA) per manufacturer's instructions. The RNA libraries were sequenced on an Illumina HiSeq 2500 instrument with high-output mode and single-end 50 or 100 base pairs (bp) at SeqMatic LLC (Fremont, California, USA). All sequencing data was generated by SeqMatic using an Illumina Genome Analyzer IIx and filtered through the default parameters of the Illumina QC pipeline and demultiplexed. The files were uploaded onto the VirFind bioinformatics server and mapped to the reference genome by Bowtie 2, followed by outputting mapped and unmapped contig sequences [43]. Unmapped sequences were de novo assembled by Trinity [43]. Assembled contigs were analyzed through BLASTn with an E-value cutoff of 10⁻² against all virus sequences in GenBank, and outputs of reads and a report for virus sequences were generated.
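The E-value filtering step can be illustrated with a minimal sketch. It assumes that BLASTn hits are available in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score); the source does not state which output format the VirFind server uses, and the contig and subject names below are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    pident: float
    evalue: float


def filter_hits(lines: Iterable[str], max_evalue: float = 1e-2) -> List[Hit]:
    """Keep tabular BLASTn hits whose E-value is at or below the cutoff."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Column 3 is percent identity, column 11 is the E-value (1-based).
        hit = Hit(fields[0], fields[1], float(fields[2]), float(fields[10]))
        if hit.evalue <= max_evalue:
            kept.append(hit)
    return kept


# Hypothetical example rows, not data from this study.
example = [
    "contig_1\tvirus_A\t97.5\t1200\t30\t0\t1\t1200\t1\t1200\t0.0\t2100",
    "contig_2\tvirus_B\t81.0\t60\t11\t0\t1\t60\t5\t64\t0.5\t38.2",
]
hits = filter_hits(example)
print(hits)
```

Only the first hypothetical contig passes the 10⁻² cutoff; the second is discarded because its E-value (0.5) exceeds it.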

Rapid amplification of cDNA ends of viral RNA
The 5' and 3' end sequences were obtained via rapid amplification of cDNA ends (RACE). The 5' end sequence of each CTLV isolate was confirmed using the FirstChoice® RLM-RACE Kit (Thermo Fisher Scientific, Carlsbad, California, USA). As per manufacturer's instructions, first-strand cDNA was synthesized, followed by nested PCR with the primer sets listed in S1 Table. To confirm the 3' end sequence of each CTLV isolate, first-strand cDNA was synthesized using SuperScript® II transcriptase (Thermo Fisher Scientific, Carlsbad, California, USA) with an oligo dT 16mer, and PCR was then performed using the Platinum® Taq DNA Polymerase High Fidelity Kit (Thermo Fisher Scientific, Carlsbad, California, USA) with the oligo dT 16mer and a CTLV gene specific primer (S1 Table). The PCR product that contained either the 5' or 3' end was ligated into pGEM®-T Easy Vector Systems (Promega, Madison, Wisconsin, USA) per manufacturer's instructions and sequenced using both T7 (5'-TAATACGACTCACTATAGGG-3') and SP6 (5'-ATTTAGGTGACACTATAG-3') primers. Together with the contigs containing CTLV sequences from NGS, the sequence data were then analyzed and assembled into a consensus full-length genome using Vector NTI Advance™11 software (Thermo Fisher Scientific, Carlsbad, California, USA).

Phylogenetic and genomic identity analysis of full-length virus sequences
Phylogenetic analysis was performed using the Molecular Evolutionary Genetics Analysis tool (MEGA version 7.0.21) [44]. ClustalW was used to align the 12 newly generated CTLV full-length cDNA sequences with the capilloviruses: CTLV, ASGV, pear black necrotic leaf spot virus (PBNLSV; a strain of ASGV), and cherry virus A (CVA) for which full genome sequences were available in GenBank (Table 2). Phylogenetic topologies were reconstructed using three different methods (neighbor-joining, maximum likelihood and minimum evolution) and tested with 1,000 bootstrap replicates. All phylogenetic methods gave similar results and the neighbor-joining tree was presented in this study. Nucleotide (nt) sequence identity percentages were calculated for CTLV complete or partial genomes using the pairwise sequence identity and similarity tool of a web-based analysis program (http://imed.med.ucm.es/Tools/sias.html).
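As an illustration of what a pairwise nucleotide identity percentage means, the following minimal sketch computes identity between two pre-aligned sequences. It assumes one common convention (identical positions divided by the number of columns in which neither sequence has a gap); the web tool used in this study offers several denominators, and the source does not state which one was applied. The toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over gap-free columns of two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not sequences from this study.
toy = {
    "iso1": "ATGCGTAC-A",
    "iso2": "ATGCGTTCGA",
    "iso3": "ATGAGTTCGA",
}
names = sorted(toy)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        value = percent_identity(toy[first], toy[second])
        print(f"{first} vs {second}: {value:.2f}%")
```

Applying the function to every pair of isolates yields an identity matrix of the kind summarized in the identity tables.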

Citrus sample processing and RNA extraction for RT-qPCR detection of CTLV
To account for the possible uneven distribution of the virus within a plant, budwood samples from four to six different branches around the tree canopy were randomly collected and combined in a single sample. Samples from the citrus trees' phloem-rich bark of matured budwood (approximately 12 to 18 months old) were collected and processed by freeze-drying and grinding as described by Osman et al. 2017 [45]. Total RNA was extracted from the ground sample using MagMAX™ Express-96 (Thermo Fisher Scientific, Carlsbad, California, USA) along with a modified 5X MagMAX™-96 Viral RNA Isolation Kit optimized for citrus tissues [45]. Total RNA was eluted in 100 μL elution buffer and used as template for RT-qPCR.

RT-qPCR assay design
For the specific detection of CTLV in citrus tissues, an RT-qPCR assay was designed based on a sequence conservation alignment of a total of 28 full genome sequences: 23 sequences of CTLV (12 generated in this study and 11 from GenBank) and five GenBank sequences of ASGV isolated from citrus and kumquat, a citrus relative (S1 Fig). Primers and probe were designed using the Primer Express™ software (Thermo Fisher Scientific, Carlsbad, California, USA). Following the guidelines for designing RT-qPCR assays, a 58˚C optimum melting temperature for primers and a 10˚C increase for qPCR probes were used to prevent the formation of primer dimers (Table 3). The fluorophore used for the CTLV probe was 6-carboxyfluorescein (FAM) and the 3' quencher was Black Hole Quencher (BHQ). The homology of the primers and qPCR probe was confirmed by a BLAST search against the GenBank database. The RT-qPCR reaction (12 μL total volume) was performed using the AgPath-ID™ One-Step RT-PCR Kit (Thermo Fisher Scientific, Carlsbad, California, USA) with 2.65 μL water, 6.25 μL 2X RT buffer, 0.6 μL primer probe mix (417 nM for primers and 83 nM for probe as final concentrations), 0.5 μL 25X RT mix and 2 μL of RNA for each reaction. The cycling conditions were 45˚C for 10 minutes and 95˚C for 10 minutes during the first cycle, followed by 40 cycles of 95˚C for 15 seconds and 60˚C for 45 seconds. Samples were analyzed using the Applied Biosystems™ 7900HT Fast Real-Time PCR System and the Applied Biosystems™ QuantStudio 12K Flex Real-Time PCR System (Thermo Fisher Scientific, Carlsbad, California, USA). Fluorescent signals were collected during the amplification cycle and the quantification cycle (Cq) was calculated and exported with a threshold of 0.2 and a baseline of 3-15 for the targets of interest. The Cq was calculated by the qPCR machine using an algorithm with a set range of cycles at which the first detectable significant increase in fluorescence occurs.
RNA and reaction integrity were assessed using the qPCR assay targeting cytochrome oxidase (COX) gene in the citrus genome as the internal control [27].
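The reaction composition described above can be checked and scaled with a short sketch. The per-reaction volumes are taken from the text; the 10% pipetting overage and the back-calculated stock concentrations of the primer probe mix are assumptions made for illustration and are not stated in the study.

```python
# Per-reaction volumes (μL) as stated in the text, excluding RNA template.
PER_REACTION_UL = {
    "water": 2.65,
    "2X RT buffer": 6.25,
    "primer probe mix": 0.6,
    "25X RT mix": 0.5,
}
TEMPLATE_UL = 2.0  # RNA added separately to each well


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale reagent volumes for n reactions plus a fractional overage.

    The 10% default overage is an assumed lab convention, not from the study.
    """
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}


def implied_stock_nM(final_nM: float, mix_ul: float = 0.6,
                     total_ul: float = 12.0) -> float:
    """Stock concentration implied by C1 * V1 = C2 * V2 (an inference)."""
    return final_nM * total_ul / mix_ul


total_ul = sum(PER_REACTION_UL.values()) + TEMPLATE_UL
print(f"total reaction volume: {total_ul:.2f} uL")
print("master mix for 10 reactions:", master_mix(10))
print(f"implied primer stock: {implied_stock_nM(417):.0f} nM")
print(f"implied probe stock: {implied_stock_nM(83):.0f} nM")
```

The listed components add up to the stated 12 μL total volume, which is a quick consistency check worth running whenever a protocol is transcribed.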

RT-qPCR assay validation
The newly designed CTLV RT-qPCR assay was validated using applicable parameters proposed in the "Guidelines for validation of qualitative real-time PCR methods" [41]. Applicability, practicability and transferability were evaluated by deploying the assay at two different laboratories, University of California (UC) Riverside-CCPP and UC Davis-Real-Time PCR Research & Diagnostic Core Facility. The robustness of the assay was evaluated with deviation in annealing temperatures (±2˚C), reaction volumes (±2 μL), and different qPCR instruments (CFX96 Real-Time PCR Detection System, Bio-Rad, Hercules, CA), and master mixes (iTaq™ Universal Probes One-Step Kit, Bio-Rad, Hercules, CA) to optimize the assay. The specificity of the assay was evaluated both in silico and experimentally, using a variety of citrus samples with known CTLV infection status from broad geographical origins and isolation times. All virus isolates exotic to California were received as nucleic acids under the auspices of the United States Department of Agriculture (USDA) Animal and Plant Health Inspection Service (APHIS) Plant Protection and Quarantine (PPQ) permits P526P-18-04608 and P526P-18-04609. Cross-reactivity was assessed using RNA of different non-inoculated citrus species and varieties and RNA from citrus inoculated with other non-targeted graft-transmissible pathogens of citrus.
The sensitivity (absolute limit of detection, LOD6) and the amount of CTLV in samples were determined by generating an absolute standard curve to calculate the starting number of copies. More specifically, amplicons for CTLV were obtained for each primer set (i.e. F1, 2, and 3 with R) and individually cloned into plasmids (Eurofins MWG Operon, Huntsville, Alabama, USA) (Table 3). The extracted plasmid DNA was linearized using HindIII enzyme to increase the efficiency of dilutions. Serial 10-fold dilutions of plasmids carrying a known copy number of CTLV inserts were made to construct a DNA standard curve. The standard curves for CTLV were run in a singleplex RT-qPCR setting utilizing the 6-carboxyfluorescein (FAM) fluorophore. Reactions were performed in triplicate to establish the linear response between the Cq values and the log of known copy numbers. The copy numbers for each sample were calculated as described [46]. The slope of the standard curve and the coefficient of determination (R²) were calculated using linear regression [47]. Amplification efficiency (E) was calculated with the formula E = 10^(−1/slope) − 1 [48,49]. The intra-assay and inter-assay variations were determined as the percentage coefficient of variation (CV %), which was calculated for each sample as follows: mean of the standard deviations of the duplicates divided by the grand mean of the duplicates × 100.
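The standard-curve calculations described above (slope, R², amplification efficiency, copy number interpolation and CV %) can be sketched as follows. This is a minimal illustration using ordinary least squares from the Python standard library; the Cq values are synthetic numbers generated from an assumed slope and intercept, not measurements from this study, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def linear_fit(x, y):
    """Ordinary least-squares fit; returns slope, intercept and R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


def efficiency(slope):
    """Amplification efficiency, E = 10^(-1/slope) - 1."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq, slope, intercept):
    """Interpolate starting copies from a Cq value on the standard curve."""
    return 10 ** ((cq - intercept) / slope)


def cv_percent(replicate_sets):
    """Mean of replicate standard deviations / grand mean of replicates x 100."""
    sds = [stdev(r) for r in replicate_sets]
    grand_mean = mean([v for r in replicate_sets for v in r])
    return 100.0 * mean(sds) / grand_mean


# Synthetic standard curve: log10 copies vs. Cq (assumed slope and intercept).
log_copies = [5, 4, 3, 2, 1]
cq_values = [38.0 - 3.32 * v for v in log_copies]

slope, intercept, r2 = linear_fit(log_copies, cq_values)
print(f"slope={slope:.3f}  R2={r2:.4f}  E={100 * efficiency(slope):.1f}%")
print(f"copies at Cq 28.04: {copies_from_cq(28.04, slope, intercept):.0f}")
print(f"CV%: {cv_percent([[20.0, 20.2], [25.0, 25.2]]):.2f}")
```

A slope near −3.32 corresponds to an efficiency near 100%, that is, a doubling of product in each cycle.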

Comparison of CTLV RT-qPCR detection assay with previously published assays
The newly developed CTLV detection assay was compared to two recently published RT-qPCR assays. Twenty-two samples from different CTLV isolates and 25 known CTLV negative samples were tested with the SYBR® Green-based RT-qPCR assay by Liu et al. 2011 [35] and the probe-based RT-qPCR assay by Cowell et al. 2017 [25], following the protocols described in each study. Based on the principle that a well performing diagnostic test correctly identifies the diseased individuals in a population, a series of statistical measurements, as reviewed by Bewick et al. 2004 [50], were used to compare the performance of the three RT-qPCR CTLV detection assays. An assay is performing well when sensitivity (Sn) = true positives / (true positives + false negatives) and specificity (Sp) = true negatives / (true negatives + false positives) approach 100%. A high positive likelihood ratio (LR+) = sensitivity / (1 − specificity) and a low (close to zero) negative likelihood ratio (LR−) = (1 − sensitivity) / specificity also indicate a well performing diagnostic test. Finally, Youden's index (J) = sensitivity + specificity − 1 can attain the maximum value of 1 when the diagnostic test is perfect and the minimum value of zero when the test has no diagnostic value [50].
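The performance measurements defined above can be computed directly from a two-by-two classification table. The sketch below is illustrative only: the counts in the example (15 of 22 positives detected, 25 negatives tested) are those reported in this study for the probe-based assay of Cowell et al., while the absence of false positives among the 25 negatives is an assumption made here, consistent with only Sn and J being reported as suboptimal.

```python
import math


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, likelihood ratios and Youden's index."""
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    # LR+ is undefined (unbounded) when specificity is exactly 1.
    lr_pos = sn / (1 - sp) if sp < 1 else math.inf
    # LR- is undefined (unbounded) when specificity is exactly 0.
    lr_neg = (1 - sn) / sp if sp > 0 else math.inf
    return {"Sn": sn, "Sp": sp, "LR+": lr_pos, "LR-": lr_neg, "J": sn + sp - 1}


# 22 positive samples with 7 missed; 25 negatives; fp=0 is an assumption.
m = diagnostic_metrics(tp=15, fn=7, tn=25, fp=0)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

Under these counts the negative likelihood ratio evaluates to about 0.32, and a perfect assay (no false negatives or false positives) gives Sn = Sp = J = 1 and LR− = 0.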

Full-length sequences of 12 CTLV isolates via NGS and RACEs
Full-length viral genome sequences of 12 CTLV isolates were obtained by RNA-Seq and the average total reads generated was 27,158,037, which covered 74% to 100% of the viral genome. The full-length cDNA sequences were deposited in GenBank with accession numbers MH108975-MH108986 (Table 1). Excluding the poly (A) tail, the 12 CTLV complete sequences ranged from 6,494 to 6,497 nucleotides (nt) long. Sequence analysis showed the CTLV genome was similar to other capilloviruses, including ASGV and PBNLSV, with two overlapping open reading frames (ORFs) (Fig 1). ORF1 (37-6,354 nt) encoded a 2,105 amino acid (aa) polypeptide, a putative polyprotein of around 242 kDa containing methyltransferase-like, papain-like protease, helicase-like and RdRp-like domains, and a coat protein (CP) region (Fig 1). The CP region encoded a 27-kDa protein which was located at the carboxyl-terminal end of the ORF1 polyprotein (5,641-6,354 nt) and was identified based on sequence identity with the ASGV CP deposited in GenBank (NC001749) [51]. Two variable regions previously described in ORF1 were also identified (Fig 1) [1,2]. ORF2 (4,788-5,750 nt) was nested in ORF1 and encoded a 36-kDa protein which belongs to the 30-kDa cell-to-cell movement protein (MP) superfamily (Fig 1).
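The reported genome coordinates can be cross-checked with simple arithmetic. The sketch below assumes that each coordinate range is inclusive and ends with a stop codon, which is an inference from the stated 2,105 aa length of the ORF1 product rather than something stated explicitly in the text.

```python
# Inclusive nucleotide coordinates as given in the text.
ORFS = {
    "ORF1": (37, 6354),
    "CP": (5641, 6354),
    "ORF2": (4788, 5750),
}


def orf_summary(start: int, end: int) -> dict:
    """Length in nt and in aa, assuming the last codon is a stop codon."""
    length_nt = end - start + 1
    codons, remainder = divmod(length_nt, 3)
    return {"nt": length_nt, "whole_codons": remainder == 0, "aa": codons - 1}


def same_frame(start_a: int, start_b: int) -> bool:
    """True when two start positions lie in the same reading frame."""
    return (start_a - start_b) % 3 == 0


for name, (start, end) in ORFS.items():
    print(name, orf_summary(start, end))
print("CP in frame with ORF1:", same_frame(ORFS["CP"][0], ORFS["ORF1"][0]))
print("ORF2 in frame with ORF1:", same_frame(ORFS["ORF2"][0], ORFS["ORF1"][0]))
```

The check reproduces the 2,105 aa length of the ORF1 polyprotein, places the CP region in the ORF1 reading frame, and places ORF2 in a different frame, as expected for an ORF nested within ORF1.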

Phylogenetic and genomic identity analysis of CTLV full-length sequences
Using three different methods, phylogenetic trees were generated with the available full-length nucleotide sequences of capilloviruses. All three methods generated similar topologies. The neighbor-joining unrooted tree identified four distinct clusters (I-IV) within two well supported clades (A & B) (bootstrap 99%) (Fig 2). Clusters I and II (bootstrap 100%), in clade A, contained CTLV isolates originating from Japan and China along with ASGV isolates from citrus and non-citrus hosts originated from the same geographic locations (Fig 2 and Table 2). Only one of the 12 CTLV isolates from this study (CTLV-IPPN122) was present in clade A (cluster I). This isolate was intercepted by the CCPP in a satsuma citrus introduction from China (Fig 2 and Table 2). The nucleotide sequence identities among the isolates of cluster I ranged within 83.23-93.02% including a 100% identity between ASGV-241KP and ASGV-P-209, both isolated from apple in Japan (Fig 2, Table 2 and Table 4). Sequence identities in cluster II ranged within 94.04-98.47%. Notably, in clade A (clusters I and II), some virus isolates derived from apple (I: ASGV-241KP, and -P-209 and II: ASGV-Li-23), had the highest sequence identities with isolates from lily (II: CTLV Table 2 and Table 4). In addition, in cluster I, the isolates ASGV-Nagami from Japan in kumquat (citrus relative, Fortunella margarita (Lour.) Swing.) and CTLV-ASGV-2-HJY from China in pummelo (C. maxima (Burm.) Merrill) had the highest sequence identity (93.02%) (Fig 2, Table 2 and Table 4).
Clusters III and IV (bootstrap 34%), in clade B, contained 11 of the 12 isolates from this study (Fig 2). In cluster III, three isolates intercepted by the CCPP in citrus introductions from China (i.e. CTLV-TL112, -TL113 and -TL114) grouped with seven CTLV isolates from China and Taiwan, one ASGV citrus isolate from Japan and three ASGV isolates from non-citrus hosts (i.e. apple and actinidia) from China, India and Germany (Fig 2). The nucleotide sequence identities among the isolates of cluster III ranged within 81.49-99.43% including a 100% identity between CTLV-Ponkan8 and CTLV-Pk both isolated from Ponkan mandarin (C. reticulata Blanco) in Taiwan (Fig 2, Table 2 and Table 4).
The apple virus isolates in clade B (cluster III) (III: ASGV-AC and ASGVp12) had sequence identities with a virus isolate from actinidia (III: ASGV-Ac) and 22 isolates from citrus and citrus relatives (cluster III and IV) with range of 81.42-82.68% (Fig 2, Table 2 and Table 4). This was in contrast to the high levels of sequence identity observed between apple isolates and lily, citrus and citrus relatives in clade A (91.07-98.47%).
Sequence identity analysis of the 28 available full genome sequences of the CTLV and ASGV citrus isolates (developed in this study and GenBank) showed that VRI was the most diverse region of the virus genome with 111 variable nucleotide sites among the 117 of the region. In addition, the nucleotide diversity of the VRII was equivalent to that of MP (variable sites 35.08% and 32.81%, respectively) since VRII and MP are essentially covering overlapping areas of the virus genome (Fig 1 and Table 5).
The CP and 3'-UTR region (5,641-6,495 nt) was identified as the most conserved: its percentage of variable nucleotide sites was the lowest (23.63%) and its minimum nucleotide sequence identity was the highest (89.60%) in the virus genome (Table 5). Further analysis revealed that nucleotide sites 6,241-6,440 were the most conserved within the CP and 3'-UTR (Table 6). Therefore, the newly developed RT-qPCR assay was designed to target this 200 nt region (Fig 1, Table 3, and S1 Fig).
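The idea of locating the most conserved stretch of an alignment can be sketched with a sliding-window count of variable sites. This is an illustration of the general technique, not the procedure used in this study, which compared predefined genome regions and segments; the toy alignment is hypothetical, and alignment gaps are treated here as an additional character state, which is an assumption.

```python
def variable_columns(alignment):
    """List of booleans, True where an alignment column has >1 character."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    return [len({seq[i].upper() for seq in alignment}) > 1 for i in range(length)]


def percent_variable(alignment):
    """Percentage of alignment columns that are variable."""
    flags = variable_columns(alignment)
    return 100.0 * sum(flags) / len(flags)


def most_conserved_window(alignment, window):
    """First window with the fewest variable sites.

    Returns (start, end, variable_site_count) with 1-based inclusive positions.
    """
    flags = variable_columns(alignment)
    if window > len(flags):
        raise ValueError("window is longer than the alignment")
    count = sum(flags[:window])
    best_start, best_count = 0, count
    for start in range(1, len(flags) - window + 1):
        # Slide by one column: add the entering site, drop the leaving site.
        count += flags[start + window - 1] - flags[start - 1]
        if count < best_count:
            best_start, best_count = start, count
    return best_start + 1, best_start + window, best_count


# Hypothetical toy alignment, not sequences from this study.
toy = ["ATGCATGCAAGG", "ATGCATGCTAGC", "TTGCATGCAAGG"]
print(f"variable sites: {percent_variable(toy):.1f}%")
print("most conserved 5-nt window:", most_conserved_window(toy, 5))
```

With a full-genome alignment and a window of 200 nt, the same scan would rank candidate target regions by their number of variable sites.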

CTLV RT-qPCR assay validation
The applicability, practicability and transferability of this assay were validated by two independent laboratories with consistent, reproducible results (Table 7). The assay was also proven to be robust since different annealing temperatures, reaction volumes, qPCR instruments, and master mixes had a minor effect on the Cq values and did not affect the classification of samples as positive or negative (Table 8). The specificity of the assay was determined in silico by analyzing the sequence of amplicons from different samples followed by a BLAST search that recognized the amplicon sequences associated only with CTLV. Additionally, the specificity of the assay was evaluated qualitatively with the correct classification (false negative and positive rate 0%) of 112 known CTLV positive and negative samples (Tables 7, 9, 10 and 11). More specifically, the assay detected the virus in 39 known CTLV positive samples from various geographic locations (Tables 7 and 9) and did not cross-react with 43 known CTLV negative samples of non-inoculated citrus varieties (Table 10) and a series of 30 non-targeted graft-transmissible citrus pathogens (Table 11). When samples were tested with 10-fold serial dilutions (run in triplicate), the sensitivity of the CTLV RT-qPCR showed a linear dynamic range from 10⁵ copies to <10 copies per μL, which indicates that the detection assay reached the level of LOD6, with R² equal to 0.9999 and an efficiency of 100.4% (Fig 3). The mean viral load was 6.37 × 10⁴ copies of CTLV per μL of infected sample extraction, as measured by the newly designed CTLV RT-qPCR assay. The CV for CTLV in the RT-qPCR was in the range of 0.23-0.61% (intra-assay variation) and 0.65-1.40% (inter-assay variation), which indicates low variation between different repetitions and different runs.
Table 5. Variable sites (%) and nucleotide sequence identities (%) of citrus tatter leaf virus and apple stem grooving virus isolated from citrus and citrus relatives (n = 28).

Comparison with published CTLV detection assays
The SYBR® Green-based RT-qPCR assay developed by Liu et al. [35] was able to detect CTLV in all 22 samples with the expected melting temperature for the amplicon (81.5-82.0˚C) and its performance measurements (Sn, Sp, LR+, LR− and J) were optimum and equal to those of the CTLV assay developed in this study (Table 7). The Cq values of the Liu assay were consistently higher than the ones produced by the assay developed in this study (Table 7). The TaqMan® probe-based RT-qPCR assay designed by Cowell et al. [25] detected CTLV in 15 samples, with eight samples having lower Cq values than with the assay developed in this study. However, the Cowell et al. assay was unable to detect CTLV in seven samples of three different isolates (LR− = 0.32) and its performance measurements Sn and J were not optimum (Table 7).

Discussion
This study presented a systematic approach that used current technologies to generate and analyze virus genomic information for the development and validation of a diagnostic assay for CTLV, a virus that threatens citrus production worldwide [2,20,21].
Table 6. Variable sites (%) and nucleotide sequence identities (%) of the segmented coat protein and 3'-untranslated region of citrus tatter leaf and apple stem grooving virus isolated from citrus and citrus relatives (n = 28).
The data obtained via NGS was de novo assembled onto 74% to 100% of the complete CTLV genome, which demonstrated the strength of this technology to characterize the virus genome sequence. With RACE sequence data from each isolate, the full-length sequences were assembled in a relatively short time compared to traditional sequencing methods. This allowed for a more comprehensive genome analysis of CTLV, not limited by the available sequences of a small number of virus isolates or parts of the virus genome [1,2]. The full genome sequence analysis of 28 CTLV and ASGV citrus and citrus relative isolates, developed in this study and available in GenBank, confirmed the previously reported size, structure and variable regions in the virus genome [1,2]. Data presented in this study also supported the current taxonomic classification of CTLV as a strain of ASGV in the Capillovirus genus of the Betaflexiviridae family, since the analysis of multiple full genome sequences of CTLV and ASGV did not meet the species demarcation criteria, which are less than 72% nucleotide identity or 80% amino acid identity between their CP or polymerase genes (S8 Table and S9 Table) [52]. The phylogenetic analysis of the 41 ASGV isolates revealed four interesting evolutionary and distribution patterns for the virus. First, Asia was highlighted as the point of origin of the virus since countries such as China, Taiwan and Japan were represented in multiple clusters of both phylogenetic clades. This finding also indicated that the origin and diversity of CTLV coincided with the origin of the citrus host. Second, the bottleneck event of the introduction of the virus in the USA from the single citrus variety Meyer lemon was reflected in cluster IV (first subgroup) in clade B and the high sequence identity (98.52-100%) among the isolates from Texas, Florida, and California. Third, high sequence identities among virus isolates from various citrus producing countries around the world demonstrated the impact of human activities on the distribution of the virus and the importance of clean stock programs such as the CCPP [53].
For example, the CTLV-TL115 isolate was intercepted in an illegal citrus introduction in California (second subgroup, cluster IV, clade B) [54,55] and it was different from the previously identified isolates of the virus in the state. In addition, the CTLV-IPPN122, -104, -112, -113, and -114 isolates were present in different variety introductions, separated in time (1987 and 2014) from the original Meyer lemon introduction in the 1900s, and even though they all originated in China, these isolates clustered in three different phylogenetic clusters (I, III, and IV), in agreement with the principle of high diversity in virus sequences at the point of origin [56][57][58]. Last but not least, two ASGV spillover events were captured in clade A, where ASGV isolates from apple had the highest sequence similarities (91.07-98.47%) with virus isolates from lily, citrus and citrus relatives [59][60][61][62][63]. No spillover event was captured in clade B since sequence identities of apple isolates with actinidia, citrus and citrus relatives were low (81.42-82.68%). Clade B most likely represented the establishment of ASGV in citrus and citrus relatives after its spillover from other species. The spillover events presented here provided some insight into the CTLV ancestry questions for citrus, kumquat, lily and apple presented by Hilf 2008 [32]. Since the genetic variation within the targeted virus population can lead to false negative RT-qPCR results, for the design of the CTLV detection assay we aimed to locate the most conserved region of the virus genome, beyond the traditional approaches that focus on individual genes presumed conserved due to their function [64]. The newly developed detection assay was further validated according to the guidelines for validation of qualitative real-time PCR methods and its performance was assessed with statistical measurements [50,65].
We showed that the most conserved CTLV genome region was not confined to a single gene but spanned the region between the CP gene and the 3'-UTR; this region was therefore targeted for the RT-qPCR assay design. The conserved nature of the CTLV CP could be a result of its function in virion assembly [64]. For the 3'-UTR of CTLV, the high identity among isolates suggests that it has an important role in CTLV replication and/or translation [66].
Compared to published CTLV qPCR assays that were designed on limited or single isolate sequences, the assay in this study performed better (e.g. Youden's index) and detected a diverse range of CTLV isolates from different geographic locations, citrus varieties, and isolation times, because it was designed using a high number of virus sequences [25,34,35]. These results agree with Roussel et al. [67], who reported that the RT-qPCR designed for prune dwarf virus (PDV) failed to detect many virus isolates because the assay was designed from very few published PDV sequences in GenBank. In addition, the sensitivity and specificity of this assay were improved by using MGB probes [68,69], designed from the multiple sequence alignment, that targeted the identified conserved genomic region between the CP gene and 3'-UTR. Furthermore, measuring the intra- and inter-assay variations confirmed the reproducibility and repeatability of the developed RT-qPCR assay. Finally, measuring viral loads and performing reactions under variable conditions showed that the newly developed RT-qPCR is robust and can detect minimal quantities of CTLV.
Next generation sequencing (NGS) technologies combined with bioinformatics analysis have proven to be powerful tools in identifying and characterizing novel sequences of pathogens and in studying disease occurrence, genome variability, and phylogeny [38][39][40]. An advantage of using NGS technologies within a well-defined qPCR design, development and validation protocol [41,42] is that qPCR assays can be regularly updated as more target pathogen genomes are sequenced, thereby increasing the value of the assay in preventing virus outbreaks and managing virus spread and induced disease. We propose that in the era of powerful affordable sequencing platforms the presented approach of full-genome sequence analysis of multiple virus isolates, and not only a small genome region of a small number of virus sequences, becomes a guideline for the design and comprehensive validation of qPCR-based virus detection assays, especially for use in high value germplasm programs [26,30,31]. We understand the academic urgency for scientific publications; however, specifically in the case of diagnostics that affect international trade, quarantines and regulatory decisions, which by extension affect the livelihoods of thousands of people, we urge the research community to dedicate the necessary resources and time to the appropriate design and validation of pathogen detection assays. We hope that this publication offers a valuable case study for such consideration.

S1 Fig. Citrus tatter leaf virus detection assay targeting region.
Multiple nucleotide sequence alignment of citrus tatter leaf virus and apple stem grooving virus isolated from citrus and citrus relative hosts. Citrus tatter leaf virus detection assay targeting region (highlighted in dark grey) and primers-probe set are also shown. Apple stem grooving virus isolate P-209 is used here to represent the species. (PDF)
S1 Table. Oligonucleotide primers used in this study. (PDF)
S2 Table. Full-length nucleotide sequence identities (%) of citrus tatter leaf virus isolates in this study and capilloviruses from NCBI GenBank database. (PDF)
S3 Table. Nucleotide sequence identities (%) of 5'-untranslated region (5'-UTR) and polyprotein (not including coat protein region). (PDF)