Identification and Genome Characterization of the First Sicinivirus Isolate from Chickens in Mainland China by Using Viral Metagenomics

Unlike traditional virus isolation and sequencing approaches, viral metagenomics based on sequence-independent amplification allows unexpected or novel viruses to be discovered efficiently while bypassing the culturing step. Here we report the discovery of the first Sicinivirus isolate (designated strain JSY), a picornavirus, from commercial layer chickens in mainland China by using a viral metagenomics technique. This Sicinivirus isolate, whose whole genome comprises 9,797 nucleotides (nt) excluding the poly(A) tail, possesses one of the largest picornavirus genomes reported so far, but its polyprotein shares only 88.83% and 82.78% amino acid sequence identity with those of ChPV1 100C (KF979332) and Sicinivirus 1 strain UCC001 (NC_023861), respectively. The complete 939 nt 5′UTR of the isolate contains at least twelve stem-loop domains (A–L), the largest number of stem-loops reported within the genus Sicinivirus. The conserved 'barbell-like' structure was also present in the 272 nt 3′UTR of the isolate, as in the 3′UTR of Sicinivirus 1 strain UCC001. The 8,586 nt large open reading frame encodes a polyprotein precursor of 2,862 amino acids. Moreover, Sicinivirus infection might be widespread in commercial chicken farms in the Yancheng region of Jiangsu Province, as all tested stool samples from three different farms were positive (17/17) for Sicinivirus. This is the first report of the identification of Sicinivirus in commercial layer chickens with a severe clinical disease in mainland China; however, further studies are needed to evaluate the pathogenic potential of this picornavirus in chickens.


Introduction
Birds are well known as an important reservoir for emerging pathogens [1]; however, because of the limitations of traditional virus isolation and sequencing approaches, it is difficult to find unexpected or unknown viruses. Whole-metagenome sequencing of environmental viral communities [2], termed "viral metagenomics", has dramatically accelerated the viral discovery process. Thus, many new avian picornavirus taxa, such as Melegrivirus A [3], Gallivirus A [4], Avisivirus A [5], and the recently accepted genus Sicinivirus [6], have been identified by metagenomic techniques.
Here we report the discovery of the first Sicinivirus isolate (designated strain JSY), a picornavirus, in commercial layer chickens with a severe clinical disease in mainland China by using a viral metagenomics technique. The Sicinivirus strain JSY was completely sequenced, and the complete 5′ and 3′ untranslated regions (UTRs) were further characterized. At least twelve stem-loop domains (A-L) were observed in the 5′ UTR of Sicinivirus for the first time. In addition, comparative genome and phylogenetic analyses showed that the Sicinivirus strain JSY differs in sequence from the previously reported Sicinivirus strains. The discovery of this novel strain facilitates further understanding of the molecular characteristics of the recently recognized genus Sicinivirus.

Ethics Statement
The present study was approved in accordance with the animal welfare guidelines (IACUC-2010) of the Animal Care and Use Committee of the Institute of Animal Husbandry and Veterinary Medicine, Beijing Academy of Agriculture and Forestry Sciences. Clinical samples were collected according to the Regulations for the Administration of Affairs Concerning Experimental Animals of the State Council of the People's Republic of China. As our research is supported by the China Agriculture Research System, we are responsible for the diagnosis and control of poultry diseases. We declare that we had permission from the farm owners to collect the samples and conduct this study, and we confirm that the study did not involve endangered or protected species.

Clinical samples and screening for viral pathogens
In May 2014, high mortality occurred in some commercial chicken farms in the Yancheng region of Jiangsu Province (China). The affected chickens ranged in age from 30 to >65 days. Clinical signs were lethargy, a tendency to huddle, decreased feeding and drinking, and diarrhea with white-green faeces. Morbidity reached 50-80% in the tested chicken flocks, and mortality rates were 30-50%. Gross lesions were characterized by severely swollen livers with distinct petechial and ecchymotic haemorrhagic spots, translucent pericardial effusion, severe splenomegaly, kidney swelling, thymus haemorrhages and atrophy of the bursa of Fabricius. Nine faecal samples from one of the above farms (designated farm A) in the Yancheng region were collected for detecting potential viral pathogens. These faecal samples were re-suspended 1:10 (w/v) in phosphate-buffered saline (PBS) and then centrifuged at 4,200 × g for 5 min. The supernatants were filtered through 0.22 μm filters (Millipore, USA), aliquoted, and stored at -80°C until use. Viral DNA and RNA were extracted from the resultant supernatants using an AllPrep DNA/RNA Mini Kit or a QIAamp MinElute Virus Spin Kit (Qiagen, Germany). cDNAs were prepared using a SuperScript First-Strand Synthesis System for reverse transcriptase (RT)-PCR (Invitrogen, USA) following the manufacturer's instructions, followed by conventional PCR/RT-PCR detection of chicken pathogens including infectious bursal disease virus (IBDV), infectious bronchitis virus (IBV), avian encephalomyelitis virus (AEV) and chicken anaemia virus (CAV). The primers are listed in Table 1.

Virus nucleic acid isolation, metagenomic library construction, sequencing and genome walking
Viral nucleic acid libraries were then constructed by sequence-independent RT-PCR amplification as previously described [10,11]. Briefly, 11 μl of extracted RNA was mixed with 1 μl of 10 mM dNTP Mixture and 1 μl of 100 μM primer KN or RA01N (Table 1), incubated at 65°C for 5 min, and chilled on ice. A reaction mix consisting of 4 μl of 5 × First-Strand buffer, 1 μl of 100 mM DTT, 1 μl of RNase inhibitor, and 200 units of SuperScript III reverse transcriptase (Invitrogen) was then added. The reaction was incubated at 25°C for 5 min and 50°C for 60 min. Second-strand cDNA was synthesized by incubating with 5 units of Klenow fragment polymerase (New England BioLabs) at 37°C for 60 min, followed by inactivation at 75°C for 10 min. The PCR reaction mixture consisted of 10 μl of double-stranded cDNA, 0.25 μl of Ex Taq (5 units/μl, Takara), 5 μl of 10 × Ex Taq Buffer, 4 μl of 2.5 mM dNTP Mixture, and 1 μl of 100 μM primer K or RA01 (Table 1). The PCR programme comprised 35 cycles of 98°C for 10 s, 56°C for 30 s, and 72°C for 2 min, followed by a final extension at 72°C for 10 min. PCR products ranging from 200 bp to 2,000 bp were separated on and purified from a 1% agarose gel. The purified fragments were cloned into the pEASY-T5 Zero vector (TransGen Biotech, China), and six hundred single colonies were randomly selected and sequenced using a commercial vector-specific M13 forward primer. A total of 511 reads were assembled using the SeqMan program, which is part of the Lasergene sequence analysis software package (DNASTAR Inc., USA). Single contigs were compared to GenBank using BLASTx. Subsequently, specific PCR primers were designed based on the obtained sequences to walk the entire genome of the virus of interest.
Terminal sequences were obtained using a kit for rapid amplification of cDNA ends (RACE) (Clontech, Japan); both the previously reported RACE primers and the designed inner primers are listed in Table 1. For each fragment, at least three clones (up to eight if conflicts occurred) were sequenced to determine the consensus sequence of any given region.
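As a worked check of the PCR mixture described above, the following minimal sketch computes the final concentration of each component with the dilution relation C_final = C_stock × V_added / V_total. The 50 μl total volume is an assumption, inferred from the use of 5 μl of 10 × buffer; it is not stated in the text, and the balance is assumed to be water.

```python
# Final concentrations in the library-amplification PCR mix.
# ASSUMPTION: 50 µl total reaction volume (inferred from 5 µl of 10x buffer,
# not stated in the text); the remaining volume is assumed to be water.
TOTAL_UL = 50.0

# Volumes (µl) as listed in the methods.
components_ul = {
    "double-stranded cDNA": 10.0,
    "Ex Taq (5 U/µl)": 0.25,
    "10x Ex Taq Buffer": 5.0,
    "2.5 mM dNTP Mixture": 4.0,
    "100 µM primer K or RA01": 1.0,
}


def final_conc(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilution relation: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


water_ul = TOTAL_UL - sum(components_ul.values())
dntp_mM = final_conc(2.5, components_ul["2.5 mM dNTP Mixture"])
primer_uM = final_conc(100.0, components_ul["100 µM primer K or RA01"])
buffer_x = final_conc(10.0, components_ul["10x Ex Taq Buffer"])
taq_units = 5.0 * components_ul["Ex Taq (5 U/µl)"]

print(f"water to add: {water_ul:.2f} µl")
print(f"dNTP: {dntp_mM:.2f} mM, primer: {primer_uM:.1f} µM")
print(f"buffer: {buffer_x:.1f}x, Ex Taq: {taq_units:.2f} U per reaction")
```

Under the assumed 50 μl volume, the buffer comes out at 1×, which is what makes that volume a plausible reading of the recipe.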

Phylogenetic analyses
Representative members of the 29 officially recognized genera of the family Picornaviridae and three other Sicinivirus sequences were downloaded from NCBI; the GenBank accession numbers of these sequences are listed in Table 2. Multiple sequence alignment of the Sicinivirus strain JSY and the 32 downloaded sequences was performed online using Clustal Omega [16] (http://www.ebi.ac.uk/Tools/msa/clustalo/). The phylogenetic tree based on the multiple sequence alignment was constructed using the Molecular Evolutionary Genetics Analysis (MEGA) software version 6.0.6 [17], applying the maximum-likelihood method based on the JTT matrix-based model [18]. The robustness of the phylogenetic constructions was evaluated by bootstrapping with 1,000 replicates, and initial trees for the heuristic search were obtained automatically by applying the neighbour-joining and BioNJ algorithms.
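Pairwise identity values such as those reported in Table 2 are derived from aligned sequences. The sketch below shows one common way to compute percent identity from two rows of an alignment. It assumes identity is defined as matching residues over columns where neither sequence has a gap; the definition actually used in this study is not stated, and the two aligned fragments are hypothetical toy data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither row has a gap.

    ASSUMPTION: gap-containing columns are excluded from the denominator;
    other definitions (e.g. dividing by alignment length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments (toy data, not taken from the study).
toy_a = "GPPGCGKS-T"
toy_b = "GPPGAGKSLT"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.2f}% identity over gap-free columns")
```

The choice of denominator matters when sequences differ in length, as they do for the partially sequenced Sicinivirus genomes, so identities computed under different conventions are not directly comparable.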

Nucleotide sequence accession number
The full-length genomic nucleotide sequence of the Sicinivirus strain JSY was deposited in GenBank under accession number KP779642.

Detection of clinical samples
Four pairs of primers were used to perform conventional PCR/RT-PCR detection, and the results showed that all the samples were negative for IBDV [19], IBV [20], AEV [21] and CAV [22]. To find other potential viral pathogens in these samples, cDNA from one of the faecal samples was generated as described above to construct the metagenomic library for further sequencing.

Overview of sequence data
The metagenomic sequencing data returned 511 useful reads. These reads were classified based on the best BLASTx expectation (E) scores. Summaries of the taxonomic classifications are shown in Fig 1. Among

Genome organization and coding potential of Sicinivirus
Based on the sequence of specific PCR products and RACE fragments of both terminals, a Sicinivirus with a genome size of 9,797 nt was obtained. To our knowledge, the identified  (Fig 2A).
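The reported lengths for strain JSY (a 939 nt 5′UTR, an 8,586 nt open reading frame, a 272 nt 3′UTR and a 9,797 nt genome excluding the poly(A) tail) can be cross-checked with simple arithmetic. The short sketch below does this; the variable names are illustrative only.

```python
# Cross-check of the genome layout reported for Sicinivirus strain JSY.
UTR5_NT = 939          # 5' UTR length
ORF_NT = 8586          # large open reading frame
UTR3_NT = 272          # 3' UTR length, excluding the poly(A) tail
GENOME_NT = 9797       # reported genome size
POLYPROTEIN_AA = 2862  # reported polyprotein length

total_nt = UTR5_NT + ORF_NT + UTR3_NT
codons, remainder = divmod(ORF_NT, 3)

print(f"5'UTR + ORF + 3'UTR = {total_nt} nt (reported: {GENOME_NT} nt)")
print(f"ORF / 3 = {codons} codons, remainder {remainder} "
      f"(reported polyprotein: {POLYPROTEIN_AA} aa)")
```

The three regions sum to the reported genome size, and the ORF length is an exact multiple of three that corresponds to the reported polyprotein length.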

Analysis of coding regions
The myristylation site GSISST was recognized in the VP0 protein of the Sicinivirus strain JSY, as reported by Bullman et al. (2014) [6]. The 2C protein of the Sicinivirus strain contains the conserved NTP-binding site motif GxxGXGKS (X, uncharged; x, variable) as GPPGCGKS [5,26] and the DDLxQ motif as DDVGQ, which is associated with the putative helicase activity [5,27]. As in the published Sicinivirus 1 strain UCC001 [6], the active-site cysteine in motif GxCG (x, variable) as GLCG [24] and the RNA binding domain (KFRDI) as QFKDL were present in For viral capsid protein VP1, the identified Sicinivirus strain JSY shows only 66% and 60.5% amino acid sequence identity to those of ChPV1 100C and Sicinivirus 1 strain UCC001, respectively. Furthermore, residues I118 and L120 of the seven drug-binding pocket sites [28,29] in the rhv_like capsid domain (cd00205) [9] differed among the five Sicinivirus sequences studied (Fig 5).
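The motif notation used above maps naturally onto regular expressions. The sketch below checks the reported JSY sequences against the consensus motifs and lists where a variant departs from its consensus. It assumes "uncharged" (X) means any residue other than D, E, K, R or H, which is one reading of the notation rather than a definition given in the text; the helper name is hypothetical.

```python
import re

# 'x' = any residue; 'X' = uncharged residue.
# ASSUMPTION: "uncharged" is taken as anything except D, E, K, R, H.
NTP_BINDING = re.compile(r"G..G[^DEKRH]GKS")  # GxxGXGKS
PROTEASE_SITE = re.compile(r"G.CG")           # GxCG
HELICASE_CONSENSUS = re.compile(r"DDL.Q")     # DDLxQ


def differing_positions(consensus, observed):
    """Return (1-based position, consensus residue, observed residue) for mismatches."""
    return [
        (i + 1, c, o)
        for i, (c, o) in enumerate(zip(consensus, observed))
        if c != o
    ]


ntp_ok = NTP_BINDING.fullmatch("GPPGCGKS") is not None
protease_ok = PROTEASE_SITE.fullmatch("GLCG") is not None
# DDVGQ carries V where the strict consensus has L, so it is a variant form.
helicase_strict = HELICASE_CONSENSUS.fullmatch("DDVGQ") is not None
rna_binding_diffs = differing_positions("KFRDI", "QFKDL")

print("GPPGCGKS fits GxxGXGKS:", ntp_ok)
print("GLCG fits GxCG:", protease_ok)
print("DDVGQ fits strict DDLxQ:", helicase_strict)
print("KFRDI vs QFKDL differences:", rna_binding_diffs)
```

A strict pattern makes the difference between an exact instance of a motif and a variant form explicit, which is useful when comparing motifs across strains.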

Confirmation of Sicinivirus in clinical faecal samples
The presence of Sicinivirus in mainland China is not restricted to the single sample analysed by metagenomics. An RT-PCR survey targeting a 652 bp fragment of the 3Dpol gene of Sicinivirus demonstrated the presence of this virus in stool samples from eight other animals on the same farm. Furthermore, Sicinivirus was also detected in five and three animals from two other commercial chicken farms in the same region (Fig 7). All 17 PCR products were further confirmed by sequencing, and they shared almost 100% nucleotide sequence identity with the Sicinivirus strain JSY.

Discussion
A range of random amplification methods coupled with next-generation sequencing platforms, such as Roche-454 [32], Illumina HiSeq [33], and Ion Torrent PGM [34], have recently been used to discover unexpected and unknown viruses. However, plasmid cloning and Sanger sequencing remain a practical alternative for the rapid identification of viruses in clinical samples. In the present study, we chose to amplify the constructed library and clone the purified fragments into a traditional T-vector, as previously described by Victoria et al. (2008) [35]. BLAST analysis showed that 34 of the 511 viral metagenomic sequences obtained (6.65%) shared sequence similarity with Sicinivirus. These results suggest that the described approach may be a relatively convenient and cost-effective method for quickly screening clinical samples for unexpected or unknown viruses.
Compared with other Sicinivirus strains, the Sicinivirus strain JSY identified in the present study shares 73.99%, 78.81%, 79.61%, and 73% nucleotide sequence identity with Sicinivirus 1 strain UCC001 [6], chicken picornavirus 1 100C, chicken picornavirus 1 55C, and Sicinivirus UCC1 [1], respectively, at the genome level. Bullman et al. (2014) reported that the A-I domains of the type II IRES were missing from Sicinivirus 1 strain UCC001, indicating that the 373 nt 5′ UTR sequence of that strain was incomplete. Lau et al. (2014) reported the sequences of chicken picornavirus 1 100C and chicken picornavirus 1 55C; these two strains share a similar genome structure, including a short Leader (L) protein (21 aa) but not the predicted type II-like IRES [1] (Fig 2A). In addition, the polyprotein of the identified Sicinivirus strain JSY shows high amino acid identities (88.13% and 88.13%) to ChPV1_55C and ChPV1_100C, respectively (Table 2), which indicates a close genetic relationship between Sicinivirus strain JSY and the two Siciniviruses found in Hong Kong [1]. However, when the upstream sequences (about 880 bp upstream of the VP0 gene) of ChPV1_55C and ChPV1_100C were compared with the corresponding region of the Sicinivirus strain JSY reported in this study, the nucleotide sequence identities were 74.9% and 66.6%, respectively, suggesting that the 5′ region of these two strains identified in Hong Kong has probably not been sequenced completely. The other strain, Sicinivirus UCC1, was identified in Ireland by Bullman et al. (unpublished); it contains only a partial L sequence (Fig 2A), and its upstream sequence has not been fully sequenced either.
Considerable variation was observed in the P1 region: the identified Sicinivirus strain JSY shows 75.27% to 78.66% amino acid sequence identity to the four other Sicinivirus sequences (Table 2). The P1 region comprises the virus capsid proteins VP0, VP3, and VP1. The viral polypeptide VP1 is the most surface-exposed capsid protein [36], harbors important immunogenic sites [37,38], and contributes to virus attachment and entry [39,40]. Therefore, the VP1 region of picornaviruses, such as foot-and-mouth disease virus (FMDV), has conventionally been used to investigate the genetic relatedness of different isolates [41] and to infer evolutionary dynamics, including tracing the origin and movement of outbreak strains [42]. Among these Sicinivirus strains, two pocket sites in the VP1 polypeptide, I118 and L120 (Fig 5), differ from those of the rhv_like capsid domain (cd00205). However, the effect of these differences on structure, pathogenicity and evolution still needs further evaluation.
For picornaviruses, five types of IRES have been reported, classified as type I [43], type II [44], type III [45], type IV [46] and type V [47]. Each type of IRES has a different characteristic structure, and initiation occurs via a distinct mechanism [47]. Type II IRES requires eIF4G/eIF4A to form a 48S complex and can function without eIF4E and factors associated with ribosomal scanning. Unlike type I IRES, which requires the binding of various IRES trans-acting factors (ITAFs), type II IRES requires fewer ITAFs [44,47]. Further analysis of the 5′ UTR showed that the Sicinivirus strain JSY contains a type II IRES. In the present study, we determined for the first time the full-length sequence of the 5′ UTR of Sicinivirus and found that it contains at least twelve stem-loop domains (A-L).
Besides the predominant Sicinivirus, we also obtained other virus-related reads in the present study, including reads related to ALV (3 reads), ANV (2 reads), and two other genera (also of the family Picornaviridae). The three ALV-related reads shared 99.5%, 100% and 94.1% nucleotide sequence homology with endogenous ALV [48], the natural recombinant ALV-E/A virus PDRC-1039 [49] and ALV strain BR170E, respectively. The two ANV reads shared 76.1% and 89.4% nucleotide sequence homology with ANV1 and ANV3, respectively [50,51]. However, whole-genome sequencing of these viruses was not successful, mainly because of the low homology of the obtained reads and the low abundance of the associated viruses in the clinical samples.

Conclusions
In the present study, a viral metagenomics technique was used to test a clinical sample. Unexpectedly, among 511 reads, 34 reads (82.9% of total virus reads) showed sequence similarity to viruses from the recently discovered genus Sicinivirus, suggesting that this virus was the predominant type in the tested sample. The viral polyprotein of the Sicinivirus isolate had only 88.83% and 82.78% amino acid sequence identity to those of ChPV1 100C and Sicinivirus 1 strain UCC001, respectively. Moreover, we determined for the first time the full-length sequence of the 5′ UTR of Sicinivirus and found that it contains at least twelve stem-loop domains (A-L). Sicinivirus infection might be widespread in commercial chicken farms in the Yancheng region of Jiangsu Province, as all the tested stool samples from three different chicken farms were positive (17/17) for Sicinivirus by RT-PCR detection. This is the first report of the identification and genome characterization of Sicinivirus from chickens in mainland China; however, further studies are needed to evaluate the pathogenic potential of this picornavirus in chickens.
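The two percentages quoted for the Sicinivirus reads use different denominators: all 511 reads (6.65%) and virus-related reads only (82.9%). The sketch below reproduces both figures. The total number of virus-related reads is not stated explicitly in the text, so the value derived here is an inference from the reported 82.9%, as is the count left over for the two other picornavirus genera.

```python
# Read counts reported in the text.
TOTAL_READS = 511
SICINIVIRUS_READS = 34
ALV_READS = 3
ANV_READS = 2
REPORTED_VIRAL_SHARE = 0.829  # Sicinivirus share of virus-related reads

# Share of all reads: reproduces the reported 6.65%.
share_of_all = 100.0 * SICINIVIRUS_READS / TOTAL_READS

# INFERENCE: the viral-read total is back-calculated from the reported
# percentage; it is not given directly in the text.
implied_viral_total = round(SICINIVIRUS_READS / REPORTED_VIRAL_SHARE)
share_of_viral = 100.0 * SICINIVIRUS_READS / implied_viral_total

# INFERENCE: reads remaining for the two other picornavirus genera.
other_reads = implied_viral_total - SICINIVIRUS_READS - ALV_READS - ANV_READS

print(f"Sicinivirus share of all reads: {share_of_all:.2f}%")
print(f"Implied virus-related reads: {implied_viral_total}")
print(f"Sicinivirus share of viral reads: {share_of_viral:.1f}%")
print(f"Reads left for other genera: {other_reads}")
```

Stating the denominator alongside each percentage avoids confusing the low overall abundance of viral reads with the dominance of Sicinivirus among them.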