Characterization of the Role of Hexamer AGUAAA and Poly(A) Tail in Coronavirus Polyadenylation

Similar to eukaryotic mRNA, the positive-strand coronavirus genome of ~30 kilobases is 5’-capped and 3’-polyadenylated. It has been demonstrated that the length of the coronaviral poly(A) tail is not static but regulated during infection; however, little is known regarding the factors involved in coronaviral polyadenylation and its regulation. Here, we show that during infection, the level of coronavirus poly(A) tail lengthening depends on the initial length upon infection and that the minimum length to initiate lengthening may lie between 5 and 9 nucleotides. By mutagenesis analysis, it was found that (i) the hexamer AGUAAA and poly(A) tail are two important elements responsible for synthesis of the coronavirus poly(A) tail and may function in concert to accomplish polyadenylation and (ii) the function of the hexamer AGUAAA in coronaviral polyadenylation is position dependent. Based on these findings, we propose a process for how the coronaviral poly(A) tail is synthesized and undergoes variation. Our results provide the first genetic evidence to gain insight into coronaviral polyadenylation.


Introduction
Posttranscriptional modifications occurring in the nucleus of eukaryotic cells include cleavage of the 3' end of nascent mRNAs and the addition of a poly(A) tail [1][2][3][4][5]. The polyadenylation process involves two discrete phases [6]. In the first phase, synthesis of a short poly(A) tail of nearly 10 nucleotides (nts) depends on interaction between polyadenylation-related proteins and the polyadenylation signal (PAS) hexamer AAUAAA or its variant (AGUAAA, AUUAAA or UAUAAA) located 10-30 nts upstream of the poly(A) cleavage site [1,7-13]. The rapid addition of a poly(A) tail of nearly 200 nts that occurs in the second phase requires the nearly 10 adenosine residues synthesized in the first phase. The synthesized poly(A) tail is important for the nuclear export of mature mRNAs and has been demonstrated to be involved in the control of mRNA stability and translation efficiency [14][15][16][17].
As opposed to mRNAs used only for translation, polyadenylation of viral RNA in RNA viruses may be involved in both translation and replication [16,18]. RNA viruses have developed several mechanisms for synthesizing a poly(A) tail based on genetic features. It has been demonstrated that influenza virus utilizes a short stretch of U residues, instead of the hexamer AAUAAA, located at the 5' terminus of the negative-strand genomic RNA as a signal for poly(A) synthesis by the viral RNA polymerase with a stuttering mechanism during positive-strand synthesis [19][20][21]. A similar mechanism is also used by paramyxoviruses to generate a poly(A) tail during transcription [22]. On the other hand, poliovirus uses a homopolymeric stretch on the negative strand as the template for the addition of the poly(A) tail during positive-strand synthesis [23]. Moreover, the cis-acting element required for replication may also be used for polyadenylation in RNA viruses. For example, the hexamer AAUAAA in bamboo mosaic virus and a domain immediately upstream of the poly(A) tail in coxsackievirus B3 have been shown to function as cis-acting elements involved in both negative-strand RNA synthesis and polyadenylation [24,25].
Bovine coronavirus (BCoV), a betacoronavirus in the subfamily Coronavirinae, family Coronaviridae, order Nidovirales, is a 5'-capped and 3'-polyadenylated positive-strand RNA virus. Although the mechanism for coronaviral polyadenylation remains unknown, a stuttering mechanism based on the short poly(U) stretch found in the negative-strand genome has been postulated [26]. Moreover, a regulated poly(A) tail length during the coronavirus life cycle has been suggested, whereby the viral poly(A) tail length is increased in the early stage of infection but gradually decreases after the peak tail length (~65 nts) in the later stage of infection in both cell culture and animals [16,27]. Such regulated poly(A) tail length may function in translation regulation, as it has been experimentally demonstrated that a longer coronavirus poly(A) tail is associated with better translation efficiency [16]. However, the mechanism by which the coronaviral poly(A) tail is regulated remains unclear.
In the current study, we determined that both the poly(A) tail and hexamer AGUAAA are important elements responsible for the polyadenylation of coronavirus. The efficiency of poly(A) tail elongation during infection depends on the initial poly(A) tail length at the time of infection. Based on these findings, we propose a process for how the coronaviral poly(A) tail is synthesized and undergoes variation. The results presented here provide the first genetic evidence that will help in elucidating coronaviral polyadenylation.

In vitro transcription and transfection
To synthesize transcripts in vitro, all DNA constructs were linearized with MluI, except W-0A, W-25U, W-25C, W-25G, R-25U, R-25C and R-25G, which were linearized with BsmBI to accurately synthesize transcripts with no poly(A) tail or with only a poly(U), poly(C) or poly(G) tail. The linearized DNA was transcribed in vitro with the mMessage mMachine T7 transcription kit (Ambion) according to the manufacturer's instructions and passed through a Biospin 6 column (Bio-Rad), followed by transfection [35]. For transfection, HRT-18 cells in 35-mm dishes at 80% confluency (~8 × 10^5 cells/dish) were infected with BCoV at a multiplicity of infection of 5 PFU per cell. After 2 hours of infection, 3 μg of transcript was transfected into mock-infected or BCoV-infected HRT-18 cells using Lipofectin (Invitrogen) [31,36].

Preparation of RNA from infected cells
To prepare RNA for the identification of DI RNA poly(A) tail length, RNA was extracted with TRIzol (Invitrogen) at the indicated times after transfection of DI RNA constructs into BCoV-infected HRT-18 cells; the virus within the transfected cells is referred to as virus passage 0 (VP0) (S1B Fig). Supernatants from BCoV-infected and DI RNA-transfected HRT-18 cells at 48 hours posttransfection (hpt) (VP0) were collected, and 500 μl was used to infect freshly confluent HRT-18 cells in a 35-mm dish (virus passage 1, VP1) (S1B Fig). RNA was extracted with TRIzol (Invitrogen) at the indicated time points.
To de-block the 5'-capped end of genomic RNA, 10 μg of extracted total cellular RNA in 25 μl of water was treated with 3 μl of 10X buffer and 10 U (in 1 μl) of tobacco acid pyrophosphatase (TAP) (Epicentre). Following decapping, RNA was phenol-chloroform-extracted, dissolved in 25 μl of water, heat-denatured at 95°C for 5 min and quick-cooled. Head-to-tail ligation was then performed by adding 3 μl of 10X ligase buffer and 2 U (in 2 μl) of T4 RNA ligase I (New England Biolabs); the mixture was incubated for 16 h at 16°C. The ligated RNA was phenol-chloroform-extracted and used for the RT reaction. SuperScript II reverse transcriptase (Invitrogen), which is able to transcribe poly(A) tails greater than 100 nts with fidelity [19,45], was used for the RT reaction with oligonucleotide BCV29-54(+), which binds to nts 29-54 of the leader sequence of the 5' UTR of the BCoV positive strand, as previously described. A 5-μl aliquot of the resulting cDNA was used in a 50-μl PCR with AccuPrime Taq DNA polymerase (Invitrogen) and oligonucleotides BCV29-54(+) and MHV3UTR3(-), the latter of which binds to nts 99-122 counted from the poly(U) tract on the MHV-A59 negative strand. The resulting PCR product was subjected to sequencing to determine the poly(A) tail length. At least three independent experiments were carried out to determine the poly(A) tail length of DI RNA mutants.
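The tail-length readout described above can be illustrated with a minimal sketch. It assumes that the sequenced RT-PCR product reads through the head-to-tail junction in the order 3' UTR terminus, poly(A) tail, 5' terminus of the leader, so that the tail length is the run of A residues immediately following the last nucleotides of the 3' UTR. The function name `poly_a_tail_length` and all sequences below are hypothetical placeholders; this is not the authors' analysis procedure, and the toy sequences are not BCoV or MHV sequences.

```python
def poly_a_tail_length(junction_read: str, utr_3prime_end: str) -> int:
    """Count the A residues that directly follow the 3' UTR terminus.

    junction_read  : sequence across the head-to-tail ligation junction
                     (assumed order: 3' UTR end, poly(A), 5' leader start).
    utr_3prime_end : last few nucleotides of the 3' UTR, used as an anchor.
    """
    # Work in the DNA alphabet so RNA and cDNA input are treated alike.
    read = junction_read.upper().replace("U", "T")
    anchor = utr_3prime_end.upper().replace("U", "T")

    start = read.find(anchor)
    if start == -1:
        raise ValueError("3' UTR anchor not found in read")

    # Walk from the end of the anchor for as long as the read shows A.
    pos = start + len(anchor)
    length = 0
    while pos < len(read) and read[pos] == "A":
        length += 1
        pos += 1
    return length


# Hypothetical example: made-up 3' UTR end, 13 A residues, made-up 5' end.
toy_read = "CCGTTAGGACTC" + "A" * 13 + "GATTGCGAGC"
print(poly_a_tail_length(toy_read, "GGACTC"))  # 13
```

Under these assumptions, a 5' terminus that itself begins with A would inflate the count by that many residues, and a mixed population at the 3' end would need to be resolved per read rather than from a consensus trace.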

Northern blot analysis
Detection of the reporter-containing DI RNA was performed essentially as described previously [35,36]. In brief, HRT-18 cells were seeded in 35-mm dishes at ~80% confluency (~8 × 10^5 cells/dish). For the RNA stability assay, 3 μg of DI RNA transcript was transfected into the HRT-18 cells. HRT-18 cells were incubated with RNase A (final concentration 0.5 mg/ml) for 15 min prior to extraction of cellular RNA. For the replication assay, after 2 h of infection with BCoV at a multiplicity of infection of 5 PFU per cell, 3 μg of DI RNA transcript was transfected into the BCoV-infected HRT-18 cells. The supernatant was harvested at 48 hpt (VP0), and 500 μl was used to infect freshly confluent HRT-18 cells in a 35-mm dish (VP1). Total cellular RNA was extracted with TRIzol (Invitrogen) at 48 hpi of VP1, and 10 μg was electrophoresed through a formaldehyde-agarose gel. The RNA was transferred from the gel to a Nytran membrane by vacuum blotting, and the blot was probed with 5'-end 32P-labeled oligonucleotide TGEV8(+) for 16 h. The probed blot was washed and autoradiographed at -80°C for 24 h.

Quantitation of DI RNA synthesis by qRT-PCR
To determine the replication efficiency of DI RNA, the cDNA prepared as described above from head-to-tail-ligated RNA collected at 1 and 24 hpi of VP1 was used for real-time PCR amplification with TaqMan Universal PCR Master Mix (Applied Biosystems) using primers MHV3'UTR3(-) and BCV23-40(+). The real-time PCR amplification was performed according to the manufacturer's recommendations in a LightCycler 480 instrument (Roche Applied Science).

Results
Determination of the minimum length of poly(A) tail required to initiate poly(A) tail lengthening of coronavirus defective interfering (DI) RNA
In a previous study, we demonstrated that the length of the coronaviral poly(A) tail on both viral RNAs and DI RNA is regulated during infection; that is, the poly(A) tail length is increased in the early stage of infection and then decreased after the peak tail length in the later stage of infection [16,27]. To test whether the increase in coronaviral poly(A) tail length requires a minimum tail length in the initial viral RNA, as with the requirement of nearly 10 adenosine residues for the second phase of polyadenylation in eukaryotic mRNA, a series of bovine coronavirus (BCoV) DI RNAs with various poly(A) tail lengths were constructed and tested (Fig 1A). The 2.2-kb, helper virus-dependent BCoV DI RNA (S1A Fig, upper panel) is a naturally occurring DI RNA [36,46], and it has been extensively exploited for analyzing the cis-acting elements required for replication in coronaviruses [16,33,35,36,47-49]. To differentiate the origin of the poly(A) tail between the helper virus BCoV genome and BCoV DI RNA, the latter was engineered to carry the mouse hepatitis virus (MHV) 3' UTR (S1A Fig, lower panel) [16,31,32], with which an MHV-specific primer can be used for RT-PCR to determine the length of the DI RNA poly(A) tail [16,27]. It should be noted that both BCoV and MHV-A59 belong to the genus betacoronavirus and that the replication efficiency of this MHV 3' UTR-containing BCoV DI RNA is similar to that of wild-type BCoV DI RNA [32]. After transfection of DI RNA constructs into BCoV-infected HRT-18 cells, the virus within the transfected cells is referred to as virus passage 0 (VP0) (S1B Fig , whereby the poly(A) tail length was also shorter (21 nts) than the initial length at 48 hpt (VP0) but increased to 31 nts at 8 hpi of VP1 and then gradually decreased to 24 nts and 19 nts at 24 and 48 hpi of VP1, respectively.
Because it has been shown that the poly(A) tail length of BCoV DI RNA within infected cells at 48 hpt (VP0) is similar to that of packaged BCoV DI RNA in inoculum collected at the same time point [16], the poly(A) tail length of DI RNA in infected cells may represent that in inoculum at 48 hpt (VP0). On this basis, and given that (i) the coronaviral poly(A) tail length of DI RNA W-15A increased from 8 nts at 48 hpt (VP0) to 10 nts at 8 hpi of VP1 and (ii) the tail length of W-0A and W-5A remained the same during VP0 and VP1, we conclude that the minimum poly(A) tail length required to initiate tail lengthening of coronavirus DI RNA may lie between 5 and 9 nts during the natural infection of VP1, regardless of the length of the input DI RNA transcript. Moreover, the level of lengthening was found to be correlated with the initial length of the poly(A) tail; that is, DI RNA with a longer poly(A) tail (for example, W-25A) showed greater lengthening than that with a shorter poly(A) tail (for example, W-15A), as evidenced by the comparison of poly(A) tail lengths at 48 hpt of VP0 and 8 hpi of VP1 for these DI RNA constructs during infection (Fig 1B).

Effect of hexamer AGUAAA and poly(A) tail length on the efficiency of coronaviral polyadenylation
Interactions between the consensus polyadenylation signal (PAS) hexamer AAUAAA or its variant (AGUAAA, AUUAAA or UAUAAA), located 10-30 nts upstream of the poly(A) cleavage site [5,10], and related proteins are integral aspects of eukaryotic mRNA polyadenylation [6,50].
Although the experiments above (Fig 1) identified the minimum requirement to initiate lengthening of the coronaviral poly(A) tail during infection, little is known about the detailed mechanism of how poly(A) tails are synthesized in coronaviruses. As with most eukaryotic mRNAs, both the coronavirus genome and subgenomic mRNAs are 3' polyadenylated. Moreover, the PAS hexamer AGUAAA is also found in the 3' UTR of the genome and subgenomic mRNAs in BCoV and MHV-A59 between 37 and 42 nts upstream of the poly(A) site.
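The positional description above can be made concrete with a small sketch that scans a 3' UTR for the PAS hexamer and its variants and reports where each hit lies relative to the poly(A) site. It assumes the input sequence ends at the last nucleotide before the poly(A) tail and that this nucleotide is counted as position 1 upstream. The function name `find_pas` and the toy sequence are hypothetical and serve only to illustrate the coordinate convention.

```python
# Consensus PAS hexamer and the variants named in the text.
PAS_HEXAMERS = ("AAUAAA", "AGUAAA", "AUUAAA", "UAUAAA")


def find_pas(utr_3prime: str):
    """Return (hexamer, nearest_nt, farthest_nt) for each PAS hit.

    utr_3prime is the 3' UTR without the poly(A) tail. Distances are in
    nts upstream of the poly(A) site, with the final UTR nucleotide
    counted as position 1 (an assumed convention).
    """
    seq = utr_3prime.upper().replace("T", "U")
    n = len(seq)
    hits = []
    for i in range(n - 5):
        hexamer = seq[i:i + 6]
        if hexamer in PAS_HEXAMERS:
            # Index i is the 5'-most nt of the hexamer, i + 5 the 3'-most.
            hits.append((hexamer, n - i - 5, n - i))
    return hits


# Toy sequence built so that AGUAAA occupies positions 37-42 upstream.
toy_utr = "C" * 20 + "AGUAAA" + "C" * 36
print(find_pas(toy_utr))  # [('AGUAAA', 37, 42)]
```

With this convention, a hexamer reported as (37, 42) corresponds to the placement described for BCoV and MHV-A59, which lies outside the 10-30 nt window typical of eukaryotic mRNAs.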
To determine whether the hexamer AGUAAA, as with the eukaryotic PAS, serves as a cis-element involved in coronaviral polyadenylation, the hexamer in BCoV DI RNA W-25A was mutated from AGUAAA to UCAUUU; the resulting DI RNA was designated R-25A (Fig 2A). RNA was collected at 24 hpi of VP1, and the RT-PCR product was detected and subjected to sequencing analysis. As shown in Fig 2B, the lengths of the W-25A and R-25A poly(A) tails were 24 and 22 nts, respectively, suggesting that the hexamer AGUAAA had only a minor effect on coronavirus polyadenylation when the tail length of the input R-25A was 25 nts. However, although RNA stability was similar (Fig 2C), the replication efficiency of R-25A was impaired in comparison with that of W-25A, as determined by Northern blot analysis (36% vs 100%) (Fig 2D), suggesting that replication efficiency may not be a major factor determining the poly(A) tail length. In addition, to exclude the possibility that the detected poly(A) tail derives from recombination between the DI RNA and the BCoV genome, the primer MHV3'UTR2(+), which anneals to the 3' UTR of DI RNA, and the primer BM3(-), which anneals to the BCoV M protein gene, were used for RT-PCR to identify the potential recombinant [31,32,51]. However, no RT-PCR product was observed (Fig 2E, lanes 2-3), suggesting that no DI RNA-BCoV genome recombinant is synthesized during infection. Furthermore, because the last 21 nts of the 3' UTR of the DI RNA and of the BCoV genome of the helper virus are identical [52], it is also possible that the synthesized poly(A) tail originates from the genome of the helper virus BCoV via homologous recombination in this region during either negative- or positive-strand RNA synthesis. To test this possibility, the nt A at position 2 upstream of the poly(A) tail in DI RNAs R-25A and W-25A was mutated to C to create R(C)-25A and W(C)-25A, as shown in Fig 2F, upper panel.
As shown in Fig 2F, lower panel, the mutated nt C was still maintained at 24 hpi of VP1, suggesting that there is no homologous recombination between helper virus and DI RNA in this region and thus that the poly(A) tail detected on DI RNAs R-25A and W-25A may not have been acquired through a potential DI RNA-BCoV genome recombination. It is noteworthy that the UCAUUU sequence in DI RNA mutant R-25A did not revert to the wild-type AGUAAA at 24 hpi of VP1 (data not shown). Accordingly, the RNA stability (Fig 2C), replication efficiency (Fig 2D) and recombination between DI RNA and BCoV Since the hexamer AGUAAA may not be required for coronaviral polyadenylation when the poly(A) tail on input DI RNA is 25 nts long, we hypothesized that the AGUAAA hexamer may be critical when the tail length is short. That is, both the hexamer AGUAAA and the poly(A) tail may contribute to coronaviral polyadenylation through concerted action when the tail on input DI RNA is at a certain length shorter than 25 nts. This hypothesis is based on the results that (i) a poly(A) tail was not synthesized for W-0A (Fig 1B), even though this DI RNA construct contains the hexamer AGUAAA, and (ii) the level of poly(A) tail synthesis for hexamer AGUAAA-deficient R-25A was similar to that for hexamer AGUAAA-containing W-25A (Fig 2B). Therefore, to test our hypothesis and further elucidate the role of the AGUAAA hexamer in coronaviral polyadenylation, we created a series of DI RNA constructs in which the hexamer was substituted with UCAUUU and with various poly(A) tail lengths (Fig 3A, left panel) or the hexamer was intact but with various poly(A) tail lengths (Fig 3A, right panel). According to a previous study, W-65A (with 65 nts of poly(A) tail), which is structurally the same as W-25A (with 25 nts of poly(A) tail) except for the poly(A) tail length, is barely detected at 1 and 2 hpi of VP1 but is steadily identified in the later stage (e.g.
24 hpi) of infection using head-to-tail ligation and RT-PCR [16] under the same amplification conditions, suggesting that the DI RNA detected at the later stage of infection is newly synthesized; therefore, to ensure that the detected DI RNA is not the input DI RNA, which may be carried over in the supernatant of VP0, but the replicating DI RNA, total cellular RNA was collected at 24 hpi of VP1. As shown in Fig 3B (for uncropped gel images, see S4 Fig), RT-PCR products were observed for W-5A- or W-8A-transfected BCoV-infected cells at the same time point with poly(A) tail lengths of 5 and 8 nts, respectively (Fig 3C). Although RT-PCR products were observed for AGUAAA-deficient R-12A, no clear poly(A) tail was identified; instead, sequencing analysis revealed a mixed population at the 3'-terminal end. Nonetheless, an RT-PCR product was detected for W-12A, and after sequencing, the poly(A) tail length of W-12A was found to be 9 nts (Fig 3C). Interestingly, an RT-PCR product was also observed for AGUAAA-deficient R-15A, but subsequent sequencing revealed a poly(A) tail length of 3 nts (Fig 3C). On the other hand, the length of the W-15A poly(A) tail was determined to be 13 nts (Fig 3C). For R-18A, R-20A and R-25A, the poly(A) tail lengths were 10, 18, and 22 nts, respectively, whereas those for W-18A, W-20A and W-25A were 17, 19 and 24 nts, respectively (Fig 3C). Based on a comparison of synthesized poly(A) tail lengths between R-5A and W-5A, R-8A and W-8A, R-12A and W-12A, R-15A and W-15A, and R-18A and W-18A (Fig 3C), the poly(A) tail in AGUAAA-deficient DI RNA is shorter than that in AGUAAA-containing DI RNA. Accordingly, it was concluded that when the poly(A) tail length for the input DI RNA transcript is 18 nts or less, the hexamer AGUAAA is required for coronaviral polyadenylation.
However, according to the results for R-20A, W-20A, R-25A and W-25A (Fig 3C), once the poly(A) tail length for the input DI RNA transcript reached 20 nts, the synthesized poly(A) tail length for the AGUAAA-deficient DI RNA was similar to that of AGUAAA-containing DI RNA. Thus, it was concluded that the hexamer AGUAAA is not required for polyadenylation when the tail length on input DI RNA transcript is 20 nts (for example, R-20A) or more (for example, R-25A). Because the poly(A) tail length of DI RNA varied after transfection (Fig 1B) and the DI RNA poly(A) tail length at 48 hpt (VP0) in infected cells was similar to that of packaged DI RNA in inoculum [16], we also applied RT-PCR and sequencing to identify the poly(A) tail length of R-20A at 48 hpt (VP0). The length of the poly(A) tail for R-20A was determined to be 18 nts at 48 hpt (VP0) (data not shown), and therefore it was also concluded that when the initial poly(A) tail length for the coronavirus genome is 18 nts or more, coronaviral polyadenylation during natural infection is independent of the hexamer AGUAAA. Taken together, the results suggest that (i) the poly(A) tail length plays an important role in the efficiency of coronaviral polyadenylation and (ii) the hexamer AGUAAA is also involved in coronaviral polyadenylation and may function in concert with the poly(A) tail to accomplish the subsequent polyadenylation when the initial length of the poly(A) tail is shorter than 18 nts. This conclusion, therefore, supports our hypothesis.
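The comparison underlying this conclusion can be laid out as a short calculation. The sketch below uses only the tail lengths at 24 hpi of VP1 that are stated numerically in the text for matched W and R constructs (input tails of 15, 18, 20 and 25 nts); R-12A is omitted because only a mixed population was reported. The dictionary and function names are hypothetical.

```python
# Synthesized poly(A) tail length (nts) at 24 hpi of VP1, keyed by the
# tail length of the input transcript, as stated in the text.
w_tail = {15: 13, 18: 17, 20: 19, 25: 24}  # hexamer AGUAAA intact (W series)
r_tail = {15: 3, 18: 10, 20: 18, 25: 22}   # hexamer replaced by UCAUUU (R series)


def tail_deficit(with_hexamer: dict, without_hexamer: dict) -> dict:
    """Tail length lost when the hexamer is mutated, per input tail length."""
    return {
        n: with_hexamer[n] - without_hexamer[n]
        for n in sorted(with_hexamer)
        if n in without_hexamer
    }


print(tail_deficit(w_tail, r_tail))  # {15: 10, 18: 7, 20: 1, 25: 2}
```

The deficit falls from 10 and 7 nts at input tails of 15 and 18 nts to 1 and 2 nts at 20 and 25 nts, which is the pattern the text interprets as hexamer dependence for short input tails only.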
To further determine whether, besides the hexamer AGUAAA and poly(A) tail length, DI RNA stability, replication efficiency and recombination are also factors affecting the synthesis of the poly(A) tail on DI RNA, DI RNAs W-25A, R-5A, R-15A and R-20A with various lengths of synthesized poly(A) tail (24, 0, 3 and 18 nts, respectively) at 24 hpi of VP1 (Fig 3C) were selected and tested. As shown in Fig 3D, the stability of the selected DI RNA variants is almost the same, suggesting that stability may not be the main determinant affecting polyadenylation. Regarding replication efficiency, DI RNA was not detectable by Northern blot assay for R-15A and R-20A, suggesting that the replication efficiency of both constructs is low (Fig 3E); however, the poly(A) tail lengths of the two constructs differed (3 and 18 nts, respectively) (Fig 3C), also suggesting that replication efficiency may not play a major role in poly(A) tail synthesis (see Discussion for more details). In addition, the unidentified poly(A) tail of R-5A may be attributed to the undetected RT-PCR product due to poor replication efficiency. Regarding recombination, no RT-PCR product was observed (Fig 3F, lanes 2-5) using the primers specifically binding to DI RNA and the BCoV genome as described above (Fig 2E), suggesting that the detected poly(A) tail on the selected DI RNAs may not have been acquired from a potential DI RNA-BCoV genome recombinant. Furthermore, if the poly(A) tail in DI RNAs originated from the genome or subgenome of the helper virus BCoV, the length in DI RNAs would be expected to be approximately 40 nts or longer, because the poly(A) tail in the genome or subgenome of the helper virus BCoV at 8, 24 and 48 hpi is 68, ~50 and ~40 nts, respectively [16]. However, the lengths of the poly(A) tail in all DI RNAs used in the current study at these time points of infection are all shorter than 40 nts (the longest length is ~31 nts in W-25A at 8 hpi of VP1, Fig 1B). Therefore, consistent with the results shown in

Synthesis of a poly(A) tail from poly(A) tail-lacking DI RNA
To further test the role of the hexamer AGUAAA and poly(A) tail in coronaviral polyadenylation, 25 nts of the W-25A poly(A) tail was first replaced with the same length of a poly(U), poly(C) or poly(G) tail to create mutant W-25U, W-25C or W-25G, respectively (Fig 4A). Moreover, 25 nts of the W-25A poly(A) tail were also substituted with random sequences to generate mutants W-random and W-polyCC (Fig 4A). These mutants were then transfected into BCoV-infected cells to examine whether a poly(A) tail is synthesized and, if a tail is identified, where on the DI RNA it is added. As shown in Fig 4A, poly(A) tails with lengths of 18 and 21 nts were found on W-25U and W-25C. Moreover, the position where the tail was added was not at the 3' terminus of the poly(U) or poly(C) tail but at the 3' terminus of the 3' UTR. For W-25G, a poly(A) tail was not found until 48 hpi of VP1, with an 8-nt tail also added to the 3' terminus of the 3' UTR. For W-random and W-polyCC, RT-PCR products were detected but no poly(A) tail or clear sequence of the 3' UTR was identified during VP1 infection. Although the mechanism remains to be elucidated, it was surprising that the poly(A) tail can be synthesized from DI RNA without a poly(A) tail sequence, such as W-25U, W-25C and W-25G. Regardless, since the poly(A) tail can be synthesized from DI RNA constructs W-25U, W-25C and W-25G, which contain the hexamer AGUAAA, they may be suitable candidates for further determining the requirement of the hexamer AGUAAA in poly(A) tail synthesis. That is, if the hexamer AGUAAA is mutated, a lack of poly(A) tail synthesis for these hexamer AGUAAA-deficient constructs would reinforce the important role of the hexamer in coronaviral polyadenylation. To this end, the AGUAAA in W-25U, W-25C and W-25G was substituted with UCAUUU to create mutants R-25U, R-25C and R-25G. As shown in Fig 4B, the poly(A) tail was not found in R-25G because the RT-PCR product was not obtained.
In this case, it was speculated that poor replication efficiency may account for this result. However, when compared with W-25U and W-25C, a poly(A) tail was not synthesized for R-

Dissection of hexamer AGUAAA by mutagenesis to further determine its role in coronavirus polyadenylation
To further characterize the role of the hexamer AGUAAA in coronaviral polyadenylation, the hexamer was dissected by mutagenesis (Fig 5A), and the effect on poly(A) tail synthesis was evaluated. RT-PCR products were detected for W-15A and other W-15A-derived mutants at 24 hpi of VP1, as displayed in Fig 5B, left panel. Sequencing analysis revealed that 1 (M1-15A) or 2 (M2-15A) substitutions from the 3' end of the hexamer did not alter the efficiency of poly(A) tail synthesis, and the resulting length (13 nts) for these two mutants was the same as that for W-15A, which has an intact AGUAAA motif. However, decreased poly(A) tail synthesis efficiency was found for M3-15A (8 nts) and M4-15A (10 nts), with 3 and 4 substitutions, respectively, from the 3' end of the hexamer. A severe impact on poly(A) tail synthesis occurred for M5-15A, in which 5 nts were substituted from the 3' end of the hexamer, and the resulting poly(A) tail length was 3 nts, the same as that obtained for R-15A, in which the entire hexamer was substituted. Therefore, the length of the poly(A) tail generally decreases as the number of mutations within the hexamer increases. Given that these DI RNAs were able to replicate (S5B Fig) and still retained the mutated hexamer sequence (S5C Fig), these results further support our argument that the AGUAAA hexamer functions as a cis-acting element in coronaviral polyadenylation. Additionally, different levels of substitutions within the hexamer AGUAAA were also performed in W-25A (Fig 5A, right panel). RT-PCR products were detected (Fig 5B, right panel), and the overall poly(A) tail lengths for the mutants M1-25A, M4-25A and M5-25A were not altered, whereas the lengths for M2-25A, M3-25A and R-25A were slightly decreased (Fig 5C, right panel), suggesting that substantial alterations in the poly(A) tail length do not occur in the DI RNAs with 25 nts of poly(A) tail and mutated AGUAAA.
Taken together, these results further support our finding that the hexamer AGUAAA is an important cis-element in coronavirus polyadenylation.

The function of hexamer AGUAAA in polyadenylation is position dependent
Based on the data presented here, the hexamer AGUAAA is involved in coronaviral polyadenylation. To determine whether the function of the hexamer AGUAAA in polyadenylation is position dependent, the sequence in W-15A between 37 and 42 nts upstream of the poly(A) tail site, where the original hexamer AGUAAA is located, was replaced with UCAUUU, and the sequence at 49 to 54 nts upstream of the poly(A) tail site was substituted with hexamer AGUAAA to create mutant PAS-R-15A (Fig 6A, left panel). In addition, mutant PAS-PAS-15A was also constructed in which the sequence between 49 and 54 nts upstream of the poly(A) tail site was replaced with the hexamer AGUAAA, but the original hexamer AGUAAA between 37 and 42 nts upstream of the poly(A) tail site was retained (Fig 6A, left panel). Wild-type DI RNA W-15A, mutant R-15A, in which the original hexamer AGUAAA was replaced with UCAUUU, and mutant PAS-PAS-15A were then used as controls to evaluate the position dependency of the hexamer AGUAAA in polyadenylation. Moreover, to ensure that the altered polyadenylation efficiency indeed results from sequence changes in the aforementioned positions of DI RNA with 15 nts of poly(A) tail, we also created constructs PAS-R-25A and PAS-PAS-25A, with poly(A) tail lengths of 25 nts (Fig 6A, right panel). We predicted that the sequence changes would not alter the polyadenylation efficiency for DI RNA with 25 nts of poly(A) tail, according to the data shown in Fig 2. As shown in Fig 6B, left panel, RT-PCR products were observed for all constructs at 24 hpi of VP1, and the poly(A) tail lengths were determined to be 5, 15, 4 and 13 nts for PAS-R-15A, PAS-PAS-15A, R-15A and W-15A, respectively (Fig 6C, left panel). Note that the RT-PCR product for PAS-R-15A with a size of less than 100 bp was sequenced and determined to be primer-dimer.
These results suggest that (i) the sequence substitution between 49 and 54 nts with hexamer AGUAAA in PAS-PAS-15A (the original hexamer AGUAAA was retained) did not affect the efficiency of polyadenylation when compared with the poly(A) tail length of W-15A (15 vs 13 nts) and (ii) the efficiency of poly(A) tail synthesis was still low (5 nts) for PAS-R-15A (the sequence between 49 and 54 nts was replaced with hexamer AGUAAA and the original hexamer AGUAAA was mutated). Note that these DI RNAs were able to replicate (S6B Fig) and the mutated sequences were still retained (S6C Fig). Consequently, since the position change of hexamer AGUAAA in PAS-R-15A did not restore the efficiency of poly(A) tail synthesis, it is concluded that the function of the hexamer AGUAAA in polyadenylation is position dependent. In addition, by RT-PCR (Fig 6B, right panel) and sequencing, the poly(A) tail lengths for constructs PAS-R-25A, PAS-PAS-25A, R-25A and W-25A were found to be similar (Fig 6C, right panel), suggesting that for DI RNA with a poly(A) tail of 25 nts, substitution mutation between 49 and 54 nts or 37 and 42 nts upstream of the poly(A) site has only a minor or no effect on the efficiency of polyadenylation. Taken together, it is concluded that the function of the hexamer AGUAAA in coronaviral polyadenylation is position dependent.

The hexamer AGUAAA or its variants are found among coronaviruses
The consensus PAS hexamer AAUAAA, located 10-30 nts upstream of the poly(A) cleavage site, is one of the cis-acting elements responsible for eukaryotic mRNA polyadenylation [1,7,8]. Variants, including AGUAAA, AUUAAA and UAUAAA, have also been identified as important for poly(A) tail synthesis in certain eukaryotic mRNA populations [9][10][11]. In the current study, we demonstrated that the hexamer AGUAAA, located between 37 and 42 nts upstream of the poly(A) site in BCoV, is involved in coronaviral polyadenylation. Moreover, the AGUAAA motif or its variants are also found in other coronaviruses, as summarized in Table 1. Among these hexamers, AAUAAA, AGUAAA and AUUAAA, which function in eukaryotic mRNA polyadenylation, are found in betacoronavirus C, betacoronavirus A and deltacoronavirus, respectively. Other hexamer variants, which differ from hexamer AAUAAA, AGUAAA or AUUAAA by one nucleotide, have also been identified in alphacoronavirus, betacoronavirus B, betacoronavirus D, and gammacoronavirus. Interestingly, although the positions of these hexamers in coronaviruses can be from 24 to 57 nts upstream of the poly(A) tail, the motifs are at similar positions within the same genus or lineage. However, it remains to be determined whether the hexamer has similar functions in coronaviruses other than BCoV.
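The statement that some coronavirus hexamers differ from AAUAAA, AGUAAA or AUUAAA by one nucleotide amounts to a Hamming-distance comparison, sketched below. The function names are hypothetical, and the example input UAUAAA is the eukaryotic variant named in the text rather than an entry taken from Table 1.

```python
# Hexamers stated in the text to function in eukaryotic mRNA
# polyadenylation and found in coronaviruses.
FUNCTIONAL_HEXAMERS = ("AAUAAA", "AGUAAA", "AUUAAA")


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_functional(hexamer: str):
    """Return (distance, reference) for the closest functional hexamer."""
    query = hexamer.upper().replace("T", "U")
    return min((hamming(query, ref), ref) for ref in FUNCTIONAL_HEXAMERS)


print(nearest_functional("UAUAAA"))  # (1, 'AAUAAA')
print(nearest_functional("AGUAAA"))  # (0, 'AGUAAA')
```

A distance of 0 marks one of the three hexamers themselves, and a distance of 1 marks a single-nucleotide variant of the kind described for alphacoronavirus, betacoronavirus B, betacoronavirus D and gammacoronavirus.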

Discussion
In this article, we provide genetic evidence of how coronavirus lengthens its poly(A) tail during the infection cycle and how the hexamer AGUAAA and poly(A) tail act in concert to accomplish coronaviral polyadenylation. The characterization of two polyadenylation-related elements, AGUAAA and the poly(A) tail, will contribute to a better understanding of the mechanism of coronaviral polyadenylation.

Factors involved in the regulation of coronaviral poly(A) tail length during coronavirus infection
In the previous study [16], it was found that the poly (

Contribution of hexamer AGUAAA to virus survival
In the present study, we demonstrated that both the poly(A) tail and the hexamer AGUAAA are involved in coronaviral polyadenylation and that the two elements appear able to compensate functionally for each other when one is missing or modified, as exemplified by DI RNA mutants R-25A and W-25U. Based on the present results, we propose that both the hexamer AGUAAA and the poly(A) tail contribute to virus survival under diverse conditions. For example, a poly(A) tail shorter than 20 nts was consistently found for coronaviral RNA collected from (i) BCoV-infected HRT-18 cells at 2 hpi, (ii) W-25A-transfected BCoV-infected HRT-18 cells at 72 hpi of VP1 and (iii) mouse brain at 5 days postinfection with MHV-A59 (S7 Fig). Moreover, in persistent infection, an MHV-A59 poly(A) tail of less than 20 nts was also frequently found (S7 Fig); therefore, when coronavirus from such a persistent infection infects fresh cells, the AGUAAA hexamer is also required in concert with the poly(A) tail for efficient polyadenylation. Thus, the hexamer AGUAAA along with the poly(A) tail may be required to restore efficient polyadenylation for subsequent translation and replication, contributing to virus survival under these conditions.

The importance of minor changes in poly(A) tail length for gene expression of coronavirus
Unlike most mammalian mRNAs, which carry a poly(A) tail of ~250 nts, positive-strand RNA viruses such as picornaviruses have natural poly(A) tails of heterogeneous length, ranging from about 10 to 120 nts [55][56][57]. Functional analysis reveals that a poly(A) tail of 12 nts on the positive-strand genome of poliovirus is the minimum length required to initiate negative-strand synthesis and that increasing the poly(A) tail length from 12 to 13 nts results in about a ten-fold increase in the efficiency of negative-strand synthesis [58], suggesting that minor changes in viral poly(A) tail length exert a significant effect on viral replication. Similar results have been obtained for Sindbis virus, in which increasing the poly(A) tail length from 10 to 15 nts leads to a nearly nine-fold increase in negative-strand RNA synthesis [59].

Evaluation of the effect of replication efficiency on polyadenylation of DI RNA
In the present study, we found that the replication efficiency of DI RNA mutants R-25A (36%) (Fig 2) and W-25C (<1%, S8 Fig) was much lower than that of W-25A (100%). However, the synthesized poly(A) tail lengths of these two mutants were similar to that of W-25A, suggesting that replication efficiency may not be a major factor affecting coronaviral polyadenylation. For DI RNA with a shorter poly(A) tail and an AGUAAA mutation, for example R-15A, the replication efficiency was similar to that of W-15A at 24 hpi of VP1, and RT-PCR products were detected for sequencing (S5A and S5B Fig); however, the synthesized poly(A) tail lengths differed (3 and 13 nts for R-15A and W-15A, respectively) (Fig 3C), also suggesting that replication efficiency may not be a major factor in poly(A) tail synthesis. The same reasoning may account for the results shown in Fig 3E. The replication efficiency of R-15A and R-20A was similar in that neither was detected by Northern blot assay (Fig 3E); however, both were detected by RT-PCR, and sequencing showed that their poly(A) tail lengths differed (3 and 18 nts, respectively) (Fig 3C), again suggesting that replication efficiency may not play a major role in poly(A) tail synthesis. For R-5A in Fig 3E, the failure to identify a poly(A) tail may be attributed to the absence of a detectable RT-PCR product owing to poor replication efficiency. The aforementioned arguments may also account for the results shown in Fig 4. For W-25U and W-25C, the replication efficiency of both DI RNAs was low compared with that of W-25A, because neither was detected by Northern blot assay (S8A Fig).
Subsequent analysis showed that the replication efficiency was similar among W-25U, W-25C, R-25U and R-25C as determined by qRT-PCR (S8B Fig); however, a poly(A) tail was synthesized from W-25U and W-25C but not from R-25U and R-25C (Fig 4B), suggesting that replication efficiency may not be the main determinant of poly(A) tail synthesis. For R-25G, as for R-5A, the failure to identify a poly(A) tail may be attributed to the absence of a detectable RT-PCR product. Taken together, we speculate that replication efficiency may not be a major factor affecting coronaviral polyadenylation and that replication and polyadenylation in coronaviruses are separate processes that nevertheless follow a similar theme during viral RNA synthesis.

Role of hexamer AGUAAA in coronavirus replication
It has been shown with reverse genetic approaches that deletion of nts 30-170 upstream of the poly(A) tail does not affect the growth of MHV in tissue culture [60,61]. It is therefore reasonable to speculate that the hexamer AGUAAA, located between 37 and 42 nts upstream of the poly(A) site, may not play a role in replication. However, the results of the current study suggest that the replication efficiency of DI RNA R-25A, which carries a mutated hexamer AGUAAA, is decreased in comparison with that of W-25A, which carries an intact AGUAAA (36% vs 100%). We speculate that the discrepancy may arise from differences in the system (reverse genetics with a full-length cDNA vs DI RNA) and in the mutagenesis (deletion of the entire region between nts 30-170 vs replacement of the hexamer within the undeleted region of nts 30-170) employed for the analysis. Nevertheless, this discrepancy does not affect the role of the hexamer AGUAAA in coronaviral polyadenylation concluded in this study, because polyadenylation appears not to be influenced by replication efficiency, as evidenced by the results of the DI RNA constructs discussed above.

Proposed mechanism for coronaviral polyadenylation
Based on the evidence shown in this study and in others, we propose a model for coronaviral polyadenylation (Fig 7). First, coronavirus polymerase utilizes positive-strand viral RNA as a template to synthesize negative-strand viral RNA in which the length of the poly(U) tract is similar to that of the poly(A) tail at the same time point after infection [16]. Subsequently, the negative-strand viral RNA serves as a template for synthesizing the positive-strand viral RNA. During positive-strand viral RNA synthesis, once the nascent hexamer AGUAAA is copied, cytoplasmic polyadenylation-associated factors such as CPSF and other accessory factors may interact with the hexamer on the positive strand. This hexamer-protein complex may interact with (an)other complex(es) formed by interactions between proteins and the poly(U) tract on the negative strand to generate a stable RNA-protein complex that directs the viral RdRp to synthesize the poly(A) tail. Alternatively, the RNA-protein complex formed may recruit cytoplasmic poly(A) polymerase instead of viral RdRp for poly(A) tail synthesis [62]. This model emphasizes the important role of the AGUAAA hexamer on the nascent positive-strand viral RNA as well as the 5' terminal sequence (i.e., poly(U) tract) on the negative-strand viral RNA, with which polyadenylation-associated proteins interact.
Accordingly, on the premise that a stable complex must be assembled for efficient coronavirus polyadenylation, this model may explain (i) why a poly(A) tail was synthesized from poly(A)-deficient DI RNA constructs W-25U, W-25C and W-25G, but not from poly(A)-deficient DI RNA W-0A, even though they all contain a hexamer AGUAAA; and (ii) why a poly(A) tail was found for R-15A and R-18A, but not for R-25U, R-25C and R-25G, despite the fact that they all lack the hexamer AGUAAA. In the cases of W-25U, W-25C and W-25G, which contain the hexamer AGUAAA but lack a poly(A) tail, we speculate that the key step in coronaviral polyadenylation is the binding of proteins (e.g., PABP [63,64]), albeit with different affinities, to the poly(A), poly(G) or poly(C) tract at the 5' end of the W-25U, W-25C or W-25G negative strand, respectively, followed by their interaction with proteins bound to the AGUAAA motif on the nascent positive-strand RNA to form a protein complex. In contrast, in the case of W-0A, this protein complex is not formed because there is no such RNA element at the 5' end of negative-strand W-0A for protein binding, and thus no polyadenylation occurs. Regarding constructs R-15A, R-18A, R-25U, R-25C and R-25G, in which the hexamer AGUAAA was mutated, we reason that if the binding of polyadenylation-related proteins to RNA elements at the 5' end of negative-strand viral RNA is not sufficient to form a stable RNA-protein complex, assistance from interactions between the AGUAAA motif and proteins may be required for polyadenylation. Accordingly, although proteins were able to bind to poly(A), poly(G) or poly(C) on the negative strand of R-25U, R-25C or R-25G, respectively, polyadenylation was unable to occur without the help of the hexamer AGUAAA to form a stable RNA-protein complex.
Furthermore, on the same premise that a stable complex must be assembled for efficient coronavirus polyadenylation, this model may be extended to explain why different levels of hexamer AGUAAA mutation (Fig 5) and the alteration of the hexamer AGUAAA position (Fig 6) decreased the efficiency of polyadenylation on DI RNA.
In conclusion, we have determined, for the first time, the viral RNA elements involved in coronaviral polyadenylation. Future work may focus on identifying the cellular and viral proteins that participate in the synthesis of the coronaviral poly(A) tail, as well as on the detailed mechanism of coronaviral polyadenylation and its regulation.