Equid herpesvirus 8: Complete genome sequence and association with abortion in mares

Equid herpesvirus 8 (EHV-8), formerly known as asinine herpesvirus 3, is an alphaherpesvirus that is closely related to equid herpesviruses 1 and 9 (EHV-1 and EHV-9). The pathogenesis of EHV-8 is relatively little studied and to date has only been associated with respiratory disease in donkeys in Australia and horses in China. A single EHV-8 genome sequence has been generated for strain Wh in China, but is apparently incomplete and contains frameshifts in two genes. In this study, the complete genome sequences of four EHV-8 strains isolated in Ireland between 2003 and 2015 were determined by Illumina sequencing. Two of these strains were isolated from cases of abortion in horses, and were misdiagnosed initially as EHV-1, and two were isolated from donkeys, one with neurological disease. The four genome sequences are very similar to each other, exhibiting greater than 98.4% nucleotide identity, and their phylogenetic clustering together demonstrated that genomic diversity is not dependent on the host. Comparative genomic analysis revealed 24 of the 76 predicted protein sequences are completely conserved among the Irish EHV-8 strains. Evolutionary comparisons indicate that EHV-8 is phylogenetically closer to EHV-9 than it is to EHV-1. In summary, the first complete genome sequences of EHV-8 isolates from two host species over a twelve year period are reported. The current study suggests that EHV-8 can cause abortion in horses. The potential threat of EHV-8 to the horse industry and the possibility that donkeys may act as reservoirs of infection warrant further investigation.


Viral isolation and identification by PCR
The tissues of two horse foetuses aborted in the third trimester were diagnosed by pathological examination as originating from EHV-associated abortions. Post mortem tissues were received by the Virology Unit for identification of the causal agent as EHV-1 or EHV-4. The abortions occurred in 2003 and 2010 in two different counties in Ireland (Co. Kerry and Co. Kildare). Viral DNA was initially detected in both samples by PCR using primers specific for EHV-1 ORF16 (encoding glycoprotein C) [12]. The viruses were isolated from tissue homogenates by a single passage in rabbit kidney (RK13) cell monolayers [13]. EHV-4 and equine arteritis virus were not detected in the tissue samples by PCR. PCR of a sequence of approximately 500 bp in EHV-1 ORF30 that contains a putative neurological marker [14] in the DNA polymerase catalytic subunit was carried out by using primers 30.2141.F (5'-TGGTTGTGTTTGACTTCGCT-3') and 30.2655.R (5'-GTAGATAACCCTGACGGAGTA-3'). The sequence of the PCR product showed a high level of similarity to EHV-1 for both isolates. However, BLAST analysis [15] carried out at this stage (prior to deposition into GenBank in 2012 of the genome sequence of EHV-8 strain Wh; accession no. JQ343919 [11]) showed that the closest relative was EHV-9 rather than EHV-1. To enable comparison with EHV-8, PCR products were generated and sequenced by using primers AHV3.gG.915.F (5'-CTTACGGAGACATCAACG-3') and AHV3.gG.1200.R (5'-GCCTGAGCCAAGATTCT-3'), which target a 286 bp region of ORF70 in the only EHV-8 sequence available at the time [9] (from an Australian strain; GenBank accession U24184, 1585 bp).
In addition to these two isolates from horses, which were designated EHV-8/IR/2003/19 and EHV-8/IR/2010/47, viruses were obtained from two nasal swabs taken from donkeys in Co. Cork, one of which was exhibiting respiratory signs and the other neurological signs. These samples were submitted for virus detection in 2010 and 2015, and AHV-3 was detected with the AHV-3 ORF70 primers. The viruses were then isolated by a single passage in RK13 cell monolayers. These isolates were designated EHV-8/IR/2010/16 and EHV-8/IR/2015/40, respectively.
The characterisation of archived isolates from diagnostic specimens was approved by the Board of Governors of the Irish Equine Centre.

ORF30 sequencing
The subsequent availability of the EHV-8 strain Wh genome sequence and of the ORF30 sequence from EHV-8 strain 804/87 [16] allowed sets of overlapping primers to be designed by using Primer3 [17], in order to facilitate the amplification and sequencing of the whole of ORF30. Viral DNA from the four viruses was isolated on first passage from the RK13 cell culture supernatant by using a QIAamp DNA mini kit (QIAGEN, Hilden, Germany). The amplified PCR products were purified by using a QIAquick gel extraction kit (QIAGEN) and sequenced commercially (QIAGEN Genomic Services, Hilden, Germany). The sequences were assembled by using BioEdit sequence alignment editor version 7.2.5 [18] and aligned by using ClustalW [19].

Genome sequencing
Each virus was passaged a further three times in RK13 cells. DNA was then extracted from infected cells by using a Hi-Pure PCR purification kit (Roche Diagnostics, Indianapolis, IN, USA) and quantified by using a Nanodrop 1000 spectrophotometer (Thermo Scientific, Wilmington, DE, USA).
Sequencing libraries were prepared from 5 μg DNA as described previously [20], and sequenced on a NextSeq instrument (Illumina, San Diego, CA, USA) to generate paired-end reads of 150 nucleotides (nt). The numbers of reads trimmed of adapters and low quality data are listed in Table 1. The reads for EHV-8/IR/2003/19 and EHV-8/IR/2010/47 were assembled de novo by using SPAdes version 3.5.0 [21], and the resulting contigs were ordered against the sequence of EHV-8 strain Wh in order to produce draft genome sequences. The draft genome  [22]. Alignments of the respective data sets against the four draft genome sequences were visualised by using Tablet version 1.14.11.07 [23], and the consensus sequences were corrected manually. No major variants were noted in any of the sequences. The sizes of 16 tandem reiterations in each genome were determined by generating eight PCR products and sequencing them using internal primers (Sigma-Aldrich). The genome termini were mapped by identifying sets of sequence reads that shared a common end in the regions corresponding to the EHV-1 and EHV-4 genome termini [24][25][26]. Locating the genome termini led to the construction of the complete genome sequences. The numbers of reads mapping to each sequence, the coverage depths and the GenBank accessions are listed in Table 1.
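The idea behind mapping a genome terminus from read data can be illustrated with a minimal sketch. It assumes that read alignments are already available as 0-based start coordinates on a draft sequence; the function name, the window bounds and the toy coordinates are hypothetical and are not taken from the pipeline used in this study.

```python
from collections import Counter

def most_common_read_end(read_starts, window_start, window_end):
    """Return (position, count) of the most frequent read start in a window.

    A genome terminus is expected to appear as a pile of reads that all
    begin at the same coordinate, whereas reads elsewhere start at
    scattered positions. Returns None if no read starts in the window.
    """
    in_window = [p for p in read_starts if window_start <= p < window_end]
    if not in_window:
        return None
    return Counter(in_window).most_common(1)[0]

# Toy data (hypothetical): scattered starts plus a pile-up at position 120.
starts = [101, 108, 120, 120, 120, 120, 133, 147, 120, 160]
candidate = most_common_read_end(starts, 100, 200)
```

In practice the window would be restricted to the region expected to contain a terminus, and a candidate position would only be accepted if it is supported by many more reads than neighbouring positions.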
Protein-coding regions were assigned on the basis of the EHV-1 model [24]. Comparative analysis of predicted amino acid sequences was carried out by using ClustalW implemented in BioEdit. The genomes of EHV-1 strain V592 (N752 strain, GenBank accession AY464052), EHV-1 hypervirulent strain AB4 (D752 strain, GenBank accession AY665713) and EHV-9 strain P19 (GenBank accession AP010838) were included in these comparisons. The corresponding amino acid sequences for each strain were generated as FASTA files, pairs of sequences were aligned for each protein, and pairwise identity values were calculated by using the BLOSUM62 similarity matrix. Pairwise identity range values were calculated for each alignment from the minimal and maximal identity values of the four EHV-8 strains in comparison with EHV-1 and EHV-9.
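The derivation of an identity range from a set of pairwise alignments can be sketched as follows. This is a simplified illustration that computes plain percent identity over aligned columns; it is not the BLOSUM62-based calculation performed in BioEdit. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Columns in which both sequences have a gap are ignored; a gap
    opposite a residue counts as a difference.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def identity_range(alignments):
    """Minimum and maximum identity across (aligned_a, aligned_b) pairs."""
    values = [percent_identity(a, b) for a, b in alignments]
    return min(values), max(values)

# Toy aligned fragments (hypothetical, not real EHV sequences).
pairs = [
    ("MKTLLVA-GG", "MKTLLVAAGG"),  # one gap column
    ("MKTLLVAAGG", "MKTLIVAAGG"),  # one substitution
    ("MKTLLVAAGG", "MKTLLVAAGG"),  # identical
]
low, high = identity_range(pairs)
```

How gap columns are treated changes the result, which is one reason why identity values from different tools are not always directly comparable.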

Phylogenetic analysis
An alignment containing the four Irish EHV-8 genome sequences and that of EHV-8 strain Wh was created by using Gap4 [27], and used as the basis for comparison. Multiple sequence alignments of the predicted amino acid sequences of individual EHV-8 genes and of related genomes (S1 Table) were created by using Clustal implemented in BioEdit. Phylogenetic trees were computed by implementing the maximum likelihood method [28] in MEGA7 version 7.0.14 [29]. The optimal model for each tree was chosen on the basis of the lowest Bayesian information criterion (BIC) score. Further investigations of phylogenetic relationships were carried out by aligning the concatenated amino acid sequences of 74 of the 76 viral genes, implementing the JTT matrix-based model [30] with 1000 bootstrap replicates. ORF1 and ORF2 were omitted from this analysis because the genome of EHV-1 strain T529 10/84 lacks the 1,611 bp region containing these coding regions [31]. EHV-8 strain Wh was also omitted because of the presence of frameshifted genes.
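Model choice by lowest BIC can be illustrated with a short sketch using the standard definition BIC = k ln(n) - 2 lnL, where k is the number of free parameters, n the number of sites and lnL the maximised log-likelihood. The log-likelihoods, parameter counts and site count below are invented for illustration and are not results from this study.

```python
import math

def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k * ln(n) - 2 * lnL (lower is better)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood

def best_model(candidates, n_sites):
    """Return the name of the candidate model with the lowest BIC.

    candidates maps a model name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_sites))

# Hypothetical fits of three models to one alignment of 1200 sites.
candidates = {
    "JTT": (-10500.0, 10),
    "JTT+G": (-10450.0, 11),
    "JTT+G+I": (-10449.5, 12),
}
chosen = best_model(candidates, n_sites=1200)
```

In this toy case the extra parameter of the third model improves the likelihood too little to offset the penalty term, so the intermediate model has the lowest score.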

Detection of EHV-8
Prior to the availability of the EHV-8 strain Wh genome sequence, analysis of the 287 bp portion of ORF70 of the two viruses from horse foetuses revealed differences of only 1-2 nt in comparison with the Australian EHV-8 strain. The two isolates from donkeys were identical to EHV-8/IR/2003/19 in ORF70. These results identified the four Irish viruses as being isolates of EHV-8. The levels of identity of the corresponding regions in the closest related genomes (those of EHV-1 and EHV-9) were approximately 95% (i.e. differences of up to 15 nt).

ORF30 sequences
Comparison of the nucleotide sequences of ORF30 for the four Irish strains showed that they share 99.67-99.97% identity. The sequences obtained through Sanger sequencing using overlapping PCR products were identical to those obtained during the whole genome/Illumina read assembly. Horse strain EHV-8/IR/2003/19 is more closely related to asinine strain EHV-8/IR/2010/16 than to horse strain EHV-8/IR/2010/47. The levels of identity to EHV-8 strains Wh and 804/87 are 99.5-99.7% and 99.5-99.6%, respectively. The four Irish strains are slightly more closely related to EHV-9 (approximately 95%) than they are to EHV-1 (approximately 93%).
At the level of predicted amino acid sequences, the four Irish strains are 99.51-99.92% identical to each other, 99.51-99.75% identical to EHV-8 strain Wh and 99.34-99.67% identical to EHV-8 strain 804/87. All EHV-8 strains have an aspartic acid (D) residue at position 752 in the ORF30 protein, which corresponds to the putative D752 hypervirulence marker (formerly identified as a marker of neuropathogenicity) in EHV-1. The level of identity to EHV-9 strain P19 is approximately 97%, and those to EHV-1 strains AB4 and V592 are slightly lower, at 96% and 95%, respectively. To illustrate the relationship between the viruses, a phylogenetic tree was constructed (Fig 1). The four new Irish strains, EHV-8 strain Wh and EHV-8 strain 804/87 form a separate clade that is more closely related to EHV-9 than EHV-1.

Genome sequences
The linear genome of EHV-8 consists of unique long and short regions (UL and US) flanked by inverted terminal and internal repeats (TRL/IRL and TRS/IRS) in an arrangement (TRL-UL-IRL-IRS-US-TRS) that is typical of other varicelloviruses, including EHV-1 and EHV-9. The genome has the same complement and arrangement of 80 open reading frames (ORFs) predicted to encode functional proteins as EHV-1 and EHV-9, of which 76 are unique and four are duplicated in TRS/IRS (Fig 2). The size of the four Irish EHV-8 genomes ranges from 149405 to 149800 bp (Table 1). The component sizes are: UL (113 kbp), US (12 kbp), TRL/IRL (32 bp) and IRS/TRS (12 kbp). The average nucleotide composition of the whole genome is 54.5-54.6% G+C, and that of IRS/TRS is substantially higher (64.61-64.75% G+C).
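The G+C percentages quoted above are simple base-composition values. A minimal sketch of the calculation is given here, assuming that ambiguous bases are excluded from the denominator; the function name and the toy sequence are hypothetical.

```python
def gc_percent(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / total if total else 0.0

# Toy sequence (hypothetical): 12 of 16 bases are G or C.
toy = "GGCCGCGATCGGCCAT"
toy_gc = gc_percent(toy)
```

Applying the same function to a whole genome and to the coordinates of a single repeat region would give the kind of genome-wide versus region-specific comparison reported here.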
The four Irish EHV-8 genome sequences are 98.42-99.03% identical to each other. EHV-8/IR/2003/19 and EHV-8/IR/2010/16, which were isolated from a horse and donkey, respectively, share the highest level of identity to each other (99.03%; 1461 substitutions and indels), and the two viruses isolated from aborted horse foetuses (EHV-8/IR/2003/19 and EHV-8/IR/2010/47) share the lowest (98.42%; 2377 substitutions and indels). Levels of identity to EHV-8 strain Wh are lower (approximately 97%) for all four strains, and those to EHV-9 and EHV-1 are lower still, at approximately 92% and 89%, respectively. Many of the differences between the Irish strains and EHV-8 strain Wh are located in ORF64 (in TRS/IRS) and ORF71 (in US).
Alignment of sequences near the termini of the four Irish EHV-8 genomes with those of other varicelloviruses demonstrated the conservation of sequence elements (AnTn and γ) [32] that are required for the maturation of unit-length genomes from replicated concatemers (Fig 3). Compared with the Irish EHV-8 genomes, the EHV-8 strain Wh genome lacks a region of 398-601 bp at the left terminus and has an extra region of 1304 bp at the right terminus that corresponds to the region at the right end of UL.

Tandem reiterations
Sixteen tandem reiterations consisting of multiple copies of short, G+C-rich sequences (plus a partial copy) were identified in the genomes of the sequenced EHV-8 strains. Five are duplicated in TRS/IRS and seven form parts of protein-coding regions (four in ORF24 and three in ORF71) (Table 2). The number of repeat units differs among isolates, and three reiterations exhibit minor sequence variations in the unit sequence. EHV-8 strain Wh lacks four of the reiterations, three being part of ORF71 in the Irish strains and the other being located in the region near the left genome terminus that is not represented in the strain Wh sequence. The number of repeat units in some reiterations is substantially lower in EHV-8 strain Wh than in the Irish strains.

Origins of DNA replication
Three potential origins of DNA replication were identified in the Irish EHV-8 genomes, each being partially palindromic and containing inverted copies of a diagnostic 9 bp sequence (CGTTCGCAC) separated by an A+T-rich sequence [33]. OriS is located in IRS/TRS between ORF64 and ORF65, and OriL is located in the centre of UL between ORF39 and ORF40 (Fig 2).
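A search for candidate origins of this kind can be sketched as a scan for the 9 bp sequence followed, within a short distance, by its reverse complement, with an A+T-rich spacer in between. This is an illustrative sketch only: the spacer length limit, the A+T threshold, the function names and the toy sequence are assumptions, and the scan considers only one orientation of the two copies.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_inverted_motif_pairs(genome, motif="CGTTCGCAC", max_spacer=60, min_at=0.7):
    """Find motif, spacer, reverse-complemented motif with an A+T-rich spacer.

    Returns (start, end, spacer) tuples using 0-based, end-exclusive
    coordinates. max_spacer and min_at are hypothetical thresholds.
    """
    genome = genome.upper()
    rc = reverse_complement(motif)
    hits = []
    start = genome.find(motif)
    while start != -1:
        spacer_start = start + len(motif)
        window_end = spacer_start + max_spacer + len(rc)
        rc_pos = genome.find(rc, spacer_start, window_end)
        if rc_pos != -1:
            spacer = genome[spacer_start:rc_pos]
            at_fraction = (
                (spacer.count("A") + spacer.count("T")) / len(spacer) if spacer else 0.0
            )
            if at_fraction >= min_at:
                hits.append((start, rc_pos + len(rc), spacer))
        start = genome.find(motif, start + 1)
    return hits

# Toy sequence (hypothetical): one motif pair around a 10 bp A+T spacer.
toy = "GGCC" + "CGTTCGCAC" + "ATATTATAAT" + "GTGCGAACG" + "CCGG"
hits = find_inverted_motif_pairs(toy)
```

Any hit from such a scan is only a candidate; its position relative to annotated ORFs would still need to be checked against the known locations of OriS and OriL in related viruses.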

Conservation of protein-coding regions among EHV-8 strains
The predicted amino acid sequences of 73 of the 76 predicted protein-coding regions are ≥99% identical among the Irish strains, the exceptions being ORF24, ORF64 and ORF71. The level of identity among the Irish strains in ORF24 (encoding the large tegument protein) is 95.61-99.03%. These lower values are largely due to differences in the lengths of tandem reiterations (3499-3602 residues). Likewise, the levels of identity between the Irish strains and EHV-8 strain Wh are lower (93.65-97.75%) because the latter has shorter tandem reiterations in ORF24 (3445 residues).
The level of identity among the Irish strains in ORF64 (encoding transcriptional regulator ICP4) is 97.08-99.93%. EHV-8/IR/2010/47 is the most divergent, having 33 differences from EHV-8/IR/2003/19, with the majority of these located at the N terminus. Isolate EHV-8/IR/2015/40 has 14 differences from EHV-8/IR/2003/19, with all but one of these being common to EHV-8 strain Wh. However, the EHV-8 strain Wh protein has a much lower overall level of identity (87.80-90.58%) to the Irish EHV-8 proteins. This is due to a frameshift in the EHV-8 strain Wh gene that results in a C-terminally truncated protein. This truncation leads to the absence of conserved amino acid sequence motifs that are required for gene regulation in other alphaherpesviruses [34].
ORF71 (encoding gp2, which is equivalent to herpes simplex virus type 1 (HSV-1) glycoprotein J) is the least conserved of the 76 predicted proteins, having a level of identity among the Irish strains of 92.42-97.60%. Most differences are due to three tandem reiterations that differ in length among strains. Strain Wh shares only 62.10-69.60% identity to the Irish strains, due in part to the absence of two of the reiterations that are common to the Irish strains and in part to a frameshift-induced truncation of the strain Wh protein to 544 residues, compared to 853-923 residues in the Irish strains.

Discussion
The complete genome sequences of four EHV-8 strains isolated in Ireland between 2003 and 2015 were determined. Two of the strains were isolated from the aborted foetuses of horses, which on the basis of gross post mortem findings and histopathological examination had been diagnosed as EHV-associated abortions. A further two strains were isolated from nasal swabs from donkeys, one exhibiting neurological signs and the other respiratory disease. The sequences were determined by Illumina sequencing and compared with that of EHV-8 strain Wh, which was determined by Sanger technology [11], and those of other equid alphaherpesviruses. EHV-8 shares a common genome structure with other equid alphaherpesviruses in the genus Varicellovirus, whereas the EHV-8 strain Wh sequence lacks the accurate identification of the genome termini [32] and is apparently incomplete at the left terminus. In addition, frameshifts are apparent in two genes (ORF64 and ORF71) of EHV-8 strain Wh, and at least the former is likely to represent sequencing errors. The sequences of the four Irish strains thus represent the first complete EHV-8 genome sequences to be reported. Phylogenetic analysis indicates that the Irish EHV-8 strains exhibit minimal diversity. They cluster together and are closely related to, but distinct from, EHV-9 and EHV-1. Sequence alignments revealed a low degree of heterogeneity among strains isolated from two host species over a 12-year period. EHV-8/IR/2003/19 and EHV-8/IR/2010/16 show the highest similarity of the Irish EHV-8 isolates at both the nucleotide and amino acid levels, despite having been isolated from a horse and donkey, respectively. In contrast, EHV-1 strains isolated from gazelle, onager and Grevy's zebra form a separate genetic group from EHV-1 strains isolated from horses [35].
Although donkeys have been reported to be the natural hosts for EHV-8 [9], our data demonstrate that the virus has the capability to cross host species and cause abortion in horses. Cases of natural infection by EHV-9, which is closely related to EHV-8, have been reported in multiple species including gazelles [3], zebras [36] and giraffes [37], and the virus has been isolated from an aborted Persian onager in a zoo [36]. Infections caused by cross-species transmission of herpesviruses can result in increased virulence and cause severe or fatal diseases. For example, EHV-9 has been associated with meningoencephalitis in a polar bear that necessitated euthanasia [38], and fatal acute encephalitis has been induced by experimental infection in goats [39] and pigs [40]. EHV-9-inoculated horses exhibited mild encephalitis but lacked vasculitis, which is one of the main pathological lesions of EHV-1 neurologic disease [41]. EHV-1 has been isolated from non-equid species including cattle [42], deer [43], antelope [44], alpacas and llamas [45]. Wohlsein et al. [46] identified D752 EHV-1 in Thomson's gazelles, black bear and guinea pigs in two different zoo epizootics that were associated with abortion, severe neurological signs and high mortality rates.
It is possible that EHV-8 is under-diagnosed in horses because of its close relationship to other equid alphaherpesviruses. Indeed, the EHV-8 strains isolated from two aborted foetuses of horses in this study were identified incorrectly as EHV-1 during initial testing, due to cross-reactivity of the PCR assay. Viral causes of abortion in horses have only been attributed to EHV-1, and less frequently EHV-4 and equine viral arteritis [47]. In future, EHV-8 will need to be included in the differential diagnosis. The availability of EHV-8 genome sequences will facilitate the development of a specific PCR assay for EHV-8 in aborted foetuses and neonatal foal deaths, and will also enable further investigation into the putative role of EHV-8 in equid neurological disease. In this study, we report the first isolation of EHV-8 from a donkey with neurological disease. The number of outbreaks of neurological disease in horses attributed to the EHV-1 hypervirulent phenotype may be inflated, as may the increased prevalence of this phenotype in EHV-1 abortions [48], due to the inclusion of EHV-8 cases. The ORF30 variation (D752/N752) has been applied to the development of PCR tests to discriminate between EHV-1 hypervirulent and non-hypervirulent strains [49][50][51]. All sequenced EHV-8 strains have the D752 genotype, and the specificity of the EHV-1 allelic discrimination test has not been assessed for EHV-8 [49]. Given the high level of sequence identity to EHV-8 at the primer-binding sites used in this assay [49], with one mismatch in the forward primer and none in the reverse primer or the probe (G2254), there is a strong likelihood of misidentified detections. Furthermore, an assessment of the EHV-1 gC primers used initially in this study [12] showed that there were only two mismatches in the forward primer and one in the reverse primer.
The EHV-8 strains isolated from the two horse abortions originated in two distinct provinces approximately 7 years apart. The source of infection in either case was unknown. Donkeys and horses occasionally share pasture in Ireland, but neither mare had a history of contact with donkeys during the gestation period. Moreover, cross-species transmission of EHV-1 and EHV-9 has been reported in animals that do not have direct contact. It has been hypothesised that water can act as a source of herpesvirus infections, and EHV-1 has been shown to remain infectious in water for up to three weeks [52]. However, there is currently no evidence to support the survival of EHV-8 in water as a basis for cross-species transmission to horses. As an alternative, the possibility that EHV-8 circulates continuously in horses, even at a low level, merits investigation. One potentially useful corollary to the availability of the EHV-8 genome sequences would be the development of an EHV-8-specific peptide-based ELISA similar to that developed for the detection and differentiation of EHV-1- and EHV-9-specific antibodies [53]. This ELISA could be used to determine the seroprevalence of EHV-8 infections in donkeys and horses. Alternatively, the available sequence information makes it possible to develop PCR assays to investigate, by examination of existing or latent virus populations in horses, whether these viruses represent sporadic events in which the donkey virus has crossed into individual horses, or whether EHV-8 is circulating in some horse populations. This approach has been employed previously when trying to detect co-infections with neuropathogenic versus abortigenic strains [54] or in a wider range of equids [4].
The co-occurrence of EHV-1 and EHV-8 also needs to be monitored because EHVs have the potential to diversify rapidly by recombination. Recombinant EHV-1/EHV-9 infections that originated in asymptomatic zebras were reported to cause non-fatal and fatal encephalitis in polar bears [55] and abortion and neurological disease in Indian rhinoceroses [56].
In conclusion, the current study suggests that EHV-8 can cause abortion in horses. The potential threat of EHV-8 to the horse industry and the possibility that donkeys may act as reservoirs of infection warrant further investigation. The determination of complete genome sequences of EHV-8 strains will serve as a key reference for the development of specific assays for diagnosis and epidemiological research.