Evolutionary dynamics and transmission patterns of Newcastle disease virus in China through Bayesian phylogeographical analysis

The Chinese poultry industry has experienced outbreaks of Newcastle disease (ND) dating back to the 1920s. However, the epidemic has exhibited a downtrend in recent years. In this study, both observational and genetic data [fusion (F) and haemagglutinin-neuraminidase genes (HN)] were analyzed, and phylogeographic analysis based on prevalent genotypes of Newcastle disease virus (NDV) was conducted for better understanding of the evolution and spatiotemporal dynamics of ND in China. In line with the observed trend of epidemic outbreaks, the effective population size of F and HN genes of circulating NDV is no longer growing since 2000, which is supported by 95% highest posterior diversity (HPD) intervals. Phylogeographic analysis indicated that the two eastern coastal provinces, Shandong and Jiangsu were the most relevant hubs for NDV migration, and the geographical regions with active NDV diffusion seemed to be constrained to southern and eastern China. The live poultry trade may play an important role in viral spread. Interestingly, no migration links from wild birds to poultry received Bayes factor support (BF > 3), while the migration links from poultry to wild birds accounted for 64% in all effective migrations. This may indicate that the sporadic cases of ND in wild bird likely spillover events from poultry. These findings contribute to predictive models of NDV transmission, and potentially help in the prevention of future outbreaks.


Introduction
Newcastle disease (ND) is one of the most contagious diseases of poultry. The causative agent of ND is known as Newcastle disease virus (NDV), which is a member of the family Paramyxoviridae in the genus Avulavirus [1,2]. It was first reported both in Java, Indonesia and Newcastle-upon-Tyne, England in 1926. Since then, four epidemic waves have occurred worldwide through the 1990s [3]. In China, the time of the first probable ND outbreak was nearly synchronous with the initial global epidemic in the 1920s. It was not until 1946 that the etiology of the outbreak in China was identified. This delay resulted in enormous losses to the poultry industry. Although ND poses a threat to the Chinese chicken industry, the number of outbreaks, cases and deaths since 2005-2015 have been decreasing year by year due to a strict vaccination program [4]. Most cases that occurred were mild and sporadic in Chinese vaccinated chicken flocks, which may be the result of immune failure [5]. Characteristic of "atypical ND", presents as a prolonged disease duration with no typical clinical and pathological manifestation [6]. Similar to other single-stranded-RNA respiratory viruses, multiple genotypes of NDV can co-circulate and cause outbreaks, and the "mild ND" under long-term immune pressure may also provide the conditions for the evolution of the virus. In general, the capacity of some viruses to adapt to hosts and environments is highly dependent on their ability to generate de novo diversity in a short period of time [7], and prevalent genotypes of viruses tend to have a higher evolutionary rate under more selection pressure [8]. Low evolutionary rates of fusion (F) gene exhibited in genotypes II and IX of virulent NDV (7.05 × 10 −5 and 2.05 × 10 −5 per year, respectively) make that there is a high genetic similarity to virulent isolates from the 1940s [9], while the evolutionary rate and diversity of the predominant NDV genotypes VI and VII in China remain unknown. According to Fan (2017), nucleocapsid protein is observed with an unexpected rapid evolutionary rate, 1.059 × 10 −2 per year (95% HPD: 4.187 × 10 −3~1 .74 × 10 −2 ) rather than surface proteins (F and HN) of NDV [10]. Wrong estimation of the evolutionary rate of viruses may significantly affect the prevention and control of viral diseases [11,12].
Similar with the avian influenza virus, wild birds are also considered to play an important role in the spread of ND by the high nucleotide homology of viruses between wild birds and poultry [13][14][15]. However, the directionality of viral transmission between wild birds and poultry remains unknown.
The present study aimed to estimate the evolutionary rate and diversity of surface protein of the predominant NDV genotypes in China, and explore the evolutionary dynamics and transmission patterns in multiple hosts of NDV in China using phylodynamics analysis.

Epidemiologic data
Clinical case data of NDV were obtained from the Official Veterinary Bulletin, which is made available by the Ministry of Agriculture and Rural Affairs of the People's Republic of China [16]. We collected the data (S1 Table), including the total number of outbreaks, number of cases and deaths, province and animal species in 2006-2019 by month and compared the distribution of NDV outbreaks in China.
The Coalescent Bayesian skyline plot (BSP) was used to infer the past population dynamics [29]. The uniform sampling strategies were used to select datasets with a maximum of 20 sequences per year [30]. To avoid the effect of left censoring [31], the BSPs were truncated at the time of the last coalescent event. Package Tracer was used to plot BSP and lineagesthrough-time (LTT) plots.

Bayesian phylogeography analysis
An asymmetric discrete trait phylogeography model was specified to explore the spatial diffusion patterns of NDV. Both location and hosts were imported into the model to infer a spreading network with Bayesian stochastic search variable selection (BSSVS) using BEAST 1.10.4 [32]. SpreaD3 v0.9.6 was used to calculate Bayes factor support for each transmission path between discrete location states and hosts [33]. The settings used here can be found in the SpreaD3 tutorials [34]. Only migration links with Bayes factor support of at least 3 were considered. Also, the number of expected location-state and host-state transitions (Markov jump counts) along the branches of the phylogeny using the asymmetric migration model were estimated [35]. Total number of state counts for migration into and out of each region and host were also plotted.
To uncover potential predictors of viral spread, we tested the association between the viral dispersal and predictors (including environmental predictors, poultry farming and live poultry trade predictors) among provinces using generalized linear model (GLM). GLM analyses were run in BEAST v1.10 using prior specifications recommended above on the set of trees obtained by Bayesian phylogenetic analysis [36,37]. The province-level matrix data of live poultry transportation were referred to a recent research [38]. Province-level poultry farming data (including domestic broiler/layer population of each large, medium and small scale chicken farms and annual output of poultry) were obtained from statistical yearbooks of China, and annual relative humidity and temperature data were obtained from China Meteorological Administration (S3 Table).

Epidemiology
From 2006 to 2019, a total of 4,789 ND epidemics were reported in China, covering 26 provinces, municipalities and autonomous regions, and the Chinese epidemics of ND primarily occurred in the south and southwest of China (S1 Fig). The number of NDV outbreaks showed a downtrend each year, as did the number of cases and deaths over time (Fig 1).

Sequence dataset compilation
The F (N = 876) and HN (N = 387) gene sequences were downloaded from GenBank, and only sequences subtyped as VI and VII-type were retained (F gene: N = 753; HN gene: N = 177). The duplicate, problematic sequences and short (nucleotide length < 1,600 base pair) sequences were removed, leaving 444 taxa of VII-F gene, 72 taxa of VI-F gene, 132 taxa of VII-HN gene and 18 taxa of VI-HN gene. And the VI-HN gene sequences were not used for the subsequent study due to its small number.
NDV infection was reported in 19 host species so far. Of these, 6.25% of the VI and VII genotype viruses were reported in wild species and more in domestic poultry (93.56%). Among 25 discrete regions, most NDVs were isolated in Shandong (25.57%) and Jiangsu (24.81%) provinces, followed by Heilongjiang (7.95%) and Guangdong (7.58%). To mitigate the potential impact of sampling biases in following phylodynamic reconstructions, three randomly down-sampling were used to select datasets with a maximum of 15 taxa per location. After down-sampling randomly, three final sets (N = 177, 171 and 178) of VII-F gene were used in the following analysis. As there was small number of sequences of VI-F gene and VII-HN gene, all of the sequences were retained, and the meta-data was listed in S4 Table.

Phylogenetic and population dynamic analysis
A examine for molecular clock signal revealed that there was sufficient accumulation of divergence over the sampling time span to estimate evolutionary rates (S2 Fig). The evolutionary rates and past population dynamics of NDV were inferred using a Bayesian coalescent approach. The mean evolutionary rates of the VI-F, VII-F and VII-HN genes were estimated at 8.07 × 10 −4 subs/site/year (95% HPD: 5.06 × 10 −4~1 .09 × 10 −3 ), 1.03 × 10 −3 subs/site/year (95% HPD: 8.54 × 10 − 4~1 .19 × 10 −3 ) and 8.78 × 10 −4 subs/site/year (95% HPD: 7.11 × 10 −41 .05 × 10 −3 ), respectively. For the effective population size of three subsets (VI-F, VII-F and VII-HN), the LTT graphs (S3 Fig) showed that there were no new lineages since 2013. Therefore, we assumed that there was no change imputed in the effective population size from 2013 onwards. Effective population size in BSP plots of VII-F and VII-HN genes showed that an increasing trend was observed from 1995 to 2000, and the trend from 2000 to 2013 kept relatively constant (Fig 2). Compared with VII-genotype, the effective population size of VI-F gene was relatively stable since the 1970s supported by a 95% HPD interval (Fig 2).
The MCC tree showed (S4 Fig

Phylogeographic analysis
Among 25 provinces, municipalities, and autonomous territories of China, a total of 285 migration links of well supported (Bayes factor support, BF > 3) were identified for VI-F, VII-F and VII-HN genes (S5 Table). Herein, the total number of 156 migration links were the sum of VI-F, VII-F (subsample one) and VII-HN BSSVS outputs (Fig 3). The eastern seaboard of China, Shandong (29.50%; N = 46/156) and Jiangsu (21.20%; N = 33/156) provinces were the most frequently implicated source and recipient location, followed by southern seaboard of China, Guangdong (16.02%; N = 25/156) and Guangxi (12.80%; N = 20/156). The results showed that the eastern seaboard of Shandong and Jiangsu provinces might have played a key role in seeding the NDV epidemics. This is further supported by the number of observed state changes in Markov jump count analysis with migration into and out of Shandong and Jiangsu provinces, which was higher than any other region (S5 Fig). Furthermore, the visual migration maps (Fig 3) indicated that the eastern and southern regions of China seemed to become the hot spots of NDV diffusion.
The data sets of VI-F, VII-F and VII-HN genes were used to infer bird migration history and the Bayes factor support for each migration path between hosts were estimated using BSSVS. Between the 19 hosts (poultry: 5, wild birds: 14) identified by F and HN genes, 44 routes (counting strategy is the same as above) of statistically supported (BF > 3) were identified. Interestingly, among all supported well migration paths (S6 Table), the migration directions were spread from poultry to wild birds accounted for 58.10% (N = 25/44), between poultry accounted for 30.30% (N = 13/44), and between wild birds accounted for 11.60% (N = 5/44). However, no migration links of wild birds to poultry were observed. This observation indicates that the sporadic cases of ND in wild birds are likely spillover from poultry. Besides, the number of expected host-state migrations was also estimated in this study. Pigeons may play an important role in the transmission of NDV genotype VI, with the largest into and Chicken is the biggest output source of VII-NDVs, which spread to other poultry/wild birds, such as duck and goose. These results demonstrated that NDV appears to mainly spread from poultry to other poultry and wild birds.
The results GLM analysis inferred from the data sets of VI-F, VII-F and VII-HN genes showed that live poultry trade network is positively associated with viral spread (Fig 4). In

PLOS ONE
Epidemiology and phylogeographical analysis of Newcastle Disease in China addition to the predictor of live poultry trade, other biological potential predictors and abiotic predictors had also been estimated, but they did not get noticeable support by any of the analyzed datasets (Fig 4).

Discussion
In the present study, large amounts of epidemiologic and genetic data, along with associated temporal and geographic information, were collected to investigate the emergence and dispersal of the predominant genotypes VI and VII NDV in China. The number of outbreaks, cases, and deaths of NDV in China has been decreasing in recent years. A strict immunization program in conjunction with a reduction in the numbers of backyard poultry are the major contributors to the decline of ND in domestic poultry. Phylogeographic analysis based on prevalent genotypes of NDV was conducted for the better understanding of the evolution and spatiotemporal dynamics of ND in China.
In the previous study, the BSP analysis of NP gene performed by Fan [10], suggesting the population size of NDV showed an increase in the 1990s. Similar results were obtained in our study of population dynamics history for NDV genotype VII, which are summarized in a BSP supported by a narrow 95% HPD (Fig 2). This increase may be closely related to the fourth panzootic of NDV worldwide [40][41][42]. Genotype VII virus evolved into epidemic lineages and the viruses spread to most parts in China during this time [43,44]. Unlike previous studies, the population dynamics observed after 2000s displayed a different behavior since a relatively stable trend in the effective population size was observed. Although the factors responsible for the observed population size are currently unknown, a compulsory vaccination program has been considered to be a major factor leading to the death of some lineages [39]. While, the extant diversity has not decreased over time, the circulating of mutants and/or new sub-genotypes of the virus may keep the stable effective population of NDVs in China [45,46]. More studies should be carried out to explore these changes.
Two eastern seaboard provinces, Shandong and Jiangsu, were identified as the most frequently implicated source and recipient location, which might play central roles for NDV spread in China. This finding is also supported by the number of observed state changes in Markov jump count analysis with migration into and out of Shandong and Jiangsu provinces, which was greater than any other region (S5 Fig). Visualizing migration links (BF > 3) revealed more detail about the migration patterns of the virus (Fig 3). All these migration maps reflected that the eastern and southern seaboard of China became the active regions in the transmission of NDV. The Bohai Economic Rim, Pearl River Delta and Yangtze River Delta regions, located in eastern and southern China, are the most densely populated and convenient transportation network in China, making them the economic powerhouses of the country [47]. Those regions are also the places with the highest density of poultry farming in China. However, GLM analysis showed that the viral dispersal may not directly associate with the density of poultry. The possible reason is that both broilers and layers are vaccinated with NDV vaccine nationwidely, which causing the low risk of NDV outbreak in most farms. Live poultry transport is presumed to be related to the viral spread. While, how the virus is transmitted in live poultry transporting remains unclear.
Long-distance migratory birds play a major role in the global spread of avian influenza viruses in previous studies [48 -50], while the role of wild bird migration in the spatial diffusion of NDV is unknown. In our study, several long migration paths, such as Beijing to Xinjiang, Jiangsu to Tibet, and Guangdong to Shandong were observed, which are associated with NDV migration. Surprisingly, all these routes are associated with spread by poultry (S4 Fig). By counting the migration links (BF > 3) of NDV spread between diverse hosts (S6 Table), no migration links supported by BF (BF > 3) and the direction was from wild birds to poultry. Furthermore, the number of expected host-state migrations (Markov jump counts) estimated in this study also demonstrated that poultry (i.e. Chicken and pigeon) were the main output source of NDV expansion and contributed most to the virus spread (S6 Fig). Therefore, we speculated that the NDV mainly migrated from poultry to poultry/wild birds. However, this result could be affected by the lack of NDV samples from wild birds [51].
A major limitation of any phylodynamic analysis is the dependence on sampling [52]. It is an inherent issue that sampled viruses are concentrated in high-risk areas, potentially resulting in sampling bias and inaccurate ancestral reconstruction processes [53]. Similar to previous studies, an attempt was made to reduce sampling biases by down-sampling with a maximum of 15 sequences per location [54,55]. However, owing to passive and active surveillance in wild bird populations appears to be very limited for NDV in China, the few available sequences were collected from wild birds. Therefore, we did not opt for a down-sampling to obtain even number of sequences by host category to infer the contribution of wild bird in the diffusion of NDV, which inevitably leads to potential biases and limitations of the results. Furthermore, we recommend that the active systemic surveillance of wild birds should be strengthened and valued to obtain more viral samples from wild birds.

Conclusion
Our study demonstrates that the number of outbreaks, cases and deaths of NDV appeared to be gradually decreasing since 2006, and a relative stable trend in the effective population size was observed in the predominant genotypes of NDVs in recent ten years. The regions of Shandong and Jiangsu were estimated to be the most relevant hubs for NDV migration, and the live poultry trade may play an important role in viral spread. Also, the potential of NDV migration appeared to be the highest between poultry fowl and spillover from poultry to wild birds. These findings extend our understanding of dispersal patterns of the predominant genotypes of NDV and cross-hosts transmission in China, which may improve awareness and future control capability and other important avian pathogens.