The Viruses of Wild Pigeon Droppings

Birds are frequent sources of emerging human infectious diseases. Viral particles were enriched from the feces of 51 wild urban pigeons (Columba livia) from Hong Kong and Hungary, and their nucleic acids were randomly amplified and then sequenced. We identified sequences from known and novel species from the viral families Circoviridae, Parvoviridae, Picornaviridae, Reoviridae, Adenoviridae, Astroviridae, and Caliciviridae (listed in decreasing number of reads), as well as plant and insect viruses likely originating from consumed food. The near full genome of a new species of a proposed parvovirus genus provisionally called Aviparvovirus contained an unusually long middle ORF showing weak similarity to an ORF of unknown function from a fowl adenovirus. Picornaviruses found in both Asia and Europe that are distantly related to the turkey megrivirus and contained a highly divergent 2A1 region were named mesiviruses. All eleven segments of a novel rotavirus subgroup related to a chicken rotavirus in group G were sequenced and phylogenetically analyzed. This study provides an initial assessment of the enteric virome in the droppings of pigeons, a feral urban species with frequent human contact.


Introduction
Many infectious diseases in humans are caused by pathogens originating from a wide variety of animals. More than 60% of emerging diseases are estimated to originate from wildlife [1][2][3][4]. Public awareness of zoonoses has recently increased because of their public health and economic impacts. Birds are recognized as frequent reservoirs for viruses that are of concern to humans, notably influenza A, which is also capable of infecting mammals, thereby facilitating genome segment reassortments and changes in tropism and transmission efficiency [5][6][7][8]. Sporadic human infections with the virulent H5N1 resulting from direct contact with infected poultry or wild birds have been reported in 15 countries, mainly in Asia [9][10][11][12][13], and H7N9 has recently emerged as a virus of concern. Avian influenza viruses were detected in 12% of oropharyngeal and 20% of cloacal swab specimens collected from urban pigeons in Slovakia [14]. H5N1 was found in a dead feral pigeon in Hong Kong [15] but is generally apathogenic in this host species, and the overall risk of H5N1 transmission from pigeons to humans or chickens appears low [16,17]. West Nile virus (WNV) and Saint Louis encephalitis (SLE) virus, two arboviruses in the Flavivirus genus transmitted by mosquito bites, are disseminated by wild birds [18][19][20][21][22][23]. WNV-specific antibody and viremia were found in 25.7% and 11% of rock pigeons, respectively, in the United States [24]. WNV was also isolated from pools of brains, kidneys, heart and spleen of feral pigeons and magpies [25].
Pigeons developed only low levels of WNV viremia, insufficient to infect mosquitoes [24,26]. Avian paramyxoviruses, including Newcastle disease virus, are common domestic and wild bird pathogens [27][28][29][30]. Paramyxovirus type-1 can be found in pigeons worldwide [31][32][33][34], but the clinical signs vary depending on the immunity of the host and the virulence of the specific isolates [35]. While human infection with Newcastle disease virus is rare, at least two outbreaks of conjunctivitis due to Newcastle disease virus have been reported in poultry workers [28,36,37]. Chicken anemia virus (CAV), until recently the only member of the Gyrovirus genus, is highly contagious and causes severe anemia, hemorrhage and depletion of lymphoid tissue in chickens [38][39][40]. Related gyroviruses were recently characterized in human feces, blood and on healthy human skin [41][42][43][44], indicating possible human tropism. Gyrovirus DNA was also detected in three blood samples of solid organ transplant patients and in one HIV-infected person [44] as well as in 0.85% of healthy French blood donations [45].
Pigeons are therefore natural reservoirs for pathogens that have caused emerging and re-emerging diseases in humans. In order to better understand the viruses shed by pigeons to which humans are frequently exposed, we genetically characterized the viral community in droppings from wild pigeons in Hong Kong and Hungary following an unbiased amplification method and deep sequencing.

Materials and Methods
Biological Samples and Viral Metagenomics
Fifty-one fecal specimens were collected from feral pigeons (Columba livia) in Hong Kong (N = 50) in August 2011 and in Pécs, Hungary (N = 1) in April 2011, then stored at −80°C. Droppings were collected without any physical contact with wild urban pigeons. The procedure therefore did not require animal committee ethical review. The field studies did not involve endangered or protected species. Fresh and well-separated droppings were sampled using sterile swabs, which were stored in two ml of viral transport media as described [46]. The fecal sample from Pécs, Hungary was collected from a nesting, 5-8-day-old, clinically healthy pigeon into Hanks' buffered saline solution (Gibco BRL). Prior to shipment on dry ice, samples were heat treated at 74°C for 50 minutes to inactivate viruses. The suspensions were vigorously vortexed and clarified by centrifugation at 15,000 × g for 10 minutes. 200 µl of the supernatant was filtered through a 0.45-µm filter (Millipore) to remove bacterium-sized particles. The filtrate was then treated with a mixture of DNases (Turbo DNase from Ambion, Baseline-ZERO from Epicentre, and Benzonase from Novagen) and RNase (Fermentas) to digest unprotected nucleic acids [47]. Viral nucleic acids protected from digestion within viral capsids and other small particles were then extracted using a QIAamp spin-column technique according to the manufacturer's instructions (Qiagen). For the single fecal specimen from Hungary, non-specific RNA and DNA amplification was then performed by random RT-PCR using a primer with randomized 3′ ends [47]. Amplicons were then pooled, 454 libraries were generated according to the manufacturer's protocol (GS FLX Titanium General Library Preparation Kit, Roche), and the libraries were pyrosequenced using the 454 Titanium FLX+ sequencer. For the 50 fecal specimens from Hong Kong, pools of nucleic acids from five fecal specimens were generated, resulting in ten separate pools.
A library was then constructed using the ScriptSeq™ v2 RNA-Seq Library Preparation Kit (Epicentre) and sequenced on the Illumina MiSeq platform with 250-base paired-end reads and a distinct molecular tag for each pool. Because fecal specimens were analyzed in pools of five, the viruses identified are reported per specimen pool rather than for individual specimens. The 454 and Illumina singleton reads and assembled contigs greater than 100 bp were compared to the GenBank protein databases using BLASTx. An E value of 10⁻⁵ was used as the cutoff value for significant hits. The sequence data from the MiSeq run are in the Short Read Archive under GenBank accession number SRX263026.
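The E value cutoff described above can be illustrated with a minimal sketch. It assumes BLASTx results were exported in the standard 12-column tabular layout, where the E value is the 11th column; the function name `filter_blast_hits`, the example records, and the choice to treat the cutoff as inclusive are illustrative assumptions, not details taken from the study.

```python
def filter_blast_hits(lines, max_evalue=1e-5):
    """Keep tabular BLAST hits whose E value is at or below the cutoff.

    Assumes the standard 12-column tabular layout (query, subject, ...,
    evalue, bitscore). Whether the study's cutoff was inclusive is not
    stated, so the <= comparison here is an assumption.
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        evalue = float(fields[10])  # 11th column holds the E value
        if evalue <= max_evalue:
            kept.append((fields[0], fields[1], evalue))
    return kept


# Hypothetical records, not data from the study.
example = [
    "contig1\tparvo_NS\t41.0\t200\t118\t0\t1\t600\t10\t209\t5e-41\t160",
    "read7\tpicorna_3D\t30.0\t80\t56\t0\t1\t240\t300\t379\t2e-03\t40",
]
significant = filter_blast_hits(example)
```

Only the first record passes, since 2e-03 is above the 10⁻⁵ cutoff.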

Genome Acquisition of Novel Viruses
Sequences showing significant but divergent BLASTx hits to parvoviruses, picornaviruses or rotaviruses in the same sample pool were linked together using RT-PCR or PCR and primers based on the initial short sequence reads or contigs. 5′ and 3′ rapid amplification of cDNA ends (5′ and 3′ RACE) was used to acquire the 5′ and 3′ extremities of the viral genomes. Phylogenetic analyses were performed using novel virus sequences, their best BLASTx hits, and representative members of related viral species or genera. All alignments and phylogenetic analyses were based on the translated amino acid sequences. Sequence alignment was performed using CLUSTAL X (version 2.0.3) with the default settings [48]. Aligned sequences were trimmed to match the genomic regions of the viral sequences obtained in the study. A phylogenetic tree with 100 bootstrap resamples of the alignment data sets was generated using the neighbor-joining method based on the Jones-Taylor-Thornton matrix-based model in MEGA version 5 [49]. Bootstrap values (based on 100 replicates) for each node are given if >70%. The generated phylogenetic trees were visualized using the program MEGA version 5. Resulting trees were examined for consistency with published phylogenetic trees. Sequence identity was measured using BioEdit [50]. The sequence distance comparison was calculated using SSE [51]. Conserved amino acid analyses were performed over the alignments produced by CLUSTAL X (version 2.0.3). The hypothetical cleavage map of the picornavirus polyprotein was derived from alignments with the closest other picornaviruses and NetPicoRNA prediction. Putative ORFs in the genome were predicted by NCBI ORF finder. The secondary structure of the 5′ and 3′ UTRs was predicted using the Mfold program.
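As an illustration of what a pairwise amino acid identity value represents, the sketch below computes percent identity between two rows of an existing alignment. It is not the BioEdit implementation; in particular, skipping every column that contains a gap in either sequence is an assumption, and other tools may normalize differently. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns with a gap in either sequence are ignored (an assumption;
    identity can also be normalized by full alignment length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment rows, not sequences from the study:
# four matches over five ungapped columns.
identity = percent_identity("MKT-LV", "MRTALV")
```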

Bioinformatics Analysis
We received 36.5 million 250-bp paired-end reads generated on the Illumina MiSeq platform. Short reads were debarcoded using vendor software from Illumina. A total of 6,818 reads from 454 pyrosequencing were trimmed of their primer sequence and the adjacent eight nucleotides corresponding to the randomized part of the primer. In-house analysis pipelines were developed to process both datasets. Adaptors were trimmed using in-house code. The cleaned reads were assembled de novo using SOAPdenovo2 (38). The assembled contigs, along with singlets, were aligned to an in-house viral protein database using NCBI BLASTx. The significant hits to viral sequences were then aligned to an in-house non-virus-non-redundant (NVNR) universal protein database using BLASTx. Hits with a more significant adjusted E value to NVNR than to the viral database were removed.
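The final filtering step, removing candidate viral hits that match a non-viral protein more significantly, can be sketched as below. This is not the in-house pipeline: the function name `keep_viral_hits`, the dictionary inputs, and the query identifiers are hypothetical, and the E values are assumed to have been adjusted for database size beforehand, since the adjustment itself is not described in the text.

```python
def keep_viral_hits(viral_evalues, nvnr_evalues):
    """Drop queries whose best NVNR hit is more significant than the viral hit.

    Both arguments map query id -> best (already adjusted) E value.
    A query without any NVNR hit is kept.
    """
    kept = {}
    for query, viral_e in viral_evalues.items():
        nvnr_e = nvnr_evalues.get(query)
        if nvnr_e is not None and nvnr_e < viral_e:
            continue  # lower E value means the non-viral match is stronger
        kept[query] = viral_e
    return kept


# Hypothetical E values, not data from the study.
viral = {"contig1": 1e-30, "contig2": 1e-6, "read9": 1e-8}
nvnr = {"contig1": 1e-10, "contig2": 1e-40}
retained = keep_viral_hits(viral, nvnr)
```

Here `contig2` is discarded because its non-viral match (1e-40) is more significant than its viral match (1e-6).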
The pigeon fecal sample from Hungary was analyzed using 6,818 unique reads from 454 pyrosequencing and showed viral hits to picornaviruses, astroviruses, and tobamoviruses.
We further characterized a subset of divergent mammalian viruses by acquiring their nearly complete or full genomes and phylogenetically comparing them to their closest viral relatives.

Table 1. Distribution of sequence reads to different viral species/families in 10 sample pools from Hong Kong and the single sample from Hungary.

[57,58]. Parvoviruses are widespread pathogens that can cause a range of diseases in a variety of mammals as well as birds and reptiles [57].
One hundred and five out of a total of 39,193 reads from one specimen pool (Table 1 pair 2) were related to parvovirus proteins (BLASTx E scores: 7×10⁻⁶ to 5×10⁻⁴¹). Gaps between short sequence reads were filled by PCR and then directly Sanger sequenced. The nearly complete genome (5492 bp) of pigeon parvovirus, including a partial 5′ UTR (252 bp), complete NS (631 aa), complete VP (696 aa) and a partial 3′ UTR (54 bp), was acquired (GenBank KC876004). As expected, the genome contained two large open reading frames (ORFs), with the left and right ORFs encoding the non-structural (NS) and viral capsid (VP) proteins, respectively (Figure 1A). Distinct from its closest genetic relatives (chicken and turkey parvoviruses), pigeon parvovirus A contained an unusually long middle ORF (477 aa) (Figure 1A). The carboxy terminus of this ORF (135 aa; positions 343-477) showed its best, although weak, similarity (BLASTx to NR E score: 8.9×10⁻²) to a segment of the 221-aa protein predicted to be encoded by ORF43 of the large dsDNA fowl adenovirus genome [59], sharing 28% amino acid identity (Figure 1A). A smaller middle ORF (155 aa) was also detected, which did not show similarity to any other sequence in GenBank. The start codon of NS was located in a strong Kozak sequence context (CAGCATGGC), and the NS protein contained the ATP- or GTP-binding Walker loop motif GPANTGKT (residues 401-408) [GXXXXGK(T/S)] [60]. Additionally, two conserved replication initiator motifs, RCHVHIMLI (residues 118-126) and ITKYVTEALT (residues 145-154), were also found [61]. Similar to chicken parvovirus, the VP protein of pigeon parvovirus lacked both the phospholipase A2 (PLA2) motif with its highly conserved calcium-binding site (YLGPF) and the phospholipase catalytic residues (HD and D). The N-terminus of its VP protein contained a glycine-rich sequence (GGGGSVGSGGGGGVG) also present in other parvoviral VP proteins.
Pair-wise amino acid sequence analysis showed that the NS and VP proteins of pigeon parvovirus shared their highest aa-identities, 41% and 34% respectively, with those of chicken parvovirus [62], and less than 20% with the proteins of other parvovirus genera (Table S1). Figure 1B shows that the pigeon parvovirus NS and VP proteins shared a monophyletic root with those of chicken and turkey parvoviruses, indicating a common origin. Based on the NS and VP phylogenetic analysis and genetic distance calculations, parvoviruses from pigeons, chickens and turkeys may be included as members of a novel genus with the proposed name Aviparvovirus (for avian parvovirus) in the subfamily Parvovirinae.
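The Walker loop consensus quoted above, GXXXXGK(T/S), translates directly into a regular expression. The sketch below shows one way to scan a protein sequence for it; the function name `find_walker_a` and the toy sequence are hypothetical, and only the eight-residue motif GPANTGKT itself is taken from the text.

```python
import re

# G, any four residues, G, K, then T or S: the GXXXXGK(T/S) consensus.
WALKER_A = re.compile(r"G.{4}GK[TS]")


def find_walker_a(protein):
    """Return (start, end, motif) tuples using 1-based inclusive positions.

    Matches are non-overlapping, as produced by re.finditer.
    """
    return [
        (m.start() + 1, m.end(), m.group())
        for m in WALKER_A.finditer(protein.upper())
    ]


# Toy sequence embedding the motif reported for the pigeon parvovirus NS
# protein; the flanking residues are made up for illustration.
hits = find_walker_a("MAAGPANTGKTLL")
```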

Pigeon Picornavirus
Picornaviruses are small, non-enveloped, single-stranded RNA viruses whose prototype is poliovirus. The family Picornaviridae currently consists of seventeen genera (http://www.picornaviridae.com) that infect a wide range of hosts including mammalian and bird species. Many other recently characterized picornavirus genomes have increased the number of potential Picornaviridae genera [63][64][65][66]. The genome of a typical picornavirus ranges from 7 kb to 9 kb and, with one exception, contains a single long ORF coding for a polyprotein [67]. Picornaviruses have been reported in fecal and respiratory specimens from numerous vertebrates including humans, bats, rodents, pigs, fish and reptiles [65][68][69][70].
In this study we characterized two complete genomes of a novel picornavirus from pigeon fecal samples collected in Hong Kong (strain HK-21, KC876003) and Hungary (strain GALII-5/2011/HUN, KC811837), with genomes 9,072 and 9,192 nucleotides (nt) long, respectively [excluding the poly(A) tail] (Figure 2A). The P1, P2 and P3 regions of GALII-5/2011/HUN showed 74%, 92% and 97% amino acid identity to strain HK-21, indicating the presence of two genotypes of the same picornavirus species. Polyprotein alignments revealed that strain GALII-5/2011/HUN had an alternative upstream start codon (MREY) in addition to the putative start codon (MATF) seen in strain HK-21 (Figure S2). A stop codon was found between MREY and MATF in strain HK-21 but none was found in strain GALII-5/2011/HUN. The upstream MREY start codon in GALII-5/2011/HUN was also located in a weaker Kozak consensus context [GGGGATGCG] than MATF [GGAGATGGC] (Figure S2); the Kozak context plays a major role in the initiation of translation. We therefore selected the second methionine codon (MATF) as the start codon for the analysis. The single 2,707/2,711-aa-long (HK-21/GALII-5/2011/HUN) polyprotein coding region was flanked by a 620/727-nt-long 5′ UTR and a 320/332-nt-long 3′ UTR. By BLASTx, the polyprotein showed the closest identity to the turkey hepatitis virus belonging to the Megrivirus genus [52]. We therefore provisionally named these picornaviruses Mesivirus-1 (strain HK-21) and Mesivirus-2 (strain GALII-5/2011/HUN), for Megrivirus sister-clade virus.
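The comparison of the two candidate start codons can be made concrete with a small sketch. It uses a commonly applied heuristic, which is an assumption here rather than the authors' stated criterion: a context is called strong when position -3 is a purine and position +4 is G (with the A of ATG at +1), adequate when only one of the two holds, and weak otherwise. The function name `kozak_strength` is hypothetical; the three nine-nucleotide contexts are those quoted in the text.

```python
def kozak_strength(context, atg_index=4):
    """Classify a start codon context by the -3 and +4 positions.

    `context` is a short nucleotide string containing ATG at `atg_index`
    (0-based), with at least three bases before and one base after it.
    """
    context = context.upper()
    if context[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    minus3 = context[atg_index - 3]  # three bases upstream of the A
    plus4 = context[atg_index + 3]   # base immediately after the ATG
    purine_at_minus3 = minus3 in "AG"
    g_at_plus4 = plus4 == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


matf = kozak_strength("GGAGATGGC")   # downstream MATF context
mrey = kozak_strength("GGGGATGCG")   # upstream MREY context
parvo = kozak_strength("CAGCATGGC")  # pigeon parvovirus NS context
```

Under this heuristic the MREY context lacks the G at +4, which is consistent with the text describing it as the weaker of the two.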

Pigeon Rotavirus
Rotaviruses consist of at least eight groups or species (A through H) with multiple P and G genotypes which together comprise a genus in the family Reoviridae. Rotavirus has a non-enveloped, triple-layered icosahedral capsid containing eleven segments of double-stranded RNA, encoding for six structural and five nonstructural proteins [78][79][80]. We successfully acquired 17,827 nucleotides of this highly divergent rotavirus (HK18), encoding the complete proteins of all 11 genome segments (GenBank KC876005-KC876015).
The six structural proteins (VP1-VP7) showed best hits by BLASTx to group G rotavirus, first described in chicken feces from Northern Ireland [81]. While VP1-VP3 shared 84%-94% aa-identities with group G rotavirus, other structural proteins showed lower aa-identities, ranging from 35% to 67% (Table 2). Classification of rotaviruses is based mainly on the inner capsid protein VP6, and according to the ICTV definition members of the same rotavirus species should share >60% nt-identity in the VP6 region [82]. The sequence (1176 nt) of the VP6-encoding genome segment shared 68% nt-identity with group G rotavirus (Figure 3A). Direct observation of the VP6 alignment between HK18 and its closest relatives revealed numerous mismatches in the group-specific antigenic regions (Figure S3), indicating that HK18 represents a candidate prototype for a novel subgroup within group G. Two structural proteins on the outermost surface of rotavirus, VP4 (the protease-cleaved polyprotein) and VP7 (the glycoprotein), are targets for neutralizing antibodies and have also been used to classify rotaviruses into their P and G genotypes, respectively [83]. The nucleotide sequence alignments of HK18 VP4 and VP7 showed low identity to other rotaviruses (Figure 4A). In addition, HK18 and group G rotavirus shared only 35% identity for VP4 and 51% for VP7 at the amino acid level, suggesting that HK18 should be considered as novel P and G genotypes, tentatively proposed as G2P[2] within group G rotavirus. The VP1 protein of HK18 contained three conserved RNA-dependent RNA polymerase motifs, AEKIILYTDVSQWDAS (residues 558-573), LKIRYLGVASGEKTTKIGNSFANVALI (residues 638-664) and MRVDGDDNVVT (residues 683-693) [84]. Similar to group G rotavirus, the VP3 protein of HK18 did not possess the NTP-binding sequence motif present in group C rotavirus [85]. However, an alignment of VP3 sequences from different rotavirus groups showed that the VP3 protein of HK18 shared a high degree of conservation in the ALYXLSN [ALYSLSN] motif, which has unknown function [86].
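The species-level reasoning in this paragraph reduces to a single threshold comparison, shown below as a minimal sketch. The constant and function names are hypothetical; the 60% VP6 nucleotide identity criterion and the 68% value for HK18 are the figures quoted in the text.

```python
VP6_SPECIES_THRESHOLD = 60.0  # percent nt-identity, as cited in the text


def same_rotavirus_species(vp6_nt_identity, threshold=VP6_SPECIES_THRESHOLD):
    """True when VP6 nucleotide identity exceeds the species threshold."""
    return vp6_nt_identity > threshold


# HK18 versus group G rotavirus: 68% nt-identity in the VP6 segment.
hk18_in_group_g = same_rotavirus_species(68.0)
```

The check only addresses species (group) assignment; the proposal of a novel subgroup and of new P and G genotypes rests on the antigenic-region mismatches and the low VP4 and VP7 identities described in the same paragraph.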
In order to phylogenetically classify pigeon rotavirus HK18, its six structural proteins were aligned to other group rotavirus representatives. While the VP1-VP3, VP6 and VP7 of HK18 were clustered with those of group G rotavirus ( Figure 3B and Figure S4), its VP4 was located on a branch diverging closer to the root ( Figure 4B).
The pair-wise amino acid sequence analysis demonstrated that nonstructural proteins NSP2-NSP5 shared top identities with group G rotavirus rather than other groups, ranging from 50% to 92% at the amino acid level (Table 2). However, NSP1 had a very low aa-identity, less than 31%, to group G rotavirus. Like group G rotavirus, the NSP1 gene of HK18 had two ORFs encoding a minor and a major peptide, labeled NSP1-1 (104 aa) and NSP1-2 (310 aa), respectively. While NSP1-1 had its best identity, 30%, to group G rotavirus, NSP1-2 shared its closest match with group B rotavirus, with 23% identity versus 19% to group G. The NSP1 zinc-binding domain could not be identified in HK18. NSP2 was found to have a conserved sequence HGXGHXRXV and a histidine triad (His-X-His-X-His-XX) located in the RNA binding domain [87]. Phylogenetic analysis of NSP2-NSP5 showed that group G rotavirus was the closest relative to HK18 (Figure S4) in all fragments except NSP1, where HK18 appeared basal to both groups G and B. A similar observation was also made for avian group A rotaviruses, whose NSP1 could not be grouped into taxonomic species [88].

Discussion
Metagenomics has been used to analyze viral nucleic acids in feces collected from humans and a growing range of animals including primates, horses, bats, rodents, pigs, dogs, and turkeys [42,58,65,69][89][90][91][92]. We sequenced viral nucleic acids from 51 wild pigeon droppings, and sequence similarity searches revealed sequences closely or distantly related to viral genomes in GenBank. The eukaryotic viruses detected included insect and plant viruses, likely reflecting the pigeon diet, as well as viruses already known to infect pigeons such as pigeon circovirus and adenoviruses. Also detected and characterized were novel genomes in viral families known to infect vertebrates.
Circovirus-like inclusion bodies were initially identified in the bursa of pigeons [93,94] and the molecular epidemiology of pigeon circovirus worldwide has confirmed its importance as a pathogen associated with a wide range of illnesses in pigeon populations, including weight loss, respiratory distress and diarrhea [94][95][96]; possibly as a result of induced immunodeficiency aggravating the pathogenicity of co-infections such as those of adenoviruses [97]. Both pigeon circovirus and diverse adenovirus sequences were identified here.
Parvoviruses cause a variety of mild to severe symptoms in birds [98][99][100][101]. The family Parvoviridae has recently undergone a large expansion in the number of known genera and species [58,62,102,103]. Here a novel parvovirus was characterized in pigeon feces that was distantly related to chicken and turkey parvoviruses found in feces of farm animals with signs of enteric diseases [62,98]. The pigeon parvovirus genome, the first from that host species, possessed an unusually long middle ORF that encoded a protein showing similarity to that encoded by an ORF of unknown function in the fowl adenovirus genome. This observation might reflect past lateral gene exchange between avian DNA viruses replicating in the nucleus. The pigeon, chicken and turkey parvoviruses cluster together phylogenetically, but have not yet been assigned to a Parvovirinae genus. We therefore propose a new genus provisionally called Aviparvovirus containing at least two species, pigeon parvovirus and the closely related turkey/chicken parvoviruses.
Two related picornaviruses (81% nucleotide similarity) from pigeons in Hong Kong and Hungary were also characterized and provisionally named Mesivirus-1 and -2, reflecting a wide geographic distribution of this viral clade. Based on the phylogenetic analysis of the most conserved region (P3), many of the bird picornaviruses (Gallivirus, Orthoturdivirus, Paraturdivirus, Megrivirus, and Mesivirus) clustered together in a supported clade whose only mammal-infecting members are the Kobuvirus and Salivirus genera. It is therefore possible to speculate on the past existence of a strictly avian clade, a member of which later adapted to mammals, resulting in the Salivirus/Kobuvirus clade.
Rotaviruses were first described as a causative agent of gastroenteritis in humans in 1973 by using electron microscopy to examine biopsies of duodenal mucosa from children with acute non-bacterial gastroenteritis [104]. The first pigeon rotavirus was isolated from feces in 1983 and belongs to group A [105]. We describe here the complete coding regions of all eleven segments of a group G rotavirus, the first from pigeons, distinct from chicken rotavirus [85]. The addition of a second pigeon rotavirus genome to the viral database also shows that the diversity of known avian rotaviruses is likely to continue to expand with multiple rotavirus species infecting the same host species.
Animal species living in or around human habitation, including some bat and rodent species and pigeons, are recognized as the reservoirs of multiple zoonotic pathogens. Because of increasing contact between such peridomestic animals and humans, a better understanding of these animals' virome can inform future studies of their cross-species transmission potential. Monitoring viral exchanges between pigeons and highly exposed humans using nucleic acid or serological assays for avian viruses will be facilitated by improved knowledge of their virome.

Table S1. Pairwise amino acid sequence identities (%) between NS and VP regions of the novel pigeon parvovirus, turkey parvovirus and representatives of parvovirus genera. GenBank numbers of these viruses are available in Table S3. (PDF)

Table S2. Coding potential/putative proteins of the genome of Mesivirus-1 and comparison of amino acid sequence identity (%) of its P1-3 and other closely related picornaviruses in the Picornaviridae family. (PDF)

Table S3. Representative members in the subfamily Parvovirinae for the phylogenetic trees in Figure 1 and their GenBank numbers. *Parvoviruses were used for pairwise calculation in Table S1. (PDF)

Table S4. Representative members in the family Picornaviridae for the phylogenetic tree in Figure 2 and their GenBank numbers. (PDF)