A Quantitative Prioritisation of Human and Domestic Animal Pathogens in Europe

Disease or pathogen risk prioritisations aid understanding of infectious agent impact within surveillance or mitigation and biosecurity work, but take significant time to develop. Previous work has shown the H-(Hirsch-)index to be an alternative proxy. We present a weighted risk analysis describing infectious pathogen impact for human health (human pathogens) and well-being (domestic animal pathogens) using an objective, evidence-based, repeatable approach: the H-index. This study established the European pathogens with the highest H-indices. Commonalities amongst pathogens not included in previous surveillance or risk analyses were examined. Differences between host types (humans/animals/zoonotic) in pathogen H-indices were explored as a One Health impact indicator. Finally, the acceptability of the H-index proxy for animal pathogen impact was examined by comparison with other measures. Of the top 100 pathogens by H-index in (1) the human list and (2) the animal list, 57 appeared solely in each list and 43 occurred in both. Of human pathogens, 66 were zoonotic and 67 were emerging, compared to 67 and 57 for animals. There were statistically significant differences between H-indices for host types (humans, animal, zoonotic), and there was limited evidence that H-indices are a reasonable proxy for animal pathogen impact. This work addresses measures outlined by the European Commission to strengthen climate change resilience and biosecurity for infectious diseases. The results include a quantitative evaluation of infectious pathogen impact, and suggest greater impacts of human-only compared to zoonotic pathogens or scientific under-representation of zoonoses. The outputs separate high and low impact pathogens, and should be combined with other risk assessment methods relying on expert opinion or qualitative data for priority setting, or could be used to prioritise diseases for which formal risk assessments are not possible because of data gaps.


Introduction
Disease or pathogen risk prioritisation exercises are used by organisations charged with providing surveillance and mitigation measures including disease management and control, and biosecurity measures. Qualitative, semi-quantitative or quantitative approaches can be used, but most take significant time to develop, so their use is limited, and when research involving the study of multiple diseases or pathogens is planned, agents are rarely systematically selected.
Quantitative measures for risk prioritisation include the calculation of epidemiological parameters such as disease incidence, prevalence, mortality and morbidity rates, costs of prevention, treatment or control, and for human disease, years lived with disability (YLD) and disability-adjusted-life-year estimates (DALY). Additional measures for animals include losses to production. For many diseases, robust estimates of these measures do not exist. Semi-quantitative and qualitative risk assessments are less demanding of data than quantitative approaches. Nevertheless, they require significant time and physical resources (for example, to obtain parameters and effect sizes from the scientific literature), need updating regularly, and they usually require expert-opinion, adding subjectivity [1,2,3,4,5].
The H-index is an alternative approach to disease prioritisation. It objectively and rapidly provides a quantitative proxy of human disease or pathogen impact [6] (McIntyre, unpublished). The H-index captures scientific interest in a disease by deriving a metric from the number of papers published and how many citations each receives. Combining scientific impact (citations) with technical productivity (papers published) is useful as, individually, the total number of papers does not account for the quality of publications, while citation count may be influenced by a small number of seminal papers or by a disease becoming 'fashionable' briefly. The H-index method is significantly correlated with more comprehensive measures of human infectious disease impact, including DALYs [6], and deaths from disease (McIntyre, unpublished data). It can be rapidly obtained at low cost, attained automatically, and repeated regularly to reflect changes in impact, serving as a generic tool to assess the relative bearing of diseases or pathogens, in an easier, timelier manner than traditional risk assessments. While the H-index method undoubtedly has limitations, these tend to be different to those of other approaches; its use could be a step forward in separating high and low priority diseases or pathogens, in combination with other risk assessment methods.
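As a minimal illustration of the metric, the sketch below computes an H-index from a list of per-paper citation counts: the largest number h such that h papers each have at least h citations. It is not the Web of Science implementation, and the citation counts are hypothetical values chosen only to show the calculation.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for the papers returned by one pathogen search.
example_citations = [25, 8, 5, 3, 3, 1, 0]
print(h_index(example_citations))  # 3
```

A single highly cited paper (25 citations here) raises the score by at most one, which is why the metric is less sensitive to a few seminal papers than a raw citation count.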
The ENHanCEd Infectious Diseases (EID2) database integrates published data sources on pathogens, their hosts (including vectors) and geographic ranges [7]. By coupling the H-index method with the EID2, the primary aim of this study was to establish priority lists of human and domestic animal pathogens (including zoonoses) present in Europe. We then consider reasons for the omission of some pathogens in our lists from those of other disease prioritisations: the 2010 Global Burden of Disease (GBD) estimates [8], communicable human diseases reportable in the European Community [9], the OIE list of notifiable animal diseases, infections and infestations [10], and the EU FP7 DISCONTOOLS project [11]. The GBD 2010 study was a large collaborative five-year project which used all relevant published and unpublished evidence to create the strongest evidence-based epidemiological assessment of people's infectious and non-infectious health problems around the world [8]. The DISCONTOOLS project, funded by the European Commission over five years, investigated the impacts of 52 domestic animal diseases, to focus and prioritise future research [11]. As the zoonotic and emerging status of pathogens as well as their taxonomic division could affect the likelihood of their inclusion in surveillance and impact quantification work, these factors were also investigated as reasons for omission from the other disease prioritisations.
The H-index can be obtained in the same way for both human and animal pathogens. It therefore has potential as a single metric for prioritising across both host groups. Its potential as a quantitative One Health indicator (i.e. a single measure applicable to both human and animal diseases) was investigated by comparing scores for human-only, zoonotic, and animal-only pathogen groups, including emerging status as this would likely drive research impact.
Previous work has shown that the H-index is a proxy for human disease impact [6], (McIntyre, unpublished). We investigated its value as a proxy for animal disease impact by comparing domestic animal pathogen H-indices with other measures of impact including presence on the OIE list [10], and inclusion in DISCONTOOLS [11].

EID2 pathogen information
The EID2 database collates data on human and domestic animal pathogens: where, when, and in which hosts there is evidence of their occurrence. The database is built largely using automated procedures to interrogate publicly available databases. The background to the EID2 has been described previously [7]; here, we used similar criteria to define pathogens. These covered pathogenic status (frequently pathogenic: a pathogen which frequently causes a clinically pathogenic effect, morbidity or mortality, in humans or domestic animals; non-pathogenic: an organism which causes no clinical signs within any of its hosts; unknown pathogenicity: an organism for which there is insufficient evidence to decide), evidence of pathogens affecting hosts ('host-pathogen interactions': evidence from at least one piece of meta-data uploaded with DNA or RNA sequence information to [12], which describes where, when and from which host the pathogen came, or specific scientific publications [13]), and evidence of pathogens occurring within countries (evidence from at least one piece of meta-data [12], or at least five publications in [13] where pathogen name and a country MeSH-term [14] co-occurred in the title/abstract). Information on host-pathogen interactions was collated when there was evidence of a pathogen occurring in at least one host of interest to the study (including humans and European domestic animals; see Table 1). Further information about each organism, such as the taxonomic division for pathogens (bacteria, including rickettsia; fungi, including algal pathogens; helminths, including thorny-headed worms and pentastomids; protozoa; and viruses, including prion agents) or taxonomic rank (genus, species, etc.), is stored using a series of statements. Previously, we examined characteristics of pathogen species [7]; here, we include sub-species, to account for important strains, e.g. Escherichia coli O157:H7.
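The country-occurrence criterion described above (at least one piece of sequence meta-data, or at least five publications in which the pathogen name and a country MeSH-term co-occur) can be written as a simple rule. The sketch below is illustrative only; the function and parameter names are hypothetical and do not come from the EID2 code base.

```python
def has_country_evidence(n_sequence_metadata, n_cooccurring_publications,
                         min_publications=5):
    """Apply the stated rule for evidence of a pathogen occurring in a country.

    n_sequence_metadata: count of sequence meta-data records naming the country.
    n_cooccurring_publications: count of publications whose title/abstract
        contains both the pathogen name and the country MeSH-term.
    """
    return (n_sequence_metadata >= 1
            or n_cooccurring_publications >= min_publications)


# Hypothetical counts for three pathogen-country pairs.
print(has_country_evidence(1, 0))   # True: one meta-data record suffices
print(has_country_evidence(0, 5))   # True: five publications suffice
print(has_country_evidence(0, 4))   # False: neither threshold is met
```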

Emerging/zoonotic pathogen status
Information on whether pathogens were zoonotic, non-zoonotic, emerging and not emerging was examined based upon previously published information [15,16]. If not included in earlier work or if their status had changed due to more recent scientific evidence, updated pathogen information was based upon the previous definitions. Zoonotic pathogens were classified as those naturally transmitted between vertebrate (non-human) animals and humans (as the definitive host), not including species which have recently evolved from animal pathogens but are no longer transmitted between animals and humans [15,17]. Emerging pathogens are those that have appeared in a host population for the first time (including newly-evolved strains), or have occurred previously but are increasing in incidence or expanding into areas where they had not previously been reported [15,17]. Pathogens needed to have emerged in several geographically distinct areas to be 'emerging'.

H-index literature search protocol
Information sources. H-index searches were undertaken in January 2012 using Web of Science (WoS) [18]. Previous work established that results of H-index searches for pathogens undertaken using different bibliographic sources (e.g. WoS, SCOPUS, Google Scholar) are not identical but are highly correlated [6].
Eligibility criteria. Searches were restricted to the years 1900 to 2010, inclusive. WoS is searched in English; however, searches also include translated titles of foreign-language publications. All literature in the WoS database has been published.
Searches. Searches were undertaken using search phrases specified in quotation marks (""), the 'topic' search field and with no lemmatization. Phrases were compiled including pathogen scientific name, alternative names, synonyms and alternative spellings according to NCBI Taxonomy [19]. H-indices for clinical diseases used clinical terms as well as pathogen phrases for the main pathogens of disease. Virus searches also included synonyms and acronyms from the NCBI Taxonomy database and International Committee on Taxonomy of Viruses [19,20], and the term 'virus', and excluded other entities (viral or non-viral) which shared acronyms. The Boolean operators 'AND', 'OR', and 'NOT' linked multiple search phrases.
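To make the structure of such a search concrete, the following sketch assembles a quoted-phrase query linked by Boolean operators. It is an illustration under stated assumptions: the helper names, the way the term 'virus' is combined with the names, and the example phrases (including the excluded phrase) are all hypothetical, and are not the search strings used in this study.

```python
def quote(phrase):
    """Wrap a search phrase in double quotation marks for exact-phrase matching."""
    return '"' + phrase + '"'


def any_of(phrases):
    """Join quoted phrases with OR inside one bracketed group."""
    return "(" + " OR ".join(quote(p) for p in phrases) + ")"


def build_topic_query(names, required=(), excluded=()):
    """Combine names/synonyms (OR), required terms (AND) and exclusions (NOT)."""
    query = any_of(names)
    if required:
        query += " AND " + any_of(required)
    if excluded:
        query += " NOT " + any_of(excluded)
    return query


# Hypothetical virus search: a name plus acronym, the term 'virus',
# and one unrelated entity sharing the acronym (assumed for illustration).
query = build_topic_query(
    names=["Feline calicivirus", "FCV"],
    required=["virus"],
    excluded=["fuel cell vehicle"],
)
print(query)
```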

Pathogen prioritisation framework
A full list of human and domestic animal pathogens frequently causing pathogenic effects and for which there was evidence of European occurrence was created using EID2 information [7] and defined criteria (see Figure 1). The relative impact of pathogens in this full list was assessed by calculating H-indices using the specified search protocol; high impact pathogens had the highest H-indices. The list was split according to host-pathogen interaction information into two directories, one including pathogens with evidence of their occurrence in humans, and the second including domestic animal occurrence; zoonotic pathogens appeared in both lists. Information was manually obtained on whether these pathogens cause diseases featuring in other prioritisation lists [8,10,11], by examining pathogens listed under each disease's details in the NCBI MeSH library [14]; specific lists of diseases had been provided in other work [9,11]. These additional pieces of information are included in the results (Tables 2 and 3). Finally, for the pathogens with the highest H-indices, the study authors verified the pathogenic status of each pathogen, and whether it frequently occurred in the relevant hosts and in Europe, using manual searches of the scientific literature.
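The framework above amounts to filtering, ranking and splitting a pathogen list. The sketch below shows that logic on a toy data set; the record fields, the function name and all example values are hypothetical, and the manual verification step is not represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pathogen:
    name: str
    h_index: int
    in_humans: bool
    in_domestic_animals: bool
    frequently_pathogenic: bool
    occurs_in_europe: bool


def priority_lists(pathogens, top_n=100):
    """Return (human_list, animal_list), each ranked by descending H-index."""
    eligible = [p for p in pathogens
                if p.frequently_pathogenic and p.occurs_in_europe]
    ranked = sorted(eligible, key=lambda p: p.h_index, reverse=True)
    human = [p for p in ranked if p.in_humans][:top_n]
    animal = [p for p in ranked if p.in_domestic_animals][:top_n]
    return human, animal


# Hypothetical records: B occurs in both host groups, so it is in both lists.
records = [
    Pathogen("Pathogen A", 120, True, False, True, True),
    Pathogen("Pathogen B", 90, True, True, True, True),
    Pathogen("Pathogen C", 60, False, True, True, True),
    Pathogen("Pathogen D", 200, True, True, True, False),  # no European evidence
]
human_list, animal_list = priority_lists(records, top_n=2)
print([p.name for p in human_list])
print([p.name for p in animal_list])
```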

Data analyses
H-indices and previous prioritisations. Pearson's Chi-squared tests with Yates' continuity correction and Fisher's Exact Tests (FET) were used to test for differences in counts of pathogens included in previous work [9,10,11], according to outcomes including their taxonomic division, zoonotic and emerging status. Where appropriate, odds ratios (OR) and 95% confidence intervals (CI) are presented.
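For a 2x2 table of counts, these tests can be sketched with the Python standard library alone. This is an illustration, not the analysis code (the study used R): the odds ratio shown is the simple sample odds ratio with a Wald-type interval on the log scale, which is an assumption because the interval method is not stated here, and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Sample odds ratio for the table [[a, b], [c, d]] with a Wald 95% CI."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def yates_chi_squared(a, b, c, d):
    """Pearson chi-squared statistic with Yates' correction, and P (1 d.f.)."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    stat = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-squared survival, 1 d.f.
    return stat, p_value


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P: sum of tables no more likely than observed."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical counts: rows = zoonotic / non-zoonotic,
# columns = included / not included in a previous prioritisation.
table = (30, 10, 20, 40)
print(odds_ratio_ci(*table))
print(yates_chi_squared(*table))
print(fisher_exact_two_sided(*table))
```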
H-indices for One Health. Differences in H-indices for human-only, zoonotic, and animal-only pathogens were examined using a two-way Analysis of Variance (ANOVA) with log10-transformation of the response, including emerging status as an explanatory covariate. Post-hoc Tukey multiple comparisons of treatments were undertaken using the HSD.test function [21].
H-indices for domestic animal pathogens. OIE list. Homogeneity of variances of H-indices for animal-only (and not zoonotic) pathogens included or not within the OIE list of notifiable animal diseases was examined using the Fligner-Killeen (median) test. One-way ANOVA thereafter established differences in the (log10-transformed) H-indices of pathogens included or not in the OIE list.
DISCONTOOLS. H-indices and DISCONTOOLS scores were compared using Spearman's Rank correlations. If more than one pathogen had been included within disease information for the DISCONTOOLS rankings (for Campylobacter, Leishmaniasis, and Salmonellosis), the higher H-index score was used for analyses.
Table 1. Animal species including humans for which pathogens have been studied, including domestic animals we eat or companion animals we keep as pets, and exotic animals also used as food sources or as pets.
All analyses were undertaken using the statistical software package R [22], with statistical significance determined by a P-value of less than 0.05.
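Although the analyses were run in R, the DISCONTOOLS comparison can be illustrated in a few lines of Python: take the higher H-index where a disease has several pathogens, then compute Spearman's rank correlation as the Pearson correlation of tie-averaged ranks. The function names and all scores below are hypothetical and serve only to show the calculation.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical H-indices per disease; multi-pathogen diseases use the maximum.
h_by_disease = {
    "Disease 1": [85, 140],
    "Disease 2": [60],
    "Disease 3": [95, 30],
    "Disease 4": [20],
}
h_scores = [max(values) for values in h_by_disease.values()]
impact_scores = [70, 40, 55, 35]  # hypothetical prioritisation scores
print(spearman_rho(h_scores, impact_scores))
```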

Priority lists of human and domestic animal pathogens present in Europe
Two lists each including the top 100 human (Table 2) and domestic animal pathogens (Table 3) which cause significant clinical disease and which therefore need consideration from a health and well-being perspective were short-listed using the H-index prioritisation method (for alternative names and synonyms see [19]). When combined, 114 (72.6%) pathogens appeared solely in the human or animal list, and 43 (27.4%) were in both lists. Of the top 100 human pathogens, 66 were classed as zoonotic and 67 were emerging, compared to 67 and 57 for domestic animal pathogens, respectively.
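The counts and percentages in this paragraph follow from the overlap of the two top-100 lists; the short check below reproduces the arithmetic from the reported figures (variable names are illustrative).

```python
top_n = 100          # pathogens in each list
in_both = 43         # pathogens appearing in both lists

only_human = top_n - in_both                          # 57
only_animal = top_n - in_both                         # 57
sole = only_human + only_animal                       # 114 in one list only
unique_total = sole + in_both                         # 157 distinct pathogens

pct_sole = round(100 * sole / unique_total, 1)        # 72.6
pct_both = round(100 * in_both / unique_total, 1)     # 27.4
print(sole, unique_total, pct_sole, pct_both)
```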

H-indices and previous prioritisations
Of the top 100 human pathogens identified, 42 were either included in the GBD [8], or are reportable to the EC [9], or both. Reasons for failure to include pathogens may be that pathogenic agents cause rarely diagnosed disease (e.g. Human T-lymphotropic Virus 1, Lymphocytic Choriomeningitis Virus, and Moraxella catarrhalis), or because disease agents are diverse, e.g. pneumonia or other lung infections (Aspergillus niger, Chlamydophila pneumoniae, Cryptococcus neoformans, Klebsiella pneumoniae, and Mycoplasma pneumoniae) and gastro-intestinal (GI) symptoms or GI-tract infections (Aeromonas hydrophila, Bacillus cereus, Bacteroides fragilis, Clostridium species, Vibrio parahaemolyticus, and Yersinia enterocolitica). The impact of chronic disease or diseases causing low morbidity may be difficult to quantify or seen as less important (Bartonella henselae, Borrelia burgdorferi, Human Enterovirus C, Human Herpesvirus group, Human Papillomavirus, Human Parvovirus B19, Mycobacterium bovis, Mycobacterium avium, and Mycoplasma genitalium). In addition, some pathogens may generally be commensals or natural biota (Aggregatibacter actinomycetemcomitans, Candida species, Enterobacter, Enterococcus, Staphylococcus species, Candida tropicalis, Helicobacter pylori, and Porphyromonas gingivalis) or species existing in the environment (Acinetobacter baumannii, Burkholderia cepacia, Candida glabrata, Entamoeba histolytica, Fusarium oxysporum, Gibberella moniliformis, Proteus mirabilis, Pseudomonas species, Rhizopus oryzae, Serratia marcescens, Staphylococcus aureus, and Stenotrophomonas maltophilia) causing opportunistic infections in immune-compromised individuals (including those young, old or pregnant); their impact upon the general population may not be quantified.
Of the top 100 domestic animal pathogens described, 76 were either notifiable according to the OIE [10], or included in DISCONTOOLS [11], or both. Reasons for omission may be similar to those for human pathogens (only pathogens not previously mentioned are cited): multiple disease symptoms or lack of diagnosis (Ascaris suum, Feline Immunodeficiency Virus, Feline Leukemia Virus, Gallid Herpesvirus 2, Haemonchus contortus, and Yersinia pseudotuberculosis); causes of specific disease being diverse, for respiratory infection (Feline Calicivirus) and GI symptoms (Campylobacter fetus, Cryptosporidium parvum, and Listeria monocytogenes); and existing in the environment and opportunistic (Pseudomonas aeruginosa). In addition, some omitted pathogens may be production issues with impact difficult to quantify (Streptococcus agalactiae causing mastitis in cattle and Neospora caninum causing abortion in cattle and dogs) and some may be issues of pets (Canine Parvovirus, Neospora caninum, and Parainfluenza Virus 5).

H-indices for One Health
There was a statistically significant difference between the H-indices of zoonotic, human-only or animal-only pathogens (two-way ANOVA, F(2,152) = 24.40, P < 0.001); H-indices were significantly higher for human-only (untransformed mean = 132.39 ± 6.14) and lower for animal-only pathogens (68.11 ± 6.06) compared to zoonotic (100.83 ± 9.93). H-indices were higher (with borderline statistical significance) for emerging (106.61 ± 10.19) compared to not emerging (86.91 ± 8.14) pathogens (two-way ANOVA, F(1,152) = 3.78, P = 0.054). The interaction between zoonotic and emerging factors was not significant (P = 0.25).
DISCONTOOLS. There were significant correlations between H-indices and DISCONTOOLS estimates of public (human) health (zoonotic and animal pathogens) and impact on wider society (animal-only pathogens), and a further relationship of borderline significance between H-indices and the DISCONTOOLS overall result; no other correlations were significant (Table 4).

Discussion
The European Commission has outlined measures to strengthen coordinated approaches to health security at EU level, including monitoring, early warning and combating specific threats of a cross-border nature. These measures could be for climate change resilience [23] or for biosecurity, particularly for infectious diseases including communicable diseases, antimicrobial resistant and healthcare-associated infections related to communicable diseases, and biotoxins or other biological agents [24]. In this study, we implement a number of previously defined actions [25], including presenting a quantitative evaluation for the impact of infectious pathogens affecting human health and well-being (via effects upon domestic animals) [24]. The work is unique, starting with all known infectious pathogens, and then objectively and systematically deciding which occur in relevant hosts in Europe using a transparent process. The study establishes priority lists of human and domestic animal pathogens (including zoonoses) present in Europe, using the H-index as a proxy measure for impact.
Previous work suggests that higher H-indices indicate higher impact for a pathogen relative to lower H-indices [6] (McIntyre, unpublished material). The H-index method has both strengths and weaknesses. The strengths include that it is much more evidence-based and objective than semi-quantitative and qualitative approaches, and the results provide an easily understood quantitative estimate of impact. H-index estimates can be simply and rapidly calculated, and they can therefore be repeatedly obtained to reflect changes in status, with the potential for automation of this process. The results are available for all pathogens at a global scale, and the scores reflect the wider scientific interest that would be expected to follow from a pathogen being either zoonotic or emerging [6]. Most importantly, within a study of 27 human diseases, H-indices were correlated with DALY estimates [6] (McIntyre, unpublished material). DALYs are an accepted measure of true disease burden in humans which accounts for the years of healthy life lost as a result of poor health or disability as well as the potential years of life lost due to premature death [26]. In further work, H-indices were also correlated with the number of human deaths (McIntyre, unpublished material).
The weaknesses of the H-index method include that calculations need some manual oversight, as false positives can occur, for instance when pathogens are used as model organisms; biases in results may happen because of trends in interest in specific pathogens, diseases or research fields or in certain regions; and estimates are subject to biases in funding (McIntyre, unpublished material) and research publication. H-indices are likely to underestimate the contribution of scientific literature published in non-English languages, although after translation some publications are included in WoS and consequently in our calculations of H-indices. The literature searching method also does not account for the quality of publications in which pathogen names appear or the typical number of citations within different fields, and bibliographic software packages incorporate newly published literature from different literature sources into their databases at different rates. Finally, H-indices are only a proxy for impact, with the results susceptible to a lag in time-to-publication, and newly emerging pathogens likely to be under-represented.
As the strengths and weaknesses of using the H-index method are different to those of other prioritisation methods, it is probably best used in combination with other approaches, for example, to shortlist a set of pathogens for more detailed risk assessment relying on expert opinion or qualitative data. It may also be used to prioritise diseases for which formal risk assessments are not possible because of data gaps.
Our priority lists of pathogens enabled investigation of why infectious pathogens are omitted from disease surveillance and impact quantification work [8,9,10,11]. We considered several reasons for exclusion, including lack of diagnosis or misdiagnosis [27], because the impact of particularly chronic infections is difficult to quantify or they are seen as less important, and because some pathogens are commensals or natural biota causing opportunistic infections in immune-compromised individuals; their [...] [6]; for some animal pathogens, this is the first time that emerging status has been examined.
Table 2 (notes). Pathogens include those which are zoonotic (Z), non-zoonotic (NZ), emerging (E) and not emerging (NE) [15,16], or given a new status (NS) in this work. Pathogens also included in the list of top 100 animal pathogens are noted (A). The major pathogens causing diseases included within the 2012 Global Burden of Disease (GBD) report are noted [8,31], as are those reportable in the EC (EC) [9]. doi:10.1371/journal.pone.0103529.t002
Table 3. Top 100 domestic animal pathogens in Europe, prioritised according to the H-index methodology [6] with the same emerging and zoonotic definitions as for Table 2.
Methods to assess disease impacts use metrics capturing either human or animal host effects; they neither measure the magnitude in all hosts nor take account of scientific knowledge and tools for control. It is hard to prioritise human and animal diseases, because of the different metrics used (health or societal impacts versus welfare or economic impacts). Significant differences between mean H-indices for human-only, zoonotic, and animal-only pathogens provide evidence that this single measure may have some use as a One Health metric accounting for such factors. For example, values for zoonotic pathogens were higher than for animal-only, suggesting that they account for human as well as animal impact. Higher values for human-only compared to other pathogen groups suggest that zoonoses may be under-represented due to underestimation of their global burden [28,29], or research impact [6], or because of biases in research impact and funding for chronic human pathogens [29]. In addition, lower animal-only H-indices may be due to funding biases.
Finally, there was limited evidence that the H-index method is a reasonable proxy for the impact of animal pathogens; animal pathogen H-indices were significantly positively correlated with subsections of DISCONTOOLS [11], including impact on public (human) health and overall results (borderline significance). If animal-only (not zoonotic) diseases were included, there was a significant positive relationship with impact on wider society. As the more animal-focussed subsections (disease knowledge, impact on animal health and welfare, impact on trade, and available control tools) were not correlated with H-indices, and H-indices were not affected by inclusion in the OIE list [10], this suggests a human-centric bias in H-indices; for example, a pathogen causing little impact in animals may nevertheless have a high H-index if zoonotic.
The priority lists presented in this work should be used by agencies and research organisations in combination with other risk assessment methods to identify gaps in working for priority setting. It has been suggested that zoonoses must be dealt with at the interface of human and animal health using all available information [30]; this work, combining the EID2 and H-index technique, demonstrates such 'big-data' approaches.
Table 3 (notes). Pathogens also included in the list of top 100 human pathogens are noted (H). The major pathogens causing diseases included within the OIE list of notifiable terrestrial and aquatic animal diseases (OIE) are noted [10], as are those included in the DISCONTOOLS project (DISC) [11]. doi:10.1371/journal.pone.0103529.t003
Table 4. Results of Spearman's Rank correlations between H-indices and the DISCONTOOLS prioritisation of major animal diseases [11].