Drug Discovery Prospect from Untapped Species: Indications from Approved Natural Product Drugs

Because of extensive past bioprospecting efforts and technological factors, questions have been raised about the prospects for drug discovery from untapped species. We analyzed recent trends in approved drugs derived from previously untapped species. These trends show no sign that untapped drug-productive species are near extinction, and suggest a high probability of deriving new drugs from new species in existing drug-productive species families and clusters. Case histories of recently approved drugs reveal useful strategies for deriving new drugs from the scaffolds and pharmacophores of the natural product leads of these untapped species. New technologies such as cryptic gene-cluster exploration may generate novel natural products with a highly anticipated potential impact on drug discovery.


Introduction
Drug discovery efforts based on high-throughput screening and combinatorial chemistry have not led to the expected drug productivity, renewing interest in searching for drugs from nature [1]. Various species have been extensively searched in the past [2][3][4], with only a small percentage of the identified bioactive natural products carried forward to derive 939 approved drugs [2,5] that are composed of a limited number of molecular scaffolds [6]. Natural products have fallen out of favor partly because of a technology shift [7], diminishing returns and high rediscovery rates [8,9], and supply and screening problems [1,10]. Although new technologies are expected to overcome some of these problems and enable significantly expanded bioprospecting efforts [1,7,11,12], questions remain unanswered about the prospect of deriving new drugs from untapped species at acceptable return rates.
As part of the effort to probe these questions, we have recently studied the distribution patterns of nature-derived drugs in phylogenetic trees and shown that drug-productive species tend to be clustered in specific regions of phylogenetic space (drug-productive clusters), with most drugs derived from existing drug-productive families (families that have yielded at least one approved drug) [5]. Further clues to the prospect of deriving drugs from untapped species, and to effective drug discovery strategies, may be gained from the analysis of approved drugs derived from previously untapped species, particularly those approved in recent decades. In this work, we analyzed the species origins of nature-derived drugs approved in 1991-2010 with respect to those approved in previous decades to find exploration trends indicative of future bioprospecting prospects and likely sources of untapped new drug-productive species. We also tracked the development histories of several classes of approved nature-derived drugs to reveal effective strategies for deriving new drugs from the bioactive natural products isolated from these species.
While the analysis of approved drugs may provide useful clues to the drug discovery prospect of untapped species, some aspects of this prospect may not be fully captured, because approved drugs differ from discovered drugs by additional commercial and technological considerations. For instance, the pharmaceutical industry has moved away from antibiotics partly because these are less lucrative than drugs for chronic conditions [13]. The exploration of microbial species has been limited by the cost and efficiency of cultivation technologies, with the majority of microbial organisms remaining uncultured [14,15]. Therefore, caution needs to be exercised in interpreting the results of our analysis, particularly given the possibility that new technologies that explore cryptic gene-clusters and networks [11,12,16,17], inter-species crosstalk [11,18,19] and high-throughput fermentation [20] may provide significantly expanded molecular scaffolds for natural product drug discovery [15].

Materials and Methods
A total of 939 nature-derived approved drugs were collected from Newman and Cragg's seminal work [2] and our own literature search [5]. Their species origins were identified by a comprehensive literature search using combinations of keywords such as drug name and alternative names, species, natural product and nature, and the search results were confirmed based on descriptions such as ''originates from'', ''derived from'', ''isolated from'', or ''comes from'' a species. The families of the host species of these drugs were taken from the NCBI taxonomy database [21]. The approval dates of these drugs were taken from Newman and Cragg's work [2] and Drugs@FDA on the FDA website (http://www.accessdata.fda.gov/scripts/cder/drugsatfda/index.cfm). Table 1 presents the statistics of drugs approved in every five-year period of 1961-2010 (divided into drugs derived from previously untapped species and from previously drug-productive species), and the statistics of drug-productive species that produced drugs in each period. There were 46-126 nature-derived drugs approved in each 5-year period since 1991, 7.1%-14.5% of which are from previously untapped species (i.e. untapped before the specific 5-year period), and these species represent 11.4%-41.7% of the drug-productive species that yielded approved drugs in 1991-2010. In contrast, there were 26-133 nature-derived drugs approved in each 5-year period of 1961-1990, 18.0%-62.8% of which are from previously untapped species, and these species represent 36.7%-76.2% of the drug-productive species in that period. While the percentages have decreased to some extent, the recent trend of a substantial percentage of drugs and a substantial percentage of drug-productive species coming from previously untapped species strongly suggests that untapped drug-productive species are unlikely to be near extinction, and future bioprospecting efforts are expected to yield new drugs at comparable levels.
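The per-period statistics described above can be sketched in code. The following is a minimal illustration, not the procedure actually used in this work: it assumes each drug is recorded as a (drug, species, approval year) tuple, and it treats a species as previously untapped in a period if no drug from it was approved before that period began. The function name, the record layout, and the toy records are all hypothetical.

```python
def untapped_share_by_period(records, first_year=1961, last_year=2010, width=5):
    """Tally drugs and species per period, split by previously untapped species.

    records: iterable of (drug, species, approval_year) tuples (assumed layout).
    """
    records = list(records)
    # Species with an approval before the first period count as already tapped.
    tapped = {species for _, species, year in records if year < first_year}
    rows = []
    for start in range(first_year, last_year + 1, width):
        end = start + width - 1
        in_period = [r for r in records if start <= r[2] <= end]
        species_in_period = {species for _, species, _ in in_period}
        # Untapped means: no approved drug before this period started.
        new_species = species_in_period - tapped
        drugs_from_new = sum(1 for _, s, _ in in_period if s in new_species)
        rows.append({
            "period": (start, end),
            "drugs": len(in_period),
            "drugs_from_untapped": drugs_from_new,
            "species": len(species_in_period),
            "new_species": len(new_species),
        })
        # Everything seen in this period is tapped for later periods.
        tapped |= species_in_period
    return rows


# Toy records (hypothetical, for illustration only).
records = [
    ("drug_a", "species_1", 1958),
    ("drug_b", "species_1", 1962),
    ("drug_c", "species_2", 1963),
    ("drug_d", "species_2", 1967),
    ("drug_e", "species_3", 1968),
]
rows = untapped_share_by_period(records, first_year=1961, last_year=1970)
for row in rows:
    print(row)
```

Dividing `drugs_from_untapped` by `drugs`, and `new_species` by `species`, gives the two kinds of percentages quoted in the text for each period.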
This is consistent with an earlier analysis of the largest antibiotic-producing genus, Streptomyces, which suggests that the rate of new compound discovery from unexplored Streptomyces strains would not decline for several decades and that 15-20 antibiotics would be discovered each year at the 1995 exploration level [9]. It is also consistent with the estimated high antibiotic production frequencies of untapped strains of the actinomycetes class (5 × 10⁻⁶ to 2 × 10⁻¹ in screening 10⁴ to 10⁷ strains) [22]. Table 2 lists the new drug-productive species that emerged in 1991-2010 together with the approved drugs derived from these species since the first drug approval. There are 59 new drug-productive species that emerged in this period, 33 (55%) of which are from existing drug-productive species families and another 22 (37%) of which are from new species families in existing drug-productive clusters. These figures suggest a high probability of finding new drug-productive species in existing drug-productive families or in new families located in existing drug-productive clusters. This coincides with the reported refocused efforts to find new antibiotics from new sources in the actinomycetes class and cyanobacteria phylum [22], which cover an existing drug-productive cluster (Actinomycetales cluster) and an existing drug-productive family (Symploca family) respectively [5].
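The grouping of new drug-productive species used above can be illustrated with a short sketch. It assumes that each new species is annotated with a family and a phylogenetic cluster, and that the sets of already drug-productive families and clusters are known. All names below are hypothetical placeholders, not entries from Table 2.

```python
from collections import Counter


def classify_new_species(family, cluster, productive_families, productive_clusters):
    """Assign a new drug-productive species to one of three groups."""
    if family in productive_families:
        return "existing family"
    if cluster in productive_clusters:
        return "new family in existing cluster"
    return "outside existing clusters"


# Hypothetical annotations: (species, family, cluster).
productive_families = {"family_1"}
productive_clusters = {"cluster_A"}
new_species = [
    ("species_x", "family_1", "cluster_A"),
    ("species_y", "family_2", "cluster_A"),
    ("species_z", "family_3", "cluster_B"),
]

tally = Counter(
    classify_new_species(family, cluster, productive_families, productive_clusters)
    for _, family, cluster in new_species
)
print(tally)
```

With the counts reported above, 33 and 22 of the 59 new species fall into the first two groups, which leaves 59 - 33 - 22 = 4 species outside both.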

Results and Discussion
From the time of their first drug approval to the end of 2010, these 59 new drug-productive species have yielded 85 approved drugs, 12 (14.1%) and 14 (16.5%) of which are unmodified natural products and biologics respectively, while the majority (59, or 69.4%) of the 85 drugs have been derived from bioactive natural products by semi-synthetic modification, structural mimicking or pharmacophore mapping to overcome problems frequently encountered with bioactive natural products, such as weak potency [23], low target selectivity [24], toxicity [25], undesired pharmacokinetic properties [26], and supply issues [27]. A retrospective study of the case histories of several classes of recently approved drugs reveals useful strategies for deriving new drugs from the bioactive natural products of untapped species to overcome these frequently encountered problems.
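The percentages quoted above are consistent with the 85 approved drugs as the common denominator, with the derived drugs being the 85 - 12 - 14 = 59 that are neither unmodified natural products nor biologics. A brief check, with variable names chosen here purely for illustration:

```python
total_drugs = 85   # approved drugs from the new drug-productive species
unmodified = 12    # unmodified natural products
biologics = 14     # biologics
derived = total_drugs - unmodified - biologics  # semi-synthetic, mimics, pharmacophore-based

counts = {"unmodified": unmodified, "biologics": biologics, "derived": derived}
# Percentage share of each group, rounded to one decimal place.
shares = {name: round(100 * n / total_drugs, 1) for name, n in counts.items()}
print(derived, shares)
```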
Since the discovery of Paclitaxel [28], extensive efforts have been directed at searching for tubulin-interacting anticancer drugs from species other than the already tapped Taxus species, leading to the identification of a number of natural products from untapped species, particularly epothilone B from the previously untapped myxobacterium Sorangium cellulosum, as tubulin-interacting anticancer agents with potent cytotoxic activity toward paclitaxel-sensitive and paclitaxel-resistant cells [26]. However, these compounds are prone to inactivation by esterase cleavage. In an effort to overcome this problem, semisynthetic lactam analogs of epothilone B have been derived, which led to the discovery of Ixabepilone, approved in 2009 [26] (Figure 2).
The identification of BCR-ABL as a key target in chronic myeloid leukemia has prompted extensive efforts to identify anticancer ABL inhibitor drugs [29]. Design work based on the pharmacophore of staurosporine, a nonselective pan-kinase inhibitor from the previously untapped Lentzea albida species, resulted in the EGFR-selective inhibitor CGP 52411 [24] and subsequently the potent ABL-selective inhibitor CGP 57148 [30]. CGP 57148 has undesired pharmacokinetic properties, which were improved by formulating it as the mesylate salt, leading to Gleevec, a milestone anticancer kinase inhibitor approved in 2001 [31] (Figure 3).

Table 1. Historic data of the numbers of nature-derived approved drugs from previously untapped and previous drug-productive species, and the numbers of drug-productive species during every five-year period from 1961 to 2010.
The identification of HIV-1 protease as a key anti-HIV target has also motivated extensive efforts to identify HIV-protease inhibitor drugs. Before the start of the HIV-1 protease inhibitor projects, pepstatin from the previously untapped Streptomyces argenteolus subsp. toyonakensis and Streptomyces testaceus species had been found to show weak human renin inhibitory activity, and more potent peptidomimetic human renin inhibitors had been derived from it by further optimizing the binding configuration to better mimic substrate binding [23]. Based on this work, a series of potent peptidomimetic HIV-protease inhibitors was designed [32,33], leading to a series of approved anti-HIV drugs, Saquinavir, Indinavir, Ritonavir, Nelfinavir, Amprenavir, and Lopinavir, in 1995-2000.
In 2001-2011, some new molecular scaffolds have been derived from both previously explored species and previously untapped species. For instance, the 4-anilinoquinazoline scaffold in Gefitinib, Erlotinib and Lapatinib has been derived based on the pharmacophore of a selective kinase inhibitor, olomoucine, a semisynthetic derivative of zeatin from maize, Cocos nucifera, Spinacia oleracea, and Pisum sativum, species that have been previously explored for deriving cardiovascular and anthelmintic agents [34][35][36]. The 2-phenylaminopyrimidine scaffold in Imatinib and Nilotinib has been derived based on the pharmacophore of a nonselective pan-kinase inhibitor, staurosporine, from the previously untapped Lentzea albida species [24,30,31]. During the same period, many of the existing molecular scaffolds (those found in nature-derived drugs approved before the period) have continued to contribute new approved drugs. Examples are the tetracycline scaffold of Minocycline, Methacycline and Tigecycline approved in 2001 and 2005 respectively, and the steroid scaffold of Falecalcitrol and Acetyldigitoxin approved in 2001 and 2002 respectively.
While most of the nature-derived drugs approved in 2001-2011 target previously explored pathways, some previously untargeted pathways have become highly successfully targeted. These are the MAPK, ErbB, mTOR, Bcr-Abl regulated, and hematopoietic cell lineage pathways, targeted by 7, 6, 3, 3 and 3 new drugs respectively. These pathways have become successfully targeted partly because of the derivation of selective kinase inhibitory scaffolds based on the pharmacophores of zeatin derivatives [34][35][36] and staurosporine [24,30,31], among others. Some previously successfully explored pathways continue to be successfully targeted. Specifically, the calcium signaling, insulin signaling, and Toll-like receptor pathways are targeted by 10, 4, and 6 drugs approved before 2001 and by 5, 4, and 4 drugs approved during 2001-2011 respectively.
Our analysis suggests that untapped drug-productive species are unlikely to be near extinction, and that there is a high probability of finding new drug-productive species in the existing drug-productive families and drug-productive clusters. New technologies that explore cryptic gene-clusters [11], pathways [12,17], inter-species crosstalk [11,18,19] and high-throughput fermentation [20] enable the generation of significantly more diverse groups of novel natural products, which is anticipated to have some impact on drug productivity from nature. Some of the revolutionary new drugs for novel targets, such as Gleevec and Gefitinib approved in recent years and Saquinavir and Paclitaxel from earlier years, as well as new drugs for existing targets, have been derived from bioactive natural products of previously untapped species. Retrospective analysis of the case histories of these drugs reveals useful strategies for deriving new drugs from these natural sources.