Suites of Terpene Synthases Explain Differential Terpenoid Production in Ginger and Turmeric Tissues

The essential oils of ginger (Zingiber officinale) and turmeric (Curcuma longa) contain a large variety of terpenoids, some of which possess anticancer, antiulcer, and antioxidant properties. Despite their importance, only four terpene synthases have been identified from the Zingiberaceae family: (+)-germacrene D synthase and (S)-β-bisabolene synthase from ginger rhizome, and α-humulene synthase and β-eudesmol synthase from shampoo ginger (Zingiber zerumbet) rhizome. We report the identification of 25 mono- and 18 sesquiterpene synthases from ginger and turmeric, with 13 and 11, respectively, being functionally characterized. Novel terpene synthases, (−)-caryolan-1-ol synthase and α-zingiberene/β-sesquiphellandrene synthase, which is responsible for formation of the major sesquiterpenoids in ginger and turmeric rhizomes, were also discovered. These suites of enzymes are responsible for formation of the majority of the terpenoids present in these two plants. Structures of several were modeled, and a comparison of sets of paralogs suggests how the terpene synthases in ginger and turmeric evolved. The most abundant and most important sesquiterpenoids in turmeric rhizomes, (+)-α-turmerone and (+)-β-turmerone, are produced from (−)-α-zingiberene and (−)-β-sesquiphellandrene, respectively, via α-zingiberene/β-sesquiphellandrene oxidase and a still unidentified dehydrogenase.


Introduction
Ginger (Zingiber officinale Rosc.) and turmeric (Curcuma longa L.) have been used for centuries to treat human ailments. Ginger is effective against symptoms of the common cold, fever, rheumatic disorders, gastrointestinal complications, motion sickness, diabetes, and cancer [1]. It has anti-bacterial [2] and anti-fungal [3] activities. Many of these medicinal activities, including anti-cancer and anti-inflammatory [4] activities, are believed to be due to the presence of active phenolic compounds such as the gingerols, paradols and shogaols [5,6,7]. However, terpenoids from ginger have also been reported to have important human health roles. For example, β-elemene arrests the cell cycle and induces apoptotic cell death in lung cancer cells [8] and elemene aids patients with chylothorax [9]. Zingiberene as well as [6]-gingerol significantly inhibited gastric lesions [10], and further research revealed (−)-β-sesquiphellandrene, β-bisabolene, ar-curcumene and 6-shogaol as antiulcer active principles in ginger [11]. Turmeric has anti-inflammatory [12] and anti-cancer [13] properties, which have been reported to be mainly due to the presence of curcumin, a diarylheptanoid. On the other hand, turmeric oil, reported to contain ar-turmerone, turmerone and curlone, showed antioxidant effects and anti-mutagenic action [14]. Turmeric oil, consisting largely of terpenoids, also has anti-bacterial activity [15]. Both curcuminoids and sesquiterpenoids in turmeric exhibit hypoglycemic effects via peroxisome proliferator-activated receptor-γ (PPAR-γ) activation and suppress an increase in blood glucose levels in type 2 diabetic KK-Ay mice. The effect was synergistic when both curcuminoids and sesquiterpenoids in turmeric were applied together [16]. ar-Turmerone from turmeric oil displays anti-tumorigenesis activity, inhibiting cell proliferation and activating ar-turmerone-mediated apoptotic protein in human lymphoma U937 cells [17]. It was also found that apoptosis was selectively induced by ar-turmerone in human leukemia Molt 4B and HL-60 cells, but not in human stomach cancer KATO III cells. ar-Turmerone also has antiplatelet activities that can prevent and treat arterial thrombosis [18].
Despite the importance of the sesquiterpenoids from these two plants to human health, the enzymes involved in their formation have not been previously identified. Terpene synthases (TPSs) are the entry point and perhaps most important type of enzymes leading to the various subclasses of terpenoids in these plants. However, only four sesquiterpene synthases (STPSs) have been characterized from the Zingiberaceae: (+)-germacrene D synthase [19] and (S)-β-bisabolene synthase [20] from culinary ginger (Zingiber officinale Rosc.) rhizome, and α-humulene synthase [21] and β-eudesmol synthase [22] from shampoo ginger (Z. zerumbet Smith) rhizome. These enzymes do not account for the major compounds produced in these species. For example, both ginger and turmeric produce large amounts of (−)-α-zingiberene and (−)-β-sesquiphellandrene. Turmeric also synthesizes appreciable quantities of α-turmerone and β-turmerone (Figure 1), which are also sometimes called tumerone and curlone, respectively [14,23].
These compounds are not the direct or downstream products of the four reported TPSs. In this report, we describe the identification and characterization of a large suite of TPS enzymes involved in the formation of the large array of terpenoids found in these plants, and elucidate the means by which the sesquiterpenoids α-turmerone and β-turmerone are formed in turmeric.

Plant Material
Ginger (Zingiber officinale Rosc.) and turmeric (Curcuma longa L.) were grown in a greenhouse for 5 to 7 months. Two varieties of culinary ginger (white ginger and yellow ginger) were used, which are different from the white and yellow "gingers" (not Zingiber species at all) that have white and yellow flowers, respectively. The white and yellow gingers used in this study are culinary and medicinal varieties of ginger that have green inflorescences and are morphologically very similar to each other, except that they have slightly different rhizome colors. Hawaiian red turmeric (HRT) was used for cloning genes and "fat mild orange" (FMO) turmeric was used for GC/MS analysis, as in Figure 1. These two "varieties" are really the same clonal line, obtained from two different organic ginger growers in Hawaii: Dean Pinner of Pinner Creek Organics and Hugh Johnson of Puna Organics, respectively. The GC/MS total ion chromatograms of FMO and HRT are essentially identical.

Cloning of Full Length cDNAs
Some unitrans identified in the ginger and turmeric EST databases were homologous to TPSs and appeared to be full length, although others were incomplete. For those genes missing the 5′ and/or 3′ end sequences, the SMART RACE (Rapid Amplification of cDNA Ends) method (Clontech) was used to find the missing end(s), except for the unitrans ST00. 5′ RACE-ready cDNAs and 3′ RACE-ready cDNAs were synthesized from 8 different total RNAs (GW-Rh, GW-R, GW-L, GY-Rh, GY-R, GY-L, T-Rh and T-L, where GW: white ginger, GY: yellow ginger, T: turmeric, Rh: rhizome, R: root, L: leaf) extracted with the RNeasy Plant Mini kit (Qiagen), using SuperScript III reverse transcriptase (Invitrogen) for 3′ RACE-ready cDNAs and SuperScript II reverse transcriptase (Invitrogen) for 5′ RACE-ready cDNAs. 3′ RACE CDS (AAGCAGTGGTATCAACGCAGAGTAC(T)30VN), 5′ RACE CDS ((T)25VN), and SMART II™ A Oligonucleotide (AAGCAGTGGTATCAACGCAGAGTACGCGGG) were used for RACE-ready cDNA synthesis according to the manufacturer's protocol. Amplifications of either the 3′ or 5′ end were carried out using the Advantage 2 PCR kit (Clontech) with Universal Primer A Mix (UPM, Long: CTAATACGACTCACTATAGGGCAAGCAGTGGTATCAACGCAGAGT, Short: CTAATACGACTCACTATAGGGC) and gene-specific primers for RACE (Table S1). Using RACE products, a second round of PCR was done with gene-specific nested primers for RACE (Table S1) and either N-UP (5′-AAGCAGTGGTATCAACGCAGAGT-3′), UP-M (5′-CACTATAGGGCAAGCAGTGGT-3′), or UP-S (5′-CTAATACGACTCACTATAGGGC-3′). PCR products of the expected size were eluted using the MinElute PCR purification kit (Qiagen) and inserted into the pCR2.1-TOPO vector (Invitrogen). For ST00, we found the 5′ end by a different method, using the ZO__Ed (T-Rh) cDNA library stock. Using Zd01L13RR, a ST00-specific primer (Table S2), and UPM-T7 (5′-CTAATACGACTCACTATAGGGCGTAATACGACTCACTATAGGGCGAATTG-3′), a cloning vector-specific primer, we amplified a ST00-specific fragment, which was sub-cloned as described above.
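The nesting of the universal primers can be read directly from the sequences printed above. The following is a minimal sketch, using only those printed sequences, that locates each second-round universal primer within the long UPM oligonucleotide. The function name `primer_offsets` is hypothetical and introduced only for illustration; the check describes the sequences as printed here and is not a statement about the manufacturer's design.

```python
# Universal primer sequences (5'->3') exactly as given in the text above.
UPM_LONG = "CTAATACGACTCACTATAGGGCAAGCAGTGGTATCAACGCAGAGT"
UPM_SHORT = "CTAATACGACTCACTATAGGGC"
N_UP = "AAGCAGTGGTATCAACGCAGAGT"
UP_M = "CACTATAGGGCAAGCAGTGGT"
UP_S = "CTAATACGACTCACTATAGGGC"


def primer_offsets(template, primers):
    """Return the 0-based start of each primer in `template` (-1 if absent).

    `primer_offsets` is a hypothetical helper, not part of any RACE kit.
    """
    return {name: template.find(seq) for name, seq in primers.items()}


offsets = primer_offsets(
    UPM_LONG,
    {"UPM short": UPM_SHORT, "N-UP": N_UP, "UP-M": UP_M, "UP-S": UP_S},
)
for name, start in offsets.items():
    print(f"{name}: starts at position {start} of the long UPM primer")
```

As printed, the long UPM primer is the short UPM primer followed by N-UP, and UP-M spans the junction between the two, so each second-round universal primer anneals within the region introduced in the first round.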
After sequencing and confirmation of 5′ and/or 3′ sequences, full length cDNAs were amplified from either 5′ RACE-ready cDNA or 3′ RACE-ready cDNA using Pfu thermostable polymerase with two gene-specific primer sets (Table S2) and were inserted into the pCR2.1-TOPO vector. In this manner, full length mono- and sesquiterpene synthases were cloned into the pCR2.1-TOPO vector. Truncated monoterpene synthases (with plastidial peptides removed) and full length sesquiterpene synthases were sub-cloned into expression vectors such as pCRT7/CT-TOPO vector (Invitrogen), pEXP5/CT-TOPO vector (Invitrogen), pET101/D-TOPO vector (Invitrogen) and/or pH9GW vector for expression in E. coli. For pH9GW-MT00, the PCR product produced using the 10N21GtwyAttF and 10N21G-AttR primers (Table S2) was first cloned using the Gateway BP reaction by Gateway BP clonase II enzyme mix (Invitrogen) with pDONR207 vector (Invitrogen) to yield pENTR207-10N21, and then the pENTR207-10N21-based Gateway LR reaction was performed using Gateway LR clonase II enzyme mix (Invitrogen) with pH9GW. For pH9GW-Zc05I02tt, the PCR product with Zc05I02tFt and Zc05I02tR primers (Table S2) was first inserted into pENTR/D-TOPO vector (Invitrogen) to produce pENTR-Zc05I02tt vector, which was used to produce pH9GW-Zc05I02tt using the Gateway LR reaction with Gateway LR clonase II enzyme mix and pH9GW. There are two versions of pCRT7CT-MT11: with His-tag or without His-tag at the 3′ end due to absence or presence of a stop codon in gene-specific reverse primers, Zc07C01CT-R and Zc07C01tR, respectively (Table S2). Several genes were also cloned into the pESC-URA vector (Stratagene). PCR products amplified by appropriate primer pairs (Table S2) were sub-cloned into pCR8/GW-TOPO (Invitrogen) and fragments produced by digestion of the resulting constructs with BamHI (NEB) and XmaI (NEB) were sub-cloned into pESC-URA vector digested with BamHI and XmaI.
E. coli was grown at 37 °C to an OD of 0.6 at 600 nm and induced for 18 h at 18 °C with IPTG (0.05 mM to 1 mM) and, for BL21-AI derived strains, 0.2% arabinose.

Enzyme Assays of Terpene Synthases Expressed in E. coli
Overnight grown E. coli cultures that had been induced to express recombinant proteins were centrifuged to collect cell pellets. The pellets were vortexed with Washing Buffer (20 mM Tris-HCl, pH 7.0, 50 mM KCl) and then centrifuged. Protein Extraction Buffer (50 mM 3-(N-morpholino)-2-hydroxypropanesulfonic acid, pH 7.0, 10% [v/v] glycerol, 5 mM MgCl2, 5 mM DTT, 5 mM sodium ascorbate, 0.5 mM phenylmethylsulfonyl fluoride) was added to washed E. coli pellets, which were then vortexed, sonicated and centrifuged. Supernatant was recovered and the buffer was changed to Enzyme Assay Buffer (10 mM 3-(N-morpholino)-2-hydroxypropanesulfonic acid, pH 7.0, 10% [v/v] glycerol, 1 mM DTT) using PD-10 columns (GE Healthcare Life Sciences). Divalent cations (20 mM MgCl2 and/or 0.5 mM MnCl2 at final concentration), phosphatase inhibitors (0.2 mM NaWO4, 0.1 mM NaF at final concentration) and either geranyl diphosphate (GPP, 10 µg) or farnesyl diphosphate (FPP, 10 µg) were added to a total of 500 µl of Enzyme Assay Buffer containing soluble proteins and incubated for 3 h at 30 °C with 200 µl of top-layered pentane. The top pentane phase was removed directly at the end of the assay time (or vortexed with the aqueous phase and then centrifuged prior to removal) and was used for metabolite analysis.

Terpenoid Analysis
A Thermo Finnigan Trace GC 2000 with a Rtx-5MS w/5 m Integra-Guard Column (Restek, 0.25 mm ID, 0.25 µm df, 30 m) coupled to a DSQ mass spectrometer was used for gas chromatography/mass spectrometry (GC/MS) analysis, using methods previously described [26]. A chiral column, Rt-βDEXse (Restek, 0.25 mm ID, 0.25 µm df, 30 m), was used for determining enantiomers of linalool and caryolan-1-ol. Eluted compounds were identified by comparison of resulting mass spectra to the NIST/EPA/NIH Mass Spectral Library (NIST 02) and the essential oil GC/MS mass spectra library from Dr. Robert P. Adams [27]. For peak identification, we used both mass spectra similarity and peak retention time indices unless we specify the use of authentic standards. Adams' essential oil library has a retention time index. An example of how the retention time index was used with mass spectra similarity for peak identification is shown in Figure S1. Unless an authentic standard was used, all identifications should be viewed as tentative, although we believe that for the important compounds discussed in this manuscript, they are indeed correct.
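To illustrate how a retention index can be combined with mass spectral similarity, the sketch below computes a linear (temperature-programmed) retention index by interpolating between bracketing n-alkanes, then applies a simple two-criterion filter. This is an assumption-laden illustration rather than the procedure used in this study: the text does not state which retention index formula, tolerance, or similarity cutoff was applied, and the function names, the `ri_tol` and `min_similarity` defaults, and the alkane retention times are all hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear retention index of a peak eluting at retention time `rt`.

    `alkane_rts` maps n-alkane carbon number -> retention time, measured
    under the same GC program (times must increase with carbon number).
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    i = bisect_right(times, rt) - 1  # index of the last alkane eluting at or before rt
    if i < 0:
        raise ValueError("peak elutes before the first alkane standard")
    if i == len(carbons) - 1:
        if rt == times[-1]:
            return 100.0 * carbons[-1]
        raise ValueError("peak elutes after the last alkane standard")
    n, n_next = carbons[i], carbons[i + 1]
    frac = (rt - times[i]) / (times[i + 1] - times[i])
    return 100.0 * (n + (n_next - n) * frac)


def is_plausible_match(ri_obs, ri_lib, similarity, ri_tol=10.0, min_similarity=0.8):
    """Accept a library hit only if both RI and spectral similarity agree.

    The default thresholds are illustrative assumptions, not values from the text.
    """
    return abs(ri_obs - ri_lib) <= ri_tol and similarity >= min_similarity


# Hypothetical alkane ladder (carbon number -> retention time in minutes).
ladder = {14: 20.0, 15: 24.0, 16: 28.0}
ri = linear_retention_index(22.0, ladder)
print(ri)  # 1450.0
```

Requiring both criteria is what makes the identification more robust than a spectral match alone, since many sesquiterpene isomers give very similar mass spectra but differ in retention index.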

Western Blot Analysis for Terpene Synthases Expressed in Yeast
Proteins from yeast expressing terpene synthases were extracted using acid-washed glass beads (425-600 µm, 30-40 U.S. sieve) (Sigma). Yeast cell pellets from 10 to 15 ml of media were vortexed 15 times for 30 seconds on ice with 0.5 g of acid-washed glass beads and Protein Extraction Buffer (see above). Protein concentrations were determined by the Bradford Protein Assay (Bio-Rad). After 10 µg of total and soluble proteins were run on SDS-PAGE, the gel was blotted onto PVDF Transfer Membrane (0.45 µm, Thermo Pierce) with transfer buffer (

Protein Structural Modeling
SWISS-MODEL [28] was used to model the putative protein structures and UCSF Chimera [28] was used to visualize the models.

Cloning and Expression of Ginger and Turmeric TPSs
In efforts to identify how the large array of mono- and sesquiterpenoids (Figure 1) in ginger and turmeric are produced, we first searched a database of 50,139 expressed sequence tags (ESTs) from ginger and turmeric tissues (http://www.agcol.arizona.edu/cgi-bin/pave/GT/index.cgi), and identified many putative terpene synthases (TPSs) that would be expected to catalyze the formation of monoterpenoids (20 unique transcripts [unitrans]), sesquiterpenoids (10 unitrans), diterpenoids (2 unitrans), triterpenoids (3 unitrans), and tetraterpenoids (10 unitrans). Two monoterpene synthases (MTPSs) and no sesquiterpene synthase (STPS) were represented by full length sequences in the database. The rest of the identified TPS genes either required RACE or Genome Walking followed by RT-PCR to obtain full length cDNAs or did not yield full length clones after these efforts (Table S3). During the cloning process, we sequenced several independent clones for each unitrans and found that some exist as multiple paralogs and/or alleles in these species. In some cases, only the paralog(s) could be cloned, whereas the gene represented by the sequence in the original database could not. Because we often found more than two sequences during the RT-PCR-based cloning, we are confident that most of these are indeed paralogs, and not merely allelic pairs. Turmeric is sterile and likely a nonaploid (x = 7, 2n = 9x = 63) [29], thus providing an explanation for why so many related sequences could be found in one species.
A similarity tree generated from a large set of plant TPSs and ginger and turmeric TPS genes including full length genes has one MTPS cluster and one STPS cluster (Figure 2). One MTPS, called MT00, is a linalool/nerolidol synthase and the outgroup relative to other MTPSs. All of the ginger and turmeric TPS proteins possess the conserved DDXXD motif required for interaction with the diphosphate group of the substrate (Figure S2). All MTPSs, except for MT00, have a transit peptide and the conserved RRX8W motif in the 5′ region. Most STPSs have the RX9W motif except for ST01, β-selinene synthase. Although MT00 does not possess the conserved tryptophan of the RRX8W motif, ST01 does.
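A motif scan of this kind can be sketched with regular expressions, where X stands for any residue. The example below is a minimal illustration and not the analysis pipeline used in this study; the helper `find_motifs` is hypothetical, and the toy sequence is invented purely to exercise the patterns and is not a real terpene synthase.

```python
import re

# X = any amino acid; RRX8W = two arginines, eight arbitrary residues, then tryptophan.
MOTIFS = {
    "DDXXD": re.compile(r"DD..D"),
    "RRX8W": re.compile(r"RR.{8}W"),
}


def find_motifs(protein_seq):
    """Return 0-based start positions of each motif in a one-letter protein sequence."""
    seq = protein_seq.upper()
    return {
        name: [m.start() for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Invented toy sequence (not a real TPS), used only to demonstrate the scan.
TOY_SEQ = "MSRRSANYQPSIWAAADDIYDK"
hits = find_motifs(TOY_SEQ)
print(hits)  # {'DDXXD': [16], 'RRX8W': [2]}
```

An empty list for a motif would flag a sequence such as MT00, which the text describes as lacking the conserved tryptophan of the RRX8W motif.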
The major products produced by ginger and turmeric TPSs are summarized in Tables 1 and 2. When provided the alternative substrate in vitro, some MTPSs produce sesquiterpenes whereas some STPSs synthesize monoterpenes. However, based on the results outlined below, it is likely that such functions have little or no relevance to production of these compounds in vivo in most cases. Although several of the proteins expressed well in E. coli, many required expression in yeast in order to produce soluble and functional enzymes. More detailed descriptions of expression and analysis of many specific TPS genes are included in Results S1.

Functions of TPSs Explain Formation of Major Terpenoids in Ginger and Turmeric
As seen in Figure 1, the most abundant terpenes in ginger and turmeric are (−)-α-zingiberene and (−)-β-sesquiphellandrene, based on GC/MS peak areas in total ion chromatograms of extracts from these plants. Two very similar TPS proteins, ST00A and ST00B, with 98.4% similarity to each other and high similarity to known STPSs, and lacking a chloroplastic transit peptide, were cloned from turmeric rhizome. These expressed well in E. coli and synthesized the same products in vitro when supplied FPP as substrate: (−)-α-zingiberene (49.3%), (−)-β-sesquiphellandrene (40.7%) and β-bisabolene (6.3%) as major products (Figure S2). When expressed in the yeast strain EPY219, which supplies endogenous prenyl diphosphate precursors, GPP and FPP, for production of terpenoids in vivo, these proteins produced the same major products, although the ratios of the products differed according to expression temperature and induction time, e.g., (−)-α-zingiberene (67%), (−)-β-sesquiphellandrene (22.7%), and β-bisabolene (6.2%) after 4 days of induction at 18 °C (Figure 3; Figure S3). When expressed in other yeast strains or under different expression temperatures and induction times, the ratios of these products varied, as did the amounts of other minor products of these enzymes. The other STPS genes identified in our database did not produce (−)-α-zingiberene or (−)-β-sesquiphellandrene when expressed in E. coli or in various yeast strains. Thus, it appears that these two enzymes and their paralogs are responsible for formation of the most abundant terpenes in the rhizomes of these plants. As outlined in a later section, these compounds appear to be the precursors for the major oxygenated terpenoids in turmeric as well, indicating a major role for these enzymes in determining that plant's terpenoid profile. A recently reported terpene synthase from sorghum, SbTPS1, also produces (−)-α-zingiberene and (−)-β-sesquiphellandrene as major products in very similar ratios to ST00A/B, although SbTPS1 is only 43% identical with ST00A [30,31].
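Product percentages of this kind are typically relative peak areas. The sketch below shows one common way to turn integrated peak areas into percent composition; it assumes that percentages are each peak's share of the summed product peak areas, which the text does not state explicitly for the enzyme assays. The function name and the example areas are hypothetical and are not measurements from this study.

```python
def percent_composition(peak_areas):
    """Convert integrated peak areas into percent of total product area.

    `peak_areas` maps compound name -> integrated peak area (arbitrary units).
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical areas in arbitrary units, chosen only to illustrate the arithmetic.
example = percent_composition({"product A": 1.0, "product B": 3.0})
print(example)  # {'product A': 25.0, 'product B': 75.0}
```

Because the result is normalized to the detected products, the percentages shift whenever minor products appear or disappear, which is one reason the reported ratios depend on expression host, temperature and induction time.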
β-Phellandrene, the major product of the protein designated MT08, is the second most abundant monoterpene in ginger rhizomes (Figure 1). This protein, a MTPS with a chloroplastic targeting peptide, was very difficult to express in soluble form. However, when finally expressed and assayed, it produced both β-phellandrene (88.3%) and α-pinene (11.7%) (Table 1, Figure S5). No other MTPS produced β-phellandrene as a major product.
(−)-β-Phellandrene and (−)-β-sesquiphellandrene are structurally identical, except that the latter has a longer tail. β-Phellandrene can also be produced by the two α-zingiberene/β-sesquiphellandrene synthases (ST00A and ST00B) described above, which make β-phellandrene as 49.2% of their monoterpene production when provided GPP as substrate. However, GPP is a very poor substrate for these enzymes in general. Considering that monoterpenes are mainly produced in plastids, and that ST00A and ST00B are not targeted to that organelle whereas MT08 is, the latter is likely to be the main enzyme responsible for formation of β-phellandrene in ginger rhizome. Microarray data also show high expression of MT13 (= MT08) in ginger rhizome over time (Table S4).
Camphene, the most abundant monoterpene in ginger, was produced by three different enzymes, MT06B, MT09A2 and MT12A-M2, which all function as camphene/α-pinene synthases and produce camphene as the major product when provided with GPP as substrate (Table 1, Figures S6, S7 and S8) but were not able to use FPP as a substrate. In contrast, the enzyme MT04 produces no camphene, but produces α-pinene (60.1%) and β-pinene (30.7%) as major products (Table 1, Figure S9). It is likely that most of the α-pinene accumulated by ginger rhizomes is produced by the camphene/α-pinene synthases (MT06B, MT09A2 and MT12A-M2). The β-pinene peak from ginger rhizome samples is very small (Figure 1). Moreover, the amount of α-pinene produced by MT04 is only about twice that of β-pinene, which suggests that MT04 contributes at most ~21% of the α-pinene produced in ginger rhizomes. In contrast, ginger roots accumulate both α-pinene and β-pinene at significantly higher levels compared to camphene (Figure 1). Differences between extracts from plants of different ages supported this observation, where MT04 products (α-pinene and β-pinene) are more abundant in roots of 2 month old yellow ginger plants when compared to 7 month old plants; 5 times more for β-pinene. These results were supported by microarray results for tissues from these plants (with 7-fold higher expression in the younger plants, for MT04; Table S4). It thus appears that MT04 synthesizes most of the β-pinene found in ginger rhizomes and roots, whereas α-pinene accumulation is the result of the action of several enzymes. β-Pinene is also the main monoterpene in turmeric leaves, but MT04 was not expressed at all in turmeric, suggesting that turmeric possesses an as yet unidentified leaf-specific β-pinene synthase.
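The upper-bound argument for the MT04 contribution to α-pinene can be written out as a short calculation. If all β-pinene in a tissue is attributed to MT04, then MT04 can account for at most (60.1/30.7) times the β-pinene peak area as α-pinene. The sketch below encodes that reasoning under this assumption; the function name and the peak areas in the example are hypothetical placeholders, not values measured in this study.

```python
# In vitro product percentages for MT04, as reported in the text.
MT04_ALPHA_PINENE_PCT = 60.1
MT04_BETA_PINENE_PCT = 30.7


def max_mt04_alpha_fraction(alpha_area, beta_area):
    """Upper bound on the fraction of tissue alpha-pinene attributable to MT04.

    Assumes every unit of beta-pinene in the tissue comes from MT04, so MT04-derived
    alpha-pinene cannot exceed (alpha:beta product ratio) * beta-pinene peak area.
    """
    if alpha_area <= 0:
        raise ValueError("alpha-pinene peak area must be positive")
    ratio = MT04_ALPHA_PINENE_PCT / MT04_BETA_PINENE_PCT  # about 1.96
    max_alpha_from_mt04 = ratio * beta_area
    return min(1.0, max_alpha_from_mt04 / alpha_area)


# Hypothetical peak areas (arbitrary units), for illustration only.
bound = max_mt04_alpha_fraction(alpha_area=100.0, beta_area=10.0)
print(f"MT04 accounts for at most {bound:.1%} of alpha-pinene")
```

The bound is tight only if no other enzyme makes β-pinene in that tissue; any additional β-pinene source would lower the MT04 share of α-pinene further.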
α-Phellandrene is the most abundant monoterpene in turmeric rhizome (Figure 1). Although ginger contains α-phellandrene, it is not a major product. The enzyme designated MT03 synthesizes α-phellandrene (92.2%) as a major product, with β-phellandrene (3.8%) and several other compounds as minor products (Table 1, Figure S11). FPP was not a substrate for MT03. Microarray data suggest high expression of MT10 (= MT03, see Table S3) at early stages of turmeric rhizome development, with a reduction in expression over time (Table S4). The levels of α-phellandrene also decrease over development in turmeric rhizome, although this decrease was not as great in magnitude as that observed for the transcript level. It is possible that turmeric rhizome stores α-phellandrene in the rhizome, leading to accumulation in older rhizomes. The expression levels of MT03 in turmeric root and leaf are lower than that observed for turmeric rhizome, and α-phellandrene amounts in root and leaf of 7 month old turmeric plants are 28.7% and 14.5% of that observed for the rhizome of the same plants.
Figure 2. Similarity tree of cloned full length mono- and sesquiterpene synthases and unitrans sequences of di-, tri- and tetraterpene synthases from ginger and turmeric. The neighbor-joining tree was generated by ClustalX with ginger and turmeric sequences and 181 additional TPSs from GenBank. The non-ginger/turmeric TPSs were removed from the tree for clarity. The STPS cluster is separate from other terpene synthases. Linalool/nerolidol synthase (MT00) is located outside of the MTPS cluster. Major product(s) of each corresponding recombinant protein is/are shown next to each gene name, as is the tissue used for cloning. doi:10.1371/journal.pone.0051481.g002
Table 1. Monoterpenes produced by ginger and turmeric terpene synthases.
Table 2. Sesquiterpenes produced by ginger and turmeric terpene synthases.
Ginger rhizome and leaf and turmeric leaf produce small amounts of linalool (Figure 1). Two MTPSs, MT00 and MT17 from turmeric leaf, produce linalool from GPP. The best hit from a BLAST search of MT00 against public sequence databases is to another linalool synthase ((3S)-linalool/(E)-nerolidol synthase [Vitis vinifera] [32]). Enzyme assays with recombinant MT00 produced linalool (100%) with GPP as a substrate and (E)-nerolidol (100%) with FPP as a substrate (Tables 1 and 2, Figure S13). MT17, in contrast, is more similar to other monoterpene synthases than to other known linalool synthases. Efforts to clone a full length cDNA for MT17 yielded diverse paralogs: MT17A, MT17A2, MT17B, MT17B2, MT17C and MT17D (Figure 2). In enzyme assays, MT17A2 synthesized linalool (100%) with GPP and a variety of sesquiterpenes with FPP as substrate (Tables 1 and 2, Figure S14), with cis-α-bisabolene (22.7%), trans-α-bergamotene (20.6%), β-bisabolene (13.6%) and epi-α-bisabolol (12.6%) as the major products. When a chiral column was used to analyze the products of these assays, we found that MT00 produces (S)-(+)-linalool while MT17A2 synthesizes (R)-(−)-linalool (Figure 4). Thus, these are indeed quite different enzymes, yielding different products, which are both unfortunately trivially called "linalool". The evolution of MT00 is likely to have followed the same course as the Vitis vinifera S-linalool synthase, whereas MT17 is like other known R-linalool synthases, which have evolved very recently in their respective plant species from other MTPSs.
MT00 and MT17 are not expressed in ginger tissues. Instead, MT06/MT06A, which produces 24.8% linalool, is likely to be the enzyme responsible for formation of linalool in ginger. MT06 and MT06A are 98.0% identical; the only differences are found in the transit peptide sequences (Figure S2), and thus they appear to be paralogs. MT06 and MT06A produce a variety of monoterpenoids and sesquiterpenoids (Tables 1 and 2, Figures S15 and S1). Several MT06/MT06A products, such as α-thujene (3-thujene), p-menth-1-en-4-ol (terpinen-4-ol), β-curcumene, epi-β-bisabolol and some unknown compounds, were only produced by MT06/MT06A among the recombinant proteins that we characterized. Although it is possible that there are other yet unidentified enzymes that specifically synthesize these products, these are minor products in ginger and turmeric tissues and it is likely that they are produced by the paralogs MT06/MT06A.
There are small amounts of epi-α-bisabolol and α-bisabolol in ginger and turmeric rhizomes. MT02A was difficult to express in soluble form, but produced the sesquiterpenoids epi-α-bisabolol (58.3%) and α-bisabolol (38.7%) as major products when expressed in yeast strain EPY224. MT02A also synthesized trace amounts of (Z)-α-bisabolene (1.7%), β-bisabolene (1.1%), and trans-α-bergamotene (0.1%) (Figure S16). Based on sequence similarity, MT02A is classified as a monoterpene synthase; however, it did not produce monoterpenes in yeast. This may be due to the properties of the enzyme or because yeast may have limited ability to produce monoterpenes [33,34].
The sesquiterpene β-selinene (eudesma-4(14),11-diene) is produced at detectable levels in ginger rhizome, but is found only at trace amounts in ginger root and is not detectable in ginger leaf or turmeric tissues. Recombinant ST01 synthesized β-selinene (eudesma-4(14),11-diene) (51.9%) as the major product when FPP was used as substrate (Table 2, Figure S17). With GPP as a substrate, ST01 did not produce any detectable product. ST01 is expressed at higher levels in ginger rhizome than ginger root and leaf and is not expressed in turmeric, according to microarray data (Table S4), supporting the role of this enzyme in production of β-selinene in vivo.
None of the STPS genes were represented by complete sequences in the EST database, requiring additional efforts such as RACE or genome walking to yield full length cDNA sequences. During the cloning of one such gene, designated ST02, many paralogs were found (ST02A, ST02A2, ST02A3, ST02A4, ST02B, ST02B2-FS, ST02C and ST02C2), which had similar yet distinct product profiles. Although many of the corresponding recombinant proteins for these genes were insoluble, some were soluble or partially soluble in E. coli or yeast. Yeast strain EPY224 expressing ST02A4 produced (−)-neointermedeol (48.7%) as a major product and a long list of minor products (Table 2, Figure S18). Enzyme assays using E. coli crude extract expressing ST02B with GPP as a substrate produced several monoterpenes, with linalool (25.1%), myrcene (25.0%) and limonene (15.5%) as major products (Table 1, Figure S19). With FPP as a substrate, ST02B also produced several sesquiterpenes, with α-elemol (44.3%) as the major product (Figure S20). Enzyme assays using E. coli crude extracts expressing ST02C with GPP as a substrate produced several monoterpenes, with myrcene (30.4%), limonene (17.9%) and linalool (15.6%) as major products (Table 1, Figure S21). Similar assays for ST02C with FPP as a substrate produced β-elemene (49.3%) and germacrene D (12.4%) as major products and a long list of minor products (Table 2, Figure S22). It is likely that the β-elemene detected in these assays is a thermal degradation product of germacrene A [35,36] (Figure 5), because our GC/MS inlet was initially set to a high temperature (220 °C). Reducing the GC inlet temperature (to 150 °C), as has been suggested [37], led to a decrease in the amount of β-elemene detected with an increase in germacrene A detected in our system. Although the amount of germacrene A was still small and the reduction of β-elemene not so dramatic, it is likely that the β-elemene that we detected may indeed result from thermal degradation of germacrene A. Similarly, α-elemol is likely the thermal breakdown product of (+)-hedycaryol (Figure 5). However, the low inlet temperature led to greatly reduced sensitivity, so the relative product amounts shown in Table 2 are based on injection at 220 °C.
Based on the microarray data, ST02 was expressed 8-, 120- and 36-fold higher in ginger rhizome compared to ginger root, ginger leaf and turmeric leaf, respectively. The microarray data indicated lack of expression of ST02 in turmeric rhizome and root (Table S4). Based on ginger and turmeric terpene synthase product profiles, several terpenes were exclusively produced by ST02A4, ST02B or ST02C, including δ-elemene, (+)-cyclosativene, α-copaene, γ-elemene, γ-muurolene, α-muurolene, δ-cadinene (cadina-1(10),4-diene), α-elemol and germacrene B, although some are produced at very low levels. Among these compounds, γ-muurolene and δ-cadinene (cadina-1(10),4-diene) were not detected from ginger or turmeric samples, α-copaene was detected in both ginger rhizome and root samples, and all others are only detected in ginger rhizome samples (Figure 1). When these results are considered together, it appears that diverse ST02 paralogs play an important role in enriching sesquiterpene diversity in ginger rhizome.
The sesquiterpene synthase designated ST03 is very similar (89%-95% identical) to ST02 derived genes. However, recombinantly expressed ST03 synthesizes different products compared to the ST02 enzymes. Enzyme assays with FPP as substrate and using crude E. coli extract containing expressed ST03 yielded γ-amorphene (65.4%) as the major product, with allo-aromadendrene (11.8%), germacrene D-4-ol (9.6%), γ-cadinene (8.7%) and germacrene D (4.4%) also produced at appreciable levels (Table 2, Figure S23). Comparable assays with GPP as substrate yielded a number of monoterpenoids (Table 1, Figure S24), although this enzyme is not likely transported to the plastid. The major sesquiterpene product of this enzyme, γ-amorphene, is present at low levels in ginger, and is barely detectable in turmeric (Figure 1). It is difficult to detect γ-amorphene in extracts from ginger rhizome (the tissue where the ST03 gene is most highly expressed) because it elutes from the GC column immediately after the very large (−)-α-zingiberene peak. Two products of ST03, allo-aromadendrene and γ-cadinene, are more readily detected, with allo-aromadendrene present in ginger rhizome, root and leaf and γ-cadinene in ginger rhizome. ST03, which is expressed in ginger rhizome, root and leaf based on microarray data (Table S4), is the likely source of these compounds. As expected based on the chemical profile, ST03 was not expressed in turmeric according to microarray data.
α-Humulene (also called α-caryophyllene) is present in all turmeric tissues and in ginger rhizome at very low levels. However, it is the most abundant sesquiterpene in ginger roots. ST05 (and close paralog ST05A), which produces α-humulene in vitro, was expressed at much higher levels in ginger root than in other tissues (Table S4). ST05 and ST05A are very similar paralogs: 99.1% and 98.1% identity at the DNA and amino acid sequence levels, respectively. Although they are very similar, their solubilities are different. ST05A is barely soluble and ST05 is quite soluble (Table S5). Therefore, we tried to purify ST05A using a HIS-tag. However, the amount of soluble ST05A was small, and some E. coli proteins co-eluted, leading only to partial purification. Enzyme assays with crude E. coli extracts containing expressed recombinant ST05 or with partially purified ST05A produced very similar results. These assays did not produce detectable monoterpenes with GPP as a substrate. With FPP as a substrate, however, ST05 synthesized α-humulene (α-caryophyllene) (83.4%) as the major product and (E)-caryophyllene (β-caryophyllene) (14.2%), β-elemene (1.5%) and 1,5,9-trimethyl-1,5,9-cyclododecatriene (0.8%) as minor products (Table 2, Figure S25). ST05A showed a similar product profile, except for the last compound, which was not detected. An α-humulene synthase from shampoo ginger (Zingiber zerumbet Smith) has 91% similarity to ST05 and ST05A. That enzyme was reported to produce α-humulene (α-caryophyllene) as major product and small amounts of (E)-caryophyllene (β-caryophyllene) [21]. However, shampoo ginger α-humulene synthase did not synthesize β-elemene or 1,5,9-trimethyl-1,5,9-cyclododecatriene. It is possible that these compounds were not detected due to very low abundance, just as we could not detect 1,5,9-trimethyl-1,5,9-cyclododecatriene in our ST05A assays.
Figure 6. Proposed mechanism for the formation of caryophyllene related compounds, the proposed major products of ST07/ST07A. The mass spectrum of the major product of ST07A is very similar to caryophyllenyl alcohol, caryolan-1-ol and caryolan-8-ol. However, it seems unlikely that caryophyllenyl alcohol would be produced by a single terpene synthase enzyme. doi:10.1371/journal.pone.0051481.g006
Interestingly, (E)-caryophyllene is the most abundant sesquiterpene in ginger leaf but there are only small amounts of α-humulene in ginger leaves. This suggests that either α-humulene is converted in ginger leaves into another compound (which we could not detect in our metabolite profiling experiments) or ginger leaves have a different terpene synthase that produces (E)-caryophyllene as a major product.
A different pair of paralogous STPSs, ST07 and ST07A (97% identical to each other), is not expressed in turmeric but is expressed at high levels in ginger root and leaf tissues (Table S4). These expression patterns are similar to (E)-caryophyllene production patterns in ginger and turmeric tissues, suggesting that ST07 and ST07A might be good candidates for involvement in production of (E)-caryophyllene. These proteins were essentially insoluble in E. coli expression systems (several were tested; see Table S5). Despite their high sequence similarity, ST07A expressed in yeast was highly soluble, whereas ST07 expressed in yeast was much less soluble when checked by Western blotting. Based on GC/MS library searches, the best hits for the major product of yeast-expressed ST07 and ST07A (see Figures S26 and S27) are caryophyllenyl alcohol, caryolan-1-ol or caryolan-8-ol, all of which are oxygenated forms of (E)-caryophyllene. It seems unlikely that caryophyllenyl alcohol could be produced directly by a terpene synthase, whereas caryolan-1-ol, caryolan-8-ol or other oxygenated forms of (E)-caryophyllene (Compound A, Compound B and Compound C shown in Figure 6) can be made by a terpene synthase. We could not find oxygenated (E)-caryophyllene derivatives in ginger and turmeric except for caryophyllene oxide. Therefore, it seemed possible that ST07 and ST07A produce (E)-caryophyllene, which is observed in all tissues of both ginger and turmeric, and that the (E)-caryophyllene produced in yeast is then oxygenated by yeast proteins in the expression system used. However, (E)-caryophyllene has been produced in yeast without signs of being oxygenated [36]. Therefore, ST07 and ST07A do indeed appear to produce oxygenated forms of (E)-caryophyllene. In ginger, the product of ST07 and ST07A, an oxygenated (E)-caryophyllene, could be further processed by other enzymes and would therefore not be detected in our GC/MS analysis shown in Figure 1.
Identification of α-zingiberene/β-sesquiphellandrene Oxidase
As discussed above, (+)-α-turmerone and (+)-β-turmerone appear to be produced by the oxidation of (−)-α-zingiberene and (−)-β-sesquiphellandrene, respectively. By comparing EST data, microarray data and metabolite data, four P450 monooxygenases were selected from 170 ginger and turmeric P450s as the most likely candidates to be involved in formation of these two sesquiterpenoids. These four monooxygenases were named P1, P2, P3 and P4. P2 and P3 were partial clones in the EST database, missing 5′ ends, and genome walking revealed the complete sequences. P1, P2 and P4 are very similar to each other and belong to the clade (CYP71D) that contains limonene hydroxylase, which forms a secondary hydroxyl group. P3 differs from the other three P450s and belongs to the clade (CYP71AV1) that contains amorphadiene oxidase. Amorphadiene oxidase catalyzes a three-step oxidation (primary alcohol → aldehyde → acid) at the end of a hydrocarbon chain [25] (although the latter two oxidations are much less favorable than the first), whereas limonene hydroxylase forms a secondary alcohol on a six-carbon ring [39]. Oxygenation of (−)-α-zingiberene and (−)-β-sesquiphellandrene at position 9 (Figure 8) is more similar to the reaction catalyzed by limonene hydroxylase than to that catalyzed by amorphadiene oxidase. Therefore, P1, P2 and P4 seemed to be the best candidates for the first step in the conversion of (−)-α-zingiberene and (−)-β-sesquiphellandrene into (+)-α-turmerone and (+)-β-turmerone.
Because all three enzymes were very similar, P1 and P4 were selected for cloning from turmeric rhizome cDNA, yielding five paralogs, P1A, P1A2, P4, P4A and P4A2, which were expressed in yeast. Co-expression of these P450s with ST00A in EPY224 cells that were not expressing a cytochrome P450 reductase gene did not produce hydroxylated forms of (−)-α-zingiberene or (−)-β-sesquiphellandrene. Co-expression of these P450s with ST00A and sweet basil P450 reductase (Ob_CPR) yielded some products that appear to be hydroxylated forms of (−)-α-zingiberene and (−)-β-sesquiphellandrene, according to comparison of the mass spectra of these peaks with those of (−)-α-zingiberene, (−)-β-bisabolene and (−)-β-sesquiphellandrene (Figure 9). However, these hydroxylated products are not registered in the NIST GC/MS library, and unambiguous identification will require significant work in the future. Nevertheless, our current data suggest that these are indeed the intermediates of interest for the biosynthesis of the turmerones.
In Figure 9H, peaks 2 and 4 have m/z 220, which is the mass of hydroxylated forms of (−)-α-zingiberene, (−)-β-bisabolene or (−)-β-sesquiphellandrene, and peaks 1 and 3 have m/z 222. Although m/z 222 is not what would be expected for hydroxylated forms of (−)-α-zingiberene, α-bisabolol (MW 220) also contains m/z 222 in its mass spectrum. Other ions such as m/z 93, 119, 68, etc., also support the proposal that these peaks are likely to be hydroxylated forms of (−)-α-zingiberene, (−)-β-bisabolene or (−)-β-sesquiphellandrene. Because these peaks have mass spectra different from those of α-bisabolol, epi-α-bisabolol or β-bisabolol, it is likely that they are derived instead from (−)-α-zingiberene or (−)-β-sesquiphellandrene. Thus, the biosynthesis of (+)-α-turmerone and (+)-β-turmerone likely begins with the action of α-zingiberene/β-sesquiphellandrene synthase, continues with the action of α-zingiberene/β-sesquiphellandrene hydroxylase (P1, P2, or P4), and is completed by the action of a dehydrogenase (still uncharacterized) that converts the secondary alcohols to the ketone forms.
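The mass arithmetic behind the m/z 220 assignment and the proposed alcohol-to-ketone step can be checked with a short sketch. It assumes nominal (integer) atomic masses and the molecular formula C15H24 for the parent sesquiterpenes; the function and variable names are illustrative only and are not part of the analysis described here.

```python
# Nominal (integer) atomic masses, as read at unit resolution in GC/MS.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}


def nominal_mass(formula):
    """Return the nominal molecular mass for a formula given as {element: count}."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


# Assumed formulas (illustrative): parent sesquiterpene, its mono-hydroxylated
# form (one O inserted into a C-H bond), and the ketone formed when a
# dehydrogenase removes two H from the secondary alcohol.
sesquiterpene = {"C": 15, "H": 24}
hydroxylated = {"C": 15, "H": 24, "O": 1}
ketone = {"C": 15, "H": 22, "O": 1}

for label, formula in [
    ("sesquiterpene", sesquiterpene),
    ("hydroxylated", hydroxylated),
    ("ketone", ketone),
]:
    print(label, nominal_mass(formula))
```

Under these assumptions the parent hydrocarbon has a nominal mass of 204, the hydroxylated intermediate 220, and the ketone 218, so hydroxylation adds 16 mass units and the subsequent dehydrogenation removes 2.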

Evolution of Terpene Synthases in Ginger and Turmeric Investigated through Protein Structural Modeling
We identified 25 mono- and 16 sesquiterpene synthases from ginger and turmeric and revealed the function of 13 mono- and 11 sesquiterpene synthases. The products of these TPSs correspond well to the terpenoids found in tissues, with a few exceptions. For example, high levels of citral (geranial + neral) were observed in ginger rhizomes and leaves, but no enzyme with geraniol synthase activity was observed, even though geraniol is the precursor of citral. It is likely that one of the insoluble enzymes is a geraniol synthase.
Many of the identified TPSs are very similar to each other and are considered to be paralogs, of which some have conserved functions whereas others have very divergent functions. For example, MT06 and MT06A were 98.0% identical and differed only in their transit peptide sequences, with one gap. Also, ST00A and ST00B, with 98.4% similarity, synthesized the same products. However, although MT06B is 92.9% identical to MT06 and 90.8% identical to MT06A, the products of these enzymes are quite different. MT06 and MT06A can produce sesquiterpenes whereas MT06B cannot. When MT06 and MT06B protein structures were modeled using (4S)-limonene synthase from Mentha spicata [40] as a template, their backbone structures were found to be very similar, but the side chains in the substrate binding pocket were different (Figure 10). F327 of MT06B appears to prevent FPP binding, whereas MT06 has a leucine at that position, which can allow FPP binding, leading to production of sesquiterpenes. Although MT06B has one extra amino acid in the loop (arrow in Figure 10B) when compared to MT06 and (4S)-limonene synthase, the tyrosines after the loop (Y576 in MT06, Y577 in MT06B) are aligned very well in the modeled structures, and the extra residue does not seem to affect protein function. These results suggest that replacing F327 with a leucine may expand substrate versatility and the range of products formed.
ST02B and ST02C are 95.7% identical. However, ST02C produces β-elemene (germacrene A) as a major product, whereas ST02B produces α-elemol as a major product and β-elemene (germacrene A) as a minor product. When their protein structures were modeled based on (+)-δ-cadinene synthase from Gossypium arboreum [41], we could not see side chain differences around the active sites in the ST02B and ST02C modeled structures (Figure 11). According to the modeled structures, the main differences lie in the N-terminal loop. The N-terminal end loop of ST02B is closer to the C-terminal end loop than was predicted for ST02C. The difference in contact between the N- and C-terminal end loops may affect protein breathing and give water molecules easier access to quench the reaction. ST02A4 is also very similar to ST02B and ST02C: 95.5% and 96.6% similarity, respectively. ST02A4 produces (−)-neointermedeol as a major product and β-elemene (germacrene A) as a minor product. When the ST02A4 structure was modeled against the (+)-δ-cadinene synthase structure and compared with the modeled ST02B and ST02C structures, there was no difference in side chains near the active site. Again, ST02A4 has a different loop structure at the N-terminal end. The expected structure for ST02A4 is more similar to ST02B than to ST02C because ST02C is expected to have an extended α-helix and a shorter loop (Figure 11). The loop around the tryptophan at the N-terminal end of the RRX8W motif is in contact with the C-terminal end of the helix (Figure 11D) in all three modeled structures. This interaction may stabilize the C-terminal end structure near the active site. Although ST02A4 and ST02B synthesize different major products, both compounds are quenched by a water molecule, which can be explained by their similar loop structures at the N-terminal ends.
ST02C and the template, (+)-δ-cadinene synthase, have similar loop structures at the N-terminal ends, with prolonged α-helices and shorter loops, and both produce terpenes that are not quenched by water molecules.
ST02A4, ST02B and ST02C appear to have diverged recently and their modeled structures are very similar. They also produce similar products: ST02A4 produces (−)-neointermedeol, ST02B produces α-elemol and ST02C produces β-elemene (germacrene A) as major products. These compounds are synthesized via the same mechanistic pathway through (+)-germacrene A (Figure 5).
(+)-α-Turmerone and (+)-β-turmerone are Derived from (−)-α-zingiberene and (−)-β-sesquiphellandrene
During efforts to identify ginger and turmeric compounds eluted in our GC/MS analyses, four major sesquiterpenoids stood out as the most abundant in turmeric rhizomes. Two of these compounds were clearly (−)-α-zingiberene and (−)-β-sesquiphellandrene, which are also the major compounds in ginger rhizomes. The other two compounds were oxygenated, and had best hits to "tumerone" and "curlone" when searched against the NIST database. A search for tumerone (CAS# 180315-67-7) in SciFinder Scholar gave only two references: one from 1934 and the other from 1996 [23], which used analysis of fragmented masses in GC/MS but not NMR for structural characterization. No absolute configuration data were reported for this compound; thus the actual structure of tumerone was clearly in question. Our data suggest that the compound detected in our samples was instead (+)-α-turmerone, with an absolute configuration of (6R, 7S). The absolute configuration of (−)-curlone (CAS# 87440-60-6), on the other hand, was proposed to be (6S, 7S) on the basis of dehydrogenation to form (+)-ar-turmerone (7S) and of NOE correlations, not shown in the original NMR-based characterization of that compound [42], that were said to indicate a possible 6S configuration. That inference relied on assumptions regarding molecular conformation that may not be valid. These results suggested that either (−)-curlone exists as a (6S, 7S) diastereomer of (+)-β-turmerone (6R, 7S), or it was incorrectly assigned the 6S configuration and (−)-curlone and (+)-β-turmerone are one and the same compound. The latter is very likely the case [43].
When we compared the terpenoid profiles of ginger and turmeric, we observed that both ginger and turmeric produce (−)-α-zingiberene and (−)-β-sesquiphellandrene, but only turmeric synthesizes the two oxygenated compounds. Ginger and turmeric produce more (−)-α-zingiberene than (−)-β-sesquiphellandrene, and one oxygenated compound (α-turmerone) is also more abundant than the other (β-turmerone; see Figure 1 and Figure 8). The ratios of these compounds are fairly consistent in all analyses. Indeed, when we view (−)-α-zingiberene (green) and (+)-α-turmerone (orange) as one group and (−)-β-sesquiphellandrene (blue) and (+)-β-turmerone (red) as a second group, the ratios of the first group to the second group are 2.7 and 2.9, respectively, in Figure 1D and 1E (in turmeric rhizomes), which is very similar to the ratio (3.2) of (−)-α-zingiberene to (−)-β-sesquiphellandrene in ginger rhizome (Figure 1A). This suggested that the two oxygenated compounds ((+)-α-turmerone and (+)-β-turmerone) could indeed be derived from (−)-α-zingiberene and (−)-β-sesquiphellandrene, respectively. All four compounds share the 6R, 7S configuration, and the mechanism by which the ketones are formed from the sesquiterpenes would not be expected to invert the configuration at C6. The oxygenated compounds present in the turmeric lines that we tested must therefore be (+)-α-turmerone and (+)-β-turmerone, even though they are listed as tumerone and curlone in the NIST database. These results again point to the care that must be taken in such metabolite profiling experiments, where one should not rely solely on database searches to identify compounds in complex plant samples. It is possible that some turmeric accessions or plants do contain (−)-curlone, the diastereomer of (+)-β-turmerone.
However, since it seems very unlikely that (−)-β-sesquiphellandrene is converted into (−)-curlone, either another route to (−)-curlone must exist in other turmeric lines, or the original paper describing (−)-curlone was incorrect in its assignment of the configuration at C6. If (−)-curlone is present in other plants, its biosynthesis would involve the alternate intermediate, cyclohexene, 3-[(1S)-1,5-dimethyl-4-hexen-1-yl]-6-methylene-, (3S)- (CAS# 251318-35-1) (Figure 8), which has only been reported as a synthetic molecule [44]. (−)-β-Sesquiphellandrene and (−)-α-zingiberene are both produced by the STPS through an (S)-bisabolyl cation followed by a C7/C1 1,3-hydride shift (Figure 8). The precursor for (−)-curlone would have to be produced from the (R)-bisabolyl cation via a C7/C1 1,3-hydride shift. Thus, (−)-curlone synthesis (should it occur in nature) likely does not involve ST00A/ST00B, which appears to utilize only the (S)-bisabolyl cation as an enzyme-bound intermediate.
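The grouped-ratio comparison used in this argument can be sketched as follows. This is a minimal illustration, not the procedure used to generate Figure 1: the function name, the dictionary keys and, in particular, the peak areas are hypothetical placeholders chosen only to show the calculation.

```python
def grouped_ratio(peak_areas):
    """Ratio of the alpha group to the beta group of sesquiterpenoids.

    The alpha group is alpha-zingiberene plus alpha-turmerone; the beta group
    is beta-sesquiphellandrene plus beta-turmerone. Missing turmerone peaks
    (as in ginger) are treated as zero.
    """
    alpha_group = peak_areas["alpha_zingiberene"] + peak_areas.get("alpha_turmerone", 0.0)
    beta_group = peak_areas["beta_sesquiphellandrene"] + peak_areas.get("beta_turmerone", 0.0)
    return alpha_group / beta_group


# HYPOTHETICAL peak areas in arbitrary units (not measured values).
turmeric_rhizome = {
    "alpha_zingiberene": 10.0,
    "alpha_turmerone": 17.0,
    "beta_sesquiphellandrene": 4.0,
    "beta_turmerone": 6.0,
}
ginger_rhizome = {
    "alpha_zingiberene": 32.0,
    "beta_sesquiphellandrene": 10.0,
}

print(round(grouped_ratio(turmeric_rhizome), 1))
print(round(grouped_ratio(ginger_rhizome), 1))
```

If each turmerone is formed only from its corresponding hydrocarbon, pooling precursor and product in this way should give a similar ratio in turmeric and in ginger, which is the pattern described above.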

Table S3
Mono- and sesquiterpene synthases identified in the ginger and turmeric EST database created from cDNA libraries from different tissues: Rh, rhizome; R, root; L, leaf. The numbers in the turmeric and ginger columns represent the number of ESTs per cDNA library in the database for the corresponding unitrans. Four unitrans, MT10, MT13, MT14 and MT18, were considered to belong to other unitrans after close investigation. In the RACE column, 5 or 3 means that 5′ or 3′ RACE was required to obtain full-length clones, and bold underlined text means that RACE was completed. Each unitrans was cloned for further characterization from the grey boxed sample. MT00 and MT11 were subcloned from the original cDNA clones directly without requiring RT-PCR. (DOC)

Table S4
Expression levels of ginger and turmeric terpene synthase unitrans based on microarray data. Probes for microarrays were designed from partial sequences of unitrans before RACE revealed the full sequences. Abbreviations: GY, Yellow Ginger; F, turmeric variety Fat Mild Orange (FMO); T, turmeric variety Thin Yellow Aromatic (TYA); Rh, rhizome; R, root; L, leaf. TYA barely produces sesquiterpenes and is used as a control for microarray experiments. The chemical profile of the FMO variety does not differ from that of Hawaiian Red Turmeric (HRT), the turmeric variety used to clone genes. Both were clonally derived from the same original line. (DOC)
Table S5
Vectors used to express various ginger or turmeric TPS proteins in either E. coli or yeast cells. Solubility of expression in E. coli was checked in Coomassie-stained gels. Solubility of expression in yeast was checked by Western blotting. Solubility is the ratio of total and soluble fractions. n/a means that the solubility was not evaluated. The vector and cell combinations marked with "*" were used for further analysis to identify the functions of specific proteins as outlined in the text. (DOC)
Results S1
Additional details regarding cloning, expression and characterization of specific TPS genes can be found here. (DOCX)