Identification and Functional Analysis of the Nocardithiocin Gene Cluster in Nocardia pseudobrasiliensis

Nocardithiocin is a thiopeptide compound isolated from the opportunistic pathogen Nocardia pseudobrasiliensis. It shows strong activity against acid-fast bacteria and is also active against rifampicin-resistant Mycobacterium tuberculosis. Here, we report the identification of the nocardithiocin gene cluster in N. pseudobrasiliensis IFM 0761 based on a conserved thiopeptide biosynthesis gene sequence and the whole-genome sequence. The predicted gene cluster was confirmed by gene disruption and complementation. As expected, strains containing the disrupted gene did not produce nocardithiocin, while gene complementation restored nocardithiocin production in these strains. The predicted cluster was further analyzed using RNA-seq, which showed that the nocardithiocin gene cluster contains 12 genes within a 15.2-kb region. This finding will help to improve nocardithiocin productivity and the production of its derivatives.


Introduction
Actinomycetes are Gram-positive soil bacteria that produce various secondary metabolites. For example, metabolite production is well documented in Streptomyces species, and many studies on secondary metabolite production have been conducted using Streptomyces strains [1,2]. Nocardia species, which are mostly opportunistic human and animal pathogens, are also actinomycetes. Although secondary metabolite production in Nocardia species is less well studied, a recent genome-based analysis revealed that the number of secondary metabolite gene clusters in Nocardia species is comparable with that of Streptomyces species [3]. This suggests that Nocardia species may also be a source of secondary metabolites.
Recently, Mukai et al. identified a thiopeptide compound, nocardithiocin, from Nocardia pseudobrasiliensis [4] (Fig 1). Thiopeptides (thiazolyl peptides) are highly modified, sulfur-rich peptides synthesized by the ribosome. These compounds all contain a central nitrogen-containing six-membered ring, which serves as the scaffold for at least one macrocycle and a tail. They possess a wide range of bioactivities, including antimicrobial, anticancer, and antiplasmodial effects. In addition to these clinically promising activities, this family of compounds has recently attracted attention because many thiopeptides show potent activity against drug-resistant pathogens, including methicillin-resistant Staphylococcus aureus, penicillin-resistant Streptococcus pneumoniae, and vancomycin-resistant enterococci [5][6][7][8][9]. Nocardithiocin also has relatively high antibiotic activity against acid-fast bacteria, and effectively suppresses the growth of rifampicin-resistant Mycobacterium tuberculosis [4].
Peptide-based natural products are synthesized by ribosomal pathways or nonribosomal peptide synthetase pathways. Thiopeptide biosynthesis begins with a ribosomally synthesized precursor peptide, which then undergoes several modifications, such as cyclization and dehydration, to form the macrocycle structure. Further side-chain modifications produce the final complex products [10,11]. Several thiopeptide biosynthetic gene clusters have already been identified using the sequences of putative precursor peptides, cyclodehydratases, and modification enzymes as markers [12][13][14]. In addition to the gene coding for the precursor peptide, all thiopeptide gene clusters reported to date contain several core sets of genes that participate in the heterocyclization and dehydration of core peptides [11].
Here, we identified the gene cluster responsible for nocardithiocin biosynthesis in the whole-genome sequence of N. pseudobrasiliensis. The relationship between nocardithiocin production and the predicted cluster was confirmed using gene disruption and complementation analyses.

Nocardithiocin production and detection
The strains were pre-cultured in 5 mL of BHIgg medium at 37°C for 72 h. The pre-cultures were then used to inoculate (1% of medium, v/v) 10 mL of CD medium and cultivated at 30°C for 7 days. Following cultivation, the culture supernatants were extracted with equal volumes of ethyl acetate, and the ethyl acetate layer was evaporated in vacuo. Amounts of nocardithiocin in the samples were analyzed by HPLC using a Wakopack Wakosil-II 3C18 HG column (Wako Pure Chemical Industries, Osaka, Japan). The column was developed using a gradient of 10-100% acetonitrile in water over 50 min at a flow rate of 1 mL/min and monitored by a fluorescence detector (excitation at 350 nm and emission at 450 nm) (RF-20A/RF-20Axs, Shimadzu Corp., Kyoto, Japan).
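The two quantities implied by this protocol, the inoculum volume and the mobile-phase composition at a given time, can be worked out with a short sketch. It assumes the 10-100% acetonitrile gradient is a linear ramp, which the text does not state, and the function names are illustrative only.

```python
def inoculum_volume_ml(culture_ml, percent_v_v=1.0):
    """Volume of pre-culture needed for a given % (v/v) inoculation."""
    return culture_ml * percent_v_v / 100.0


def acetonitrile_percent(t_min, start=10.0, end=100.0, duration_min=50.0):
    """Percent acetonitrile at time t_min.

    ASSUMPTION: the gradient is a linear ramp from `start` to `end`
    over `duration_min`; the source only gives the end points and time.
    """
    if t_min <= 0:
        return start
    if t_min >= duration_min:
        return end
    return start + (end - start) * t_min / duration_min


if __name__ == "__main__":
    # 1% (v/v) inoculum for a 10 mL CD-medium culture
    print(inoculum_volume_ml(10.0))
    # Mobile-phase composition halfway through the 50 min run
    print(acetonitrile_percent(25.0))
```

Under the linear-ramp assumption, a peak's retention time maps directly to the acetonitrile percentage at which the compound elutes, which is useful when comparing chromatograms between strains.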

Prediction of the nocardithiocin biosynthetic gene cluster from genome sequence
An essential gene for thiopeptide biosynthesis, encoding cyclodehydratase, was amplified by PCR using genomic DNA as the template and a degenerate primer set (Thio-F, 5′-TACGAGACCTCCAAYGGNTGYGCN-3′, and Thio-R, 5′-GTGGCCRAASGTCATNGG-3′; final concentration of 1 pmol/μL) [10]. PCR amplification was conducted as follows: 94°C for 2 min, followed by 35 cycles of 94°C for 30 s, 52°C for 30 s, and 68°C for 30 s, using Quick Taq HS DyeMix (TOYOBO, Osaka, Japan). The amplified fragment was sequenced, its homologous region was located in the whole-genome sequence of N. pseudobrasiliensis IFM 0761, and the flanking regions were examined for other genes that might participate in nocardithiocin biosynthesis.
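The degenerate primers can be illustrated by expanding their IUPAC ambiguity codes. The sketch below counts how many distinct oligonucleotides each primer represents. It assumes the space inside the Thio-F sequence in the extracted text is a line-break artifact, so the primer is read as one 24-mer; the helper names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences from the text (Thio-F joined across the line break)
THIO_F = "TACGAGACCTCCAAYGGNTGYGCN"
THIO_R = "GTGGCCRAASGTCATNGG"


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


if __name__ == "__main__":
    for name, seq in (("Thio-F", THIO_F), ("Thio-R", THIO_R)):
        print(name, len(seq), "nt, degeneracy", degeneracy(seq))
```

Each concrete primer species is present at only a fraction of the total primer concentration, which is one reason degenerate PCR is often run at a moderate annealing temperature such as the 52°C used here.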

Construction of vectors for disruption and complementation of notL
The disruption vector for notL (the putative cyclodehydratase gene) was constructed as follows: upstream and downstream fragments of notL were amplified by PCR using the notL-upF/-upR and notL-downF/-downR primer sets, respectively, with N. pseudobrasiliensis IFM 0761 DNA as the template. The kanamycin resistance gene aph was amplified using the Laph-F/-R primer set with pNV18.1 as the template. KOD-Plus-Neo (TOYOBO) was used as the enzyme for all PCRs. The three fragments were then ligated into the HincII site of pUC19 using GeneArt Seamless Cloning and Assembly Enzyme Mix (Life Technologies, Tokyo, Japan) according to the manufacturer's instructions.
For the notL complementation vector, notL was expressed under the control of the ermE promoter (PermE) and the rrnB terminator (TrrnB). The PermE fragment was amplified using the primer set PermE-F/-R with pKU470 as the template. The notL fragment was amplified using the primer set notL-compF/-R with N. pseudobrasiliensis IFM 0761 DNA as the template. The TrrnB fragment was amplified from pRU1701 (Addgene, Cambridge, MA, USA) using the TrrnB-F/-R primer set. All three PCR fragments were ligated and cloned into the HincII site of pNV38.1 (containing a chloramphenicol resistance gene) using GeneArt Seamless Cloning and Assembly Enzyme Mix. The disruption and complementation vectors were confirmed by sequencing. All primer sequences are listed in Table 2.

Transformation of N. pseudobrasiliensis
N. pseudobrasiliensis was cultivated in 10 mL of BHIgg medium at 37°C overnight. Cells were collected by centrifugation, rinsed with ice-cold water followed by 10% glycerol, and suspended in ice-cold 10% glycerol. A 50-μL aliquot of the cell suspension was then transferred to a chilled electroporation cuvette (2 mm gap width) and mixed with 1 μL of vector suspension. After pulsing at 12.5 kV/cm in a MicroPulser electroporator (Bio-Rad Laboratories, Hercules, CA, USA), the cuvette was placed on ice for 2 min, and then 900 μL of BHIgg was added to the cuvette. After incubation at 37°C for 3 h, cells were plated onto BHIgg plates containing 100 μg/mL of neomycin or chloramphenicol, and further incubated at 37°C for 2−3 days.
The resulting single cross-over strains were confirmed by colony PCR, and candidate colonies were cultivated in BHIgg medium without neomycin at 37°C for 2−3 days to obtain second cross-over strains. The cultivated cells were diluted and plated onto BHIgg plates without neomycin. The resulting colonies were picked and replica-plated onto BHIgg plates with and without neomycin. The neomycin-sensitive putative double cross-over colonies were confirmed by PCR and Southern blot analysis.
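The electroporation field strength quoted above can be converted into the voltage applied across the cuvette, since field strength is voltage divided by gap width. This is a minimal arithmetic sketch with an illustrative function name, not a statement of the instrument's actual settings.

```python
def pulse_voltage_kv(field_kv_per_cm, gap_mm):
    """Voltage (kV) across the cuvette for a given field strength.

    Field strength (kV/cm) = voltage (kV) / gap width (cm),
    so voltage = field strength * gap width.
    """
    gap_cm = gap_mm / 10.0
    return field_kv_per_cm * gap_cm


if __name__ == "__main__":
    # 12.5 kV/cm in a 2 mm gap cuvette, as described in the text
    print(pulse_voltage_kv(12.5, 2.0))
```

For the 2 mm cuvette used here, 12.5 kV/cm corresponds to a pulse of about 2.5 kV.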

Transcriptional analysis
RNA was extracted from cells cultured under both nocardithiocin-producing and non-producing conditions. RPMI 1640 medium (TaKaRa, Shiga, Japan) supplemented with 10% fetal bovine serum (FBS; Gibco Inc., Tokyo, Japan) was used for the nocardithiocin-producing conditions, and RPMI 1640 medium without FBS was used for the non-producing conditions. Cells pre-cultivated on BHIgg medium at 37°C for 72 h were inoculated into the nocardithiocin-producing and non-producing media and then incubated at 37°C under a 5% CO2 atmosphere for 7 days. Following incubation, the cells were collected and suspended in ISOGEN RNA extraction solution (Nippon Gene Co., Tokyo, Japan). Resuspended cells were homogenized using a MagNA Lyser (Roche Diagnostics, Tokyo, Japan) at a speed of 7,000 for 20 s, and RNA was extracted according to the ISOGEN protocol. Following DNase treatment, 2 μg of total RNA was treated with a Ribo-Zero Magnetic Kit (Epicentre, Madison, WI, USA) to remove ribosomal RNA. To construct the RNA-seq library, the resulting mRNA samples were processed with a SureSelect Strand Specific RNA library preparation kit (Agilent Technologies) according to the manufacturer's protocol. The constructed libraries were sequenced on a HiSeq system (Illumina) using a TruSeq Rapid SR Cluster Kit-HS (Illumina). The resulting data were mapped against the sequenced genome of N. pseudobrasiliensis IFM 0761 using the CLC Genomics Workbench.

Selection of a nocardithiocin high-producing strain
Prior to attempting to identify the nocardithiocin biosynthetic gene cluster, we confirmed the nocardithiocin production of 14 N. pseudobrasiliensis strains stocked at the Medical Mycology Research Center, Chiba University, Japan (Table 1). Under our experimental conditions, two strains did not produce nocardithiocin, while N. pseudobrasiliensis IFM 0761 produced the greatest amount of nocardithiocin, as estimated from HPLC fluorescence peaks (Fig 2). The nocardithiocin peak was confirmed by its UV and MS spectra (S1 Fig). Because higher production levels make detection of the compound easier, IFM 0761 was used in further experiments.

Identification of the nocardithiocin gene cluster
To locate the nocardithiocin biosynthetic gene cluster in the N. pseudobrasiliensis genome, we first tried to identify the cyclodehydratase genes, which are relatively conserved among bacterial strains and required for thiopeptide biosynthesis [10]. Using degenerate primer sets, we amplified a putative cyclodehydratase gene fragment and then sequenced it. A search of the NCBI databases using the BLASTx algorithm confirmed that the 674-bp amplified fragment showed high homology to the cyclodehydratase gene of Nonomuraea species Bp3714-39 (TpdG: 55% identity) [15]. Based on this result, we located the sequence of the amplified region in the N. pseudobrasiliensis IFM 0761 genome, and tentatively predicted that a cluster of 12 genes (notA−notL) within a 15.2-kb region was responsible for nocardithiocin biosynthesis (Fig 3A). This cluster contained one precursor peptide gene, one transcriptional regulator gene, and genes for the modification enzymes needed for heterocyclization and side-chain modification. Although gene organization within the cluster differed from that of other reported thiopeptide gene clusters, the genetic composition was similar (S2A Fig). Furthermore, the amino acid sequence of the precursor peptide of the predicted gene cluster was consistent with the nocardithiocin structure. Based on the predicted functions of the 12 genes, a biosynthetic scheme was proposed (Fig 3B).

Gene disruption and nocardithiocin production
To confirm the relationship between nocardithiocin production and the predicted gene cluster, the putative cyclodehydratase gene notL was disrupted by replacement with an antibiotic resistance gene. The genotypes of the resulting disruptants were confirmed by Southern blotting and sequencing (data not shown). Nocardithiocin production in the disruptants was analyzed by HPLC (Fig 4), which revealed complete loss of nocardithiocin production. Additionally, nocardithiocin production was restored in the complementation strains (Fig 4, lower panel), suggesting that the predicted gene cluster was responsible for nocardithiocin production.

Transcriptional analysis
Our preliminary study of cultivation conditions revealed that addition of FBS to RPMI 1640 medium improved nocardithiocin production in N. pseudobrasiliensis (S3 Fig). To further confirm the involvement of the predicted nocardithiocin biosynthetic gene cluster in nocardithiocin production, the expression level of each gene was compared under nocardithiocin-producing (with FBS) and non-producing (without FBS) conditions. Expression levels were compared using reads per kilobase of exon per million mapped reads (RPKM) values obtained by RNA-seq analysis. RPKM quantifies gene expression from RNA-seq data by normalizing read counts for gene length and the total number of mapped reads, so it can be used as the gene expression value. The fold changes in the RPKM values were calculated between nocardithiocin-producing and non-producing conditions (production/no-production). There were large RPKM fold changes for the genes within the predicted gene cluster (notA−notL), while the fold changes of the flanking genes were very small (Fig 5). The pattern of gene expression clearly changed in the predicted cluster. These results support the conclusion that the 12 predicted genes form a cluster that participates in nocardithiocin biosynthesis.
Fig 2. Confirmation of nocardithiocin production. HPLC analysis of nocardithiocin production in representative Nocardia pseudobrasiliensis IFM strains. In total, 14 strains were tested (Table 1). The peak pattern did not differ considerably between the strains; therefore, representative data are shown. The red line indicates the elution time of nocardithiocin. doi:10.1371/journal.pone.0143264.g002
Fig 3 (caption, partial). The dehydrogenation reaction forms a thiazole ring (green), dehydration produces dehydroamino acids (red), and the cyclization of the precursor peptide at serines S1 and S10 forms a pyridine ring. Further modifications by P450 monooxygenase (yellow) and methyltransferase (pink) are also indicated.
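The RPKM normalization and the production/no-production fold change used in this section can be sketched as follows. This is a minimal illustration of the standard RPKM formula, not the CLC Genomics Workbench implementation; the function names and all numbers in the example are hypothetical and are not data from this study.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads.

    RPKM = reads / (gene length in kb * total mapped reads in millions)
         = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def fold_change(rpkm_producing, rpkm_non_producing):
    """Ratio of RPKM values (production / no-production)."""
    if rpkm_non_producing == 0:
        raise ValueError("RPKM under non-producing conditions is zero")
    return rpkm_producing / rpkm_non_producing


if __name__ == "__main__":
    # HYPOTHETICAL counts for one 1,000-bp gene in two libraries
    with_fbs = rpkm(read_count=500, gene_length_bp=1000,
                    total_mapped_reads=10_000_000)
    without_fbs = rpkm(read_count=50, gene_length_bp=1000,
                       total_mapped_reads=10_000_000)
    print(with_fbs, without_fbs, fold_change(with_fbs, without_fbs))
```

Because RPKM divides by both gene length and library size, the fold change for a given gene reflects a change in relative expression rather than a difference in sequencing depth between the two conditions. Genes with zero reads under non-producing conditions need separate handling, which this sketch only flags with an error.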

Discussion
Nocardia species are Gram-positive soil bacteria and known opportunistic pathogens. Various species have been isolated from clinical samples [16][17][18]. Although Nocardia species are actinomycetes, there have been very few studies of their secondary metabolites, or of the genes that synthesize them, in comparison with those conducted in Streptomyces species, mainly because of the lack of genetic tools. However, recent developments in Nocardia-E. coli shuttle vectors [19] and the availability of Nocardia genome sequences [3] prompted us to investigate the secondary metabolites of Nocardia species. Nocardithiocin was purified from N. pseudobrasiliensis as a novel thiopeptide antibiotic compound [4]. Thiopeptides are ribosomally synthesized, highly post-translationally modified peptide compounds. Since the first thiopeptide compound was discovered, almost 100 additional related compounds have been identified. Their wide range of bioactivities and efficient antimicrobial activities against drug-resistant pathogens, such as methicillin-resistant S. aureus, penicillin-resistant S. pneumoniae, and vancomycin-resistant enterococci, make thiopeptides attractive lead compounds for developing novel therapeutic drugs [5][6][7][8][9]. However, poor pharmacokinetics and low water solubility have prevented thiopeptides from being used in humans [20]. To overcome these drawbacks, chemical syntheses have been conducted to produce useful derivatives [21], and with the discovery of thiopeptide biosynthetic gene clusters, gene replacement has become a promising approach [6,22].

Identification of Nocardithiocin Gene Cluster
In this study, we identified the nocardithiocin biosynthetic gene cluster in N. pseudobrasiliensis using the conserved sequence of an essential gene for thiopeptide biosynthesis (cyclodehydratase), along with whole-genome sequence information. In the gene cluster, 12 genes, covering a 15.2-kb genomic region, were predicted to be involved in nocardithiocin synthesis. Other reported thiopeptide gene clusters are in the range of 15.6−30.4 kb [10][12][13][14][15], indicating that our predicted cluster is of a similar size. The genetic components of the predicted cluster were also similar to those of previously reported thiopeptide gene clusters (S3A Fig). Previously reported precursor peptides are composed of a leader peptide of 34−55 amino acids at the N-terminus and a structural peptide of 12−17 amino acids at the C-terminus [5]; representative precursor peptide structures are shown in S3B Fig. The precursor peptide of the predicted nocardithiocin gene cluster consists of a 46-amino-acid leader peptide and a 13-amino-acid structural peptide, which agrees with these previous reports.
Gene disruption and complementation experiments demonstrated that notL is required for nocardithiocin production. The disruption of a putative precursor peptide gene (notG) also abolished nocardithiocin production (data not shown). Together, these results strongly suggest that the identified gene cluster is responsible for nocardithiocin production.
The cyclodehydratase gene fragment was successfully amplified from all 14 N. pseudobrasiliensis strains examined in this study except IFM 0756, using the degenerate primer set Thio-F and Thio-R. This is consistent with the lack of nocardithiocin production observed in strain IFM 0756. Although strain IFM 10700 did not produce nocardithiocin, the cyclodehydratase gene was successfully amplified from this strain. DNA sequencing analysis has not yet been conducted for these two non-producing strains, so we cannot explain the lack of nocardithiocin production. However, a deletion of the cyclodehydratase gene or of a region within the nocardithiocin gene cluster in IFM 0756, and a mutation in a critical gene other than the cyclodehydratase gene in IFM 10700, could account for the lack of production in these strains.
Recent advances in experimental techniques for transcriptomic analysis, including microarrays and RNA-seq, allowed us to quantify gene expression levels. Generally, genes responsible for secondary metabolite production are located adjacent to one another and are coordinately regulated for appropriate metabolite production. The transcriptional analysis in this study clearly confirmed the initial and terminal genes within the cluster. Identifying coordinated gene expression patterns by comparing gene expression levels under different cultivation conditions, such as conditions for producing different secondary metabolites, could be a useful way to predict novel gene clusters for the production of secondary metabolites in this microorganism. Although nocardithiocin has a high level of antibiotic activity against rifampicin-resistant M. tuberculosis, the slight light sensitivity of this compound may prevent its clinical use [4]. The identification of the biosynthetic gene cluster expands the possibilities for further structural modification of nocardithiocin. Although further studies are needed, the genetic methodology used in this study, along with these findings, might facilitate the use of Nocardia species as sources of secondary metabolites.