Metagenomic Screening for Aromatic Compound-Responsive Transcriptional Regulators

We applied a metagenomic approach to screen for transcriptional regulators that sense aromatic compounds. The library was constructed by cloning environmental DNA fragments into a promoter-less vector containing the green fluorescent protein gene. Fluorescence-based screening was then performed in the presence of various aromatic compounds. A total of 12 clones were isolated that fluoresced in response to salicylate, 3-methylcatechol, 4-chlorocatechol and chlorohydroquinone. Sequence analysis revealed at least one putative transcriptional regulator in every clone except one (CHLO8F). Deletion analysis identified compound-specific transcriptional regulators; namely, 8 LysR-types, 2 two-component-types and 1 AraC-type. Of these, 9 representative clones were selected and their reaction specificities to 18 aromatic compounds were investigated. Overall, our transcriptional regulators were functionally diverse in terms of both specificity and induction rates. LysR- and AraC-type regulators had relatively narrow specificities with high induction rates (5- to 50-fold), whereas two-component-types had wide specificities with low induction rates (3-fold). Numerous transcriptional regulators have been deposited in sequence databases, but their functions remain largely unknown. Thus, our results add valuable information regarding the sequence-function relationship of transcriptional regulators.


Introduction
Bacteria that degrade aromatic compounds are widely distributed in the environment and are important for breaking down both natural and xenobiotic compounds. Attempts have been made to screen for microorganisms that degrade aromatic compounds [1][2][3], as well as for the genes responsible for degradation [1][3][4][5][6][7]. These studies revealed that the majority of reported bacterial aromatic degradation processes are aerobic [8] and comprise a series of enzymes that are usually categorized as either 'upper'- or 'lower'-pathway enzymes [9]. In the upper pathway, aromatic compounds are transformed into aromatic vicinal diols by a monooxygenase or dioxygenase [10]. The aromatic vicinal diols are then converted to dihydroxy compounds by dihydrodiol dehydrogenase, also in the upper pathway. In the lower pathway, the resulting dihydroxylated aromatic compounds are transformed into ring-cleavage products by either extradiol dioxygenases or intradiol dioxygenases. The subsequent metabolic steps are referred to as the meta- or ortho-pathway, respectively.
The ring-cleavage products are further degraded into compounds that can enter the tricarboxylic acid cycle. These studies revealed considerable diversity in degradation pathways and enzymes, depending on the compound and the microbial origin. The advent of metagenomic approaches has revealed an even higher degree of diversity [11][12][13][14][15]. We have previously targeted extradiol dioxygenases, which convert colorless catecholic compounds to yellowish ring-opened products, to screen degradation pathways and identify novel enzymes belonging to new subfamilies [11], as well as novel arrangements in the degradation pathway genes [16,17].
In addition to these enzyme-encoding genes involved in aromatic compound degradation, we have previously screened metagenomic libraries for regulatory elements that sense aromatic compounds [18]. Using the fluorescence-based reporter assay system designated SIGEX (Substrate-Induced Gene Expression), we identified transcriptional regulators that sense benzoate (8 clones) and naphthalene (2 clones) [18]. In this study, we applied SIGEX to screen for transcriptional regulators in the same library using salicylate, 3-methylcatechol, 4-chlorocatechol and chlorohydroquinone as inducers. These compounds can be used to screen various degradation pathways for aromatic compounds and should provide a comprehensive view of the transcriptional regulators responsible for aromatic compound degradation.

Reagents
Restriction enzymes and DNA ligase were purchased from Takara Bio (Shiga, Japan). Aromatic compounds were purchased from Tokyo Chemical Industry (Tokyo, Japan). Media and agar were purchased from BD Diagnostics (Sparks, MD).

Library construction and screening
The metagenomic library was constructed using environmental DNA extracted from groundwater contaminated with crude oil, as described previously [18]. The library consisted of approximately 152,000 clones with an average insert size of ~7 kb. The library cells were grown in the presence of 0.5 mM isopropyl-β-D-thiogalactopyranoside (IPTG), sorted on a FACSVantage SE (Becton Dickinson, Franklin Lakes, NJ) fluorescence-activated cell sorter, and nonfluorescent cells were recovered. These cells were then grown in the presence of 2 mM induction compound, after which fluorescent cells were recovered and singly isolated by growing on LB/Amp agar plates at 37°C overnight. A total of 96 colonies were picked from the plates, resuspended in separate wells of 96-deep-well plates containing 1 mL of LB/Amp, and grown at 37°C overnight. An aliquot of the cells was transferred to 1 mL of fresh dLB/Amp and grown with vigorous shaking at 1,200 rpm at 37°C for 6 h. Cells were divided into 0.5-mL aliquots in 96-deep-well plates containing 0.5 mL of fresh dLB/Amp with or without 2 mM of test compound. Cells were then grown at 37°C for 24 h by shaking the 96-deep-well plates at 1,200 rpm, pelleted by centrifugation (1,500 × g, 5 min), washed with distilled water, and resuspended in 200 µL of distilled water. The cell suspension (100 µL) from each well was transferred to the wells of a black, clear-bottomed 96-well plate. GFP fluorescence was measured on a SPECTRAmax Gemini XS (Molecular Devices, Sunnyvale, CA) spectrofluorimeter at excitation and emission wavelengths of 488 and 520 nm, respectively. Cell density (OD600) was measured on a VERSAMax (Molecular Devices) UV-Vis microplate reader. Fluorescence was normalized to the cell density (OD600 of 1.0). Positive clones showing an induction rate (ratio of fluorescence intensity +inducer/-inducer) higher than 3.0 were collected.
These clones were then subjected to restriction fragment length polymorphism (RFLP) analysis using EcoRI or HindIII to check for possible duplication.
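The normalization and the induction-rate cutoff described above can be illustrated with a minimal Python sketch. The class and function names (`WellReading`, `normalized_fluorescence`, `induction_rate`, `is_positive`) are hypothetical, the example readings are invented for illustration and are not data from this study, and the sketch assumes that the induction rate is the ratio of the OD600-normalized fluorescence values.

```python
from dataclasses import dataclass

INDUCTION_THRESHOLD = 3.0  # cutoff for positive clones stated in the text


@dataclass
class WellReading:
    fluorescence: float  # raw GFP signal (arbitrary units)
    od600: float         # cell density measured for the same well


def normalized_fluorescence(reading: WellReading) -> float:
    """Fluorescence per unit cell density (i.e., scaled to OD600 = 1.0)."""
    if reading.od600 <= 0:
        raise ValueError("OD600 must be positive")
    return reading.fluorescence / reading.od600


def induction_rate(with_inducer: WellReading, without_inducer: WellReading) -> float:
    """Ratio of normalized fluorescence: +inducer / -inducer."""
    return normalized_fluorescence(with_inducer) / normalized_fluorescence(without_inducer)


def is_positive(with_inducer: WellReading, without_inducer: WellReading,
                threshold: float = INDUCTION_THRESHOLD) -> bool:
    """A clone counts as positive when its induction rate exceeds the threshold."""
    return induction_rate(with_inducer, without_inducer) > threshold


# Hypothetical readings for a single clone (illustration only).
plus = WellReading(fluorescence=1400.0, od600=2.0)
minus = WellReading(fluorescence=50.0, od600=0.5)
print(induction_rate(plus, minus), is_positive(plus, minus))
```

Normalizing each well before taking the ratio keeps a difference in growth between the induced and uninduced cultures from being mistaken for induction.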

DNA sequencing and sequence analysis
DNA sequencing was performed using the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA) and an ABI3730XL sequencer (Applied Biosystems). Open reading frame analysis was performed using the web-based ORF Finder tool at http://www.ncbi.nlm.nih.gov/. Homology searches were performed using BLASTX at http://blast.ncbi.nlm.nih.gov/ with the default parameter settings. Amino acid sequence alignment was performed using the web-based ClustalW version 1.83 software tool at http://clustalw.ddbj.nig.ac.jp/top-j.html, and the results were refined by visual inspection. Neighbor-joining trees were constructed using FigTree, version 1.3.1 (http://tree.bio.ed.ac.uk/software/Figtree/). Other sequence analyses were performed using the GENETYX-MAC version 15.0.0 software package (GENETYX Co., Tokyo, Japan).

Deletion analysis
Deletion analysis was performed using restriction digestion and self-ligation. Restriction enzymes used for the deletion are shown in Figure 1. The resulting ligation mixture was used to transform E. coli JM109 by electroporation.
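The effect of digestion followed by self-ligation on a clone sequence can be modeled in silico. The Python sketch below is only an illustration under stated assumptions: the plasmid is treated as a linear string with the vector backbone at both ends, the enzyme is assumed to cut at two or more sites, and the fragment between the outermost sites is assumed to be lost while a single site is reconstituted at the junction. The function names and the toy sequence are hypothetical; the recognition sequences listed are the standard ones for these enzymes.

```python
# Standard recognition sequences for some of the enzymes named in the text.
RECOGNITION_SITES = {
    "EcoRV": "GATATC",
    "SalI": "GTCGAC",
    "SphI": "GCATGC",
}


def find_sites(seq: str, site: str) -> list:
    """Return the start positions of every occurrence of a recognition site."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def delete_between_sites(seq: str, enzyme: str) -> str:
    """Model digestion plus self-ligation as loss of the region between sites.

    The sequence from the first site up to (but not including) the last site
    is dropped, so exactly one copy of the site remains at the junction.
    If fewer than two sites are present, the sequence is returned unchanged.
    """
    seq = seq.upper()
    positions = find_sites(seq, RECOGNITION_SITES[enzyme])
    if len(positions) < 2:
        return seq
    first, last = positions[0], positions[-1]
    return seq[:first] + seq[last:]


# Toy sequence (not from this study): two SalI sites flanking a region to remove.
toy = "AAAA" + "GTCGAC" + "CCCC" + "GTCGAC" + "TTTT"
print(delete_between_sites(toy, "SalI"))
```

A real construct is circular and may carry more than two sites, so a sketch like this only helps in predicting which ORFs a given deletion derivative is expected to retain.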

Nucleotide sequence accession numbers
Nucleotide sequences reported in this paper have been submitted to GenBank/EMBL/DDBJ under accession numbers AB828163 to AB828174.

Screening for transcriptional regulators from the metagenomic library
A metagenomic library was constructed using groundwater contaminated with crude oil. In our previous study, we screened the library for transcriptional activators that specifically sensed benzoate and naphthalene [18]. In this study, we screened the same library using different aromatic inducing compounds; namely, salicylate, 3-methylcatechol and 4-chlorocatechol. Salicylate is a key metabolic intermediate in polycyclic aromatic hydrocarbon catabolic pathways [19,20]. 3-Methylcatechol is also a key metabolic intermediate in the o-xylene, m-xylene and toluene catabolic pathways [2,21], while 4-chlorocatechol is a key metabolic intermediate of several chlorophenoxyacetic acid herbicides [22]. These compounds are therefore intermediates in the degradation pathways of a variety of aromatic compounds.
SIGEX was applied to screen the library and yielded 120 positive clones: 54 clones for salicylate, 30 for 3-methylcatechol and 36 for 4-chlorocatechol. The sorting ratio (ratio of the number of fluorescent cells to total cells subjected to cell sorting) was 1.4 × 10⁻⁴, which was similar to the value (2.3 × 10⁻⁴) obtained in our previous experiment using benzoate and naphthalene as inducing compounds [18].
Since the library was amplified in liquid medium, RFLP analysis was conducted to remove redundant clones. Based on the restriction pattern, salicylate-inducible clones were divided into five types (SAL1A, SAL1H, SAL6F, SAL7A, SAL10D), 3-methylcatechol-inducible clones into two types (MECA2G, MECA5D), and 4-chlorocatechol-inducible clones into three types (CHLO4C, CHLO6C, CHLO8F). These clones were tested for cross-reactivity, which revealed that SALM2B had dual specificity for salicylate and 3-methylcatechol, and MECA7B for 3-methylcatechol and 4-chlorocatechol. In total, we recovered 12 types of clone (Figure 1, Table S1). The maximum induction rate was observed for MECA7B when 3-methylcatechol was used as an inducer (39.3-fold induction). Minimum induction rates were observed for SALM2B (3.0-fold induction with salicylate) and MECA5D (3.0-fold induction with 3-methylcatechol). HPLC analysis showed that the positive clones did not degrade the inducing compounds.
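The clone counts reported in this section can be cross-checked with a short tally. The Python sketch below uses only the numbers and clone names given in the text; the variable names are hypothetical, and treating the two dual-specificity clones as types counted in addition to the ten single-inducer RFLP types is our reading of how the total of 12 is reached.

```python
# Positive clones recovered per inducer, as reported in the text.
positives_per_inducer = {
    "salicylate": 54,
    "3-methylcatechol": 30,
    "4-chlorocatechol": 36,
}

# Non-redundant clone types after RFLP analysis, grouped by inducer.
rflp_types = {
    "salicylate": ["SAL1A", "SAL1H", "SAL6F", "SAL7A", "SAL10D"],
    "3-methylcatechol": ["MECA2G", "MECA5D"],
    "4-chlorocatechol": ["CHLO4C", "CHLO6C", "CHLO8F"],
}

# Clones found to respond to two inducers in the cross-reactivity test.
dual_specificity = {
    "SALM2B": ("salicylate", "3-methylcatechol"),
    "MECA7B": ("3-methylcatechol", "4-chlorocatechol"),
}

total_positives = sum(positives_per_inducer.values())
single_inducer_types = sum(len(names) for names in rflp_types.values())
total_types = single_inducer_types + len(dual_specificity)

print(total_positives, single_inducer_types, total_types)  # prints: 120 10 12
```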

Sequence analysis of the metagenomic fragments
To determine whether selected clones contained transcriptional regulators, we performed DNA sequencing analysis of the 12 positive clones (Figure 1, Table S1). We found that all but CHLO8F carried ORFs classified into known families of transcriptional regulators (Figure 1, Table S2).

Functional identification of regulatory elements by deletion analysis
To functionally identify the ORFs responsible for transcriptional activation, we created a series of deletion derivatives that were subjected to induction tests (Figure 2).
In contrast, the deletion derivative of the receptor histidine kinase homolog MECA5D_ORF3 (pMECA5D/SphI) constitutively expressed GFP. In this variant, the response regulator MECA5D_ORF4, which is located adjacent to the histidine kinase, might have lost its activity, i.e., transcriptional repression, due to the inactivation of the cognate kinase.
Regarding CHLO8F, a series of deletion mutants (pCHLO8F/SalI, pCHLO8F/EcoRV and pCHLO8F/HincII) was constructed, but all of them had induction properties similar to those of the original clone. Most of the ORFs included in pCHLO8F encoded flagellar proteins (Table S1), and we could not identify protein elements responsible for the specific induction by 4-chlorocatechol.

Induction specificity of the metagenomic transcriptional regulators
We next used the nine deletion derivative clones to test the induction specificity towards 18 aromatic compounds. Induction was performed in dLB medium in the presence and absence of 2 mM test compound. As shown in Figure 3, our transcriptional regulators retrieved from the metagenome are functionally diverse both in specificity and induction rate.
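One way to summarize such an induction test is to count, for each clone, how many compounds induce it and to record its highest induction rate. The Python sketch below assumes that the same 3.0-fold cutoff used in the screening defines an inducing compound, which the text does not state for this experiment. The function name, clone labels, compound labels, and all numerical values are placeholders for illustration and are not measurements from this study.

```python
def specificity_profile(induction_by_compound: dict, threshold: float = 3.0):
    """Return (sorted list of inducing compounds, highest induction rate)."""
    inducers = sorted(
        compound
        for compound, rate in induction_by_compound.items()
        if rate > threshold
    )
    max_rate = max(induction_by_compound.values())
    return inducers, max_rate


# Placeholder induction rates (NOT data from this study).
example_profiles = {
    "clone_narrow": {"compound_1": 25.0, "compound_2": 1.1, "compound_3": 0.9},
    "clone_broad": {"compound_1": 3.2, "compound_2": 3.5, "compound_3": 3.1},
}

for clone, profile in example_profiles.items():
    inducers, max_rate = specificity_profile(profile)
    print(clone, len(inducers), max_rate)
```

Under this summary, a narrow-specificity regulator shows few inducing compounds with a high maximum, whereas a broad-specificity regulator shows many inducing compounds with a low maximum.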
The phylogenetic relationship of the metagenomic and functionally characterized AraC-type transcriptional regulators is shown in Figure S2. SAL10D_ORF7 had no significant similarity to known AraC-type transcriptional regulators of aromatic hydrocarbon degradation pathway operons.
The phylogenetic relationship of the metagenomic and functionally characterized two-component regulatory system-type transcriptional regulators is shown in Figure S3. Some toluene degradation operons and styrene degradation operons are regulated by two-component signal transduction systems. Their response regulators belong to the NarL-like helix-turn-helix (HTH) family [27], but SALM2B_ORF6 and MECA5D_ORF4 belonged to the OmpR-like HTH family, which lacks homology to known regulators of aromatic compound degradation. Among the nearest neighbors, SALM2B_ORF6 shared 53% amino acid sequence identity with PmrA (Q02FP6) and SALM2B_ORF5 shared 30% identity with PmrB (Q02FP5). The PmrA-PmrB system is activated by cationic antimicrobial peptides and is involved in the regulation of resistance to polymyxin B and cationic antimicrobial peptides in P. aeruginosa [28]. However, it is not known whether this system is regulated by aromatic compounds.
MECA5D_ORF4 (response regulator) shared 72% amino acid sequence identity with CopR (1909226A) and MECA5D_ORF3 (histidine kinase) shared 37% amino acid sequence identity with CopS (1909226B). The CopR-CopS system is involved in the regulation of copper resistance in P. syringae [29]. CopR directly regulates the copper resistance operon and CopS is activated by high copper concentrations, but how this system responds to aromatic compounds remains unknown.
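Identity values such as those quoted above are derived from pairwise alignments. The Python sketch below shows one common way to compute percent identity from two already-aligned sequences. The source does not state which definition was used, so the choice of denominator here (aligned positions where neither sequence has a gap) is an assumption; some tools instead divide by the full alignment length including gaps. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over gap-free aligned positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return 100.0 * matches / compared


# Toy alignment (not real protein sequences from this study).
print(percent_identity("MKV-LT", "MRVALT"))  # 4 of 5 gap-free positions match
```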

Conclusion
Using reporter assay-based screening of a metagenomic library, we successfully identified transcriptional regulators with different compound specificities and induction rates. The majority of these regulators were classified into known regulator types, including LysR, AraC, and two-component. In general, LysR-type regulators had relatively narrow specificities with high induction rates, while two-component-types had wide specificities with low induction rates. A large number of transcriptional regulators have been deposited in sequence databases, but their functions (i.e., compound specificity and induction rate) remain unknown. Our results increase our understanding of the sequence-function relationship of transcriptional regulators.

Author Contributions
Conceived and designed the experiments: TU KM. Performed the experiments: TU KM. Analyzed the data: TU KM.