Identification and Characterization of Two Novel bla KLUC Resistance Genes through Large-Scale Resistance Plasmids Sequencing

Plasmids are important carriers of antibiotic resistance determinants and can disseminate various drug resistance genes among species or genera. Using a high-throughput sequencing approach, two pools of Escherichia coli plasmids (named E1 and E2, each derived from 160 clinical E. coli strains isolated in different time periods) were sequenced and analyzed. A total of 20 million reads were obtained and mapped onto known resistance gene sequences. As a result, 36 types of antibiotic resistance genes belonging to 9 classes were identified. Among these genes, 25 and 27 single nucleotide polymorphisms (SNPs) were detected in the E1 and E2 samples, of which 9 and 12, respectively, are nonsynonymous substitutions. A novel genotype of bla KLUC (named bla KLUC-3) was identified; its close relatives, bla KLUC-1 and bla KLUC-2, have previously been reported on the Kluyvera cryocrescens chromosome and on an Enterobacter cloacae plasmid, respectively. It shares 99% and 98% amino acid identity with Kluc-1 and Kluc-2, respectively. Further PCR screening of 608 Enterobacteriaceae isolates yielded a second variant (named bla KLUC-4). Kluc-3 conferred resistance to several cephalosporins, including cefotaxime, whereas Kluc-4 did not confer resistance to any of the antibiotics tested. This may be due to the replacement of a positively charged residue, Arg, by a neutral residue, Leu, at position 167, which is located within an omega-loop. This work represents a large-scale study of resistance gene distribution, diversification and genetic variation in pooled multidrug resistance plasmids, and provides insight into the use of high-throughput sequencing technology for microbial resistance gene detection.


Introduction
Multidrug-resistant E. coli, a clinically significant pathogen, has become a major threat to human health all over the world [1][2][3]. It has become a major cause of hospital-acquired infections worldwide, mostly due to rapid acquisition of resistance determinants by horizontal gene transfer via mobile genetic elements such as plasmids, integrons and transposons [4]. More and more chromosomally encoded antibiotic resistance genes are emerging on mobile genetic elements, especially on plasmids, from which they are easily disseminated. Examples include the CTX-M family of extended-spectrum β-lactamases (ESBLs), which are mainly encoded on Enterobacteriaceae plasmids and have become extensively widespread enzymes in the past two decades [5][6][7]. Quinolone resistance in Enterobacteriaceae is commonly considered to be the result of chromosomal mutations [8]. Recently, however, plasmid-mediated quinolone resistance (PMQR) has been discovered. It is related to a variety of genes, including qepA, qnr, oqxAB and aac(6')-Ib-cr, and spreads through plasmid transfer among different Enterobacteriaceae strains [9,10]. Many of these PMQR genes are accompanied by ESBL and/or aminoglycoside resistance genes on the same plasmid [8]. The increasing number and the complicated composition of resistance genes on plasmids pose a potential threat to the empirical treatment of infections. Although plasmids are the most prevalent carriers of antimicrobial resistance determinants, the total resistance genotypes and gene abundance on them are difficult to assess unless complete plasmid sequences are available. For large-scale detection of resistance gene profiles, PCR methods are cumbersome and time-consuming. Furthermore, they are unsuitable for detecting the relative abundance of resistance genes in mixed samples, which could provide insight into epidemic trends of resistance genes.
Second-generation sequencing technologies are generally used to resequence genomes whose reference sequences are available [11,12]. Solexa sequencing, one of the second-generation sequencing techniques, is high-throughput and can be used to sequence templates simultaneously at a very large scale [13]. It has also been successfully used in de novo sequencing and the assembly of large eukaryotic and bacterial genomes [11][14][15][16]. In this study, we applied high-throughput parallel sequencing to investigate resistance gene distribution, diversification and genetic variation in plasmids of 320 strains of multidrug-resistant E. coli (pooled as E1 and E2, each consisting of 160 clinical E. coli strains isolated in different time periods). Comparative genomics analyses led us to identify all the genes associated with previously known antibiotic resistance, which in turn permitted investigation of new subtypes of resistance genes.

Bacterial Strains Collection, Plasmid Extraction and High Throughput Sequencing
A total of 928 clinical strains of the Enterobacteriaceae family, isolated from sputum, urine, pus or blood samples of patients, were collected in the First Affiliated Hospital of Wenzhou Medical College between 2002 and 2010. Among these isolates, 320 strains are E. coli, which were isolated in 2002-2003 (160 strains) and 2008-2009 (160 strains). The other 608 strains included K. pneumoniae (113 strains), S. marcescens (132 strains), E. cloacae (190 strains), E. aerogenes (99 strains) and C. freundii (74 strains). The bacterial samples in this study were collected from a large anonymous database according to the protocols of the Wenzhou Medical College Ethics Committee, so detailed participant information could not be obtained. Initially, all the participants orally consented that the isolates could be used anonymously in scientific study. The strains were identified by the Vitek-60 microorganism auto-analysis system (BioMerieux Corporate, France). For the pooled plasmid sequencing, each clinical E. coli strain was incubated independently overnight in 5 ml Luria-Bertani broth at 37°C for about 16 hours to reach an optimum optical density (OD600 = 1.5 ± 0.2). The cultures were pooled and 100 ml of mixed bacteria was used to extract the plasmids. Plasmids were extracted by the alkaline lysis method as described previously [17]. According to the period of isolation, the bacteria were pooled as E1 (160 E. coli strains isolated in 2002-2003) and E2 (160 E. coli strains isolated in 2008-2009), and the plasmids of each pool were subsequently used for high-throughput sequencing by Illumina/Solexa technology.

Reference Resistant Gene Sequence Collection, Sequencing Read Mapping and SNP Detection
Antibiotic resistance protein sequences were collected from ARDB (Antibiotic Resistance Gene database, http://ardb.cbcb.umd.edu) [18]. CD-HIT (http://bioinformatics.ljcrf.edu/cd-hit) was used for clustering protein sequences [19]. TBLASTN was used to compare protein sequences with the nucleotide collection database of NCBI. Comparison of nucleotide sequences was made using BLASTN [20]. The collected gene sequences were assembled using the Phred/Phrap/Consed software package [21]. Mapping of sequencing reads onto references was performed using SOAPaligner 2.20 (2 mismatches per read permitted), and SOAPsnp 1.03 was employed to detect and annotate SNPs [22,23]. A SNP was called when at least 3 identical high-quality bases differing from the other bases at the same locus were detected. The relative abundance (sequencing depth) of a given gene was calculated as the cumulative nucleotide length of the reads mapped to the gene divided by the gene size. Other bioinformatics tools used in this study were written in Perl and BioPerl (http://www.perl.org/).

bla KLUC-3 and bla KLUC-4 Positive Strain Screening, Cloning and Sequence Determination
The primers for positive strain screening and complete ORF cloning were designed according to the consensus sequence of bla KLUC-3. The screening primers were 5'-CGCTAAGCGTAGAGCAGAAACT-3' and 5'-TCAGGTCACCCTTTTTGATCTC-3', with a product of 228 bp in length. The primers for complete ORF cloning were 5'-GGGATCCATGGTTAAAAAATCATTACGCC-3' and 5'-GAGATCTCTATAATCCCTCAGTGACGATT-3', each carrying a flanking restriction endonuclease adapter (BamHI for the forward primer and BglII for the reverse primer). Whole genomic DNAs of 928 clinical isolates of the Enterobacteriaceae family were used as templates for PCRs. The complete ORF fragment of the PCR product was isolated from an agarose gel and cloned into a pMD18 vector (TaKaRa). The recombinant clones were picked and sequenced on an ABI 3730 automated sequencer.
The recombinant plasmid (pMD18::bla KLUC) was digested with BamHI and BglII, and the ORF fragment was recovered and further cloned into a pET28a vector (TaKaRa). Finally, the recombinant plasmids (pET28a::bla KLUC) were transformed into the host strain BL21. The complete ORF sequences of bla KLUC-3 and bla KLUC-4 have been deposited in GenBank under the accession numbers JX185316 and JX185317, respectively.
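The adapter design of the cloning primers can be checked with a short script. The following is a minimal sketch in Python, not the authors' tooling (their scripts were written in Perl). It assumes that the hyphens and spaces inside the printed primer sequences are typesetting breaks and removes them; the recognition sequences GGATCC (BamHI) and AGATCT (BglII) are the standard ones, and the function names are hypothetical.

```python
# Sketch: locate the restriction adapters in the ORF cloning primers.
# Assumption: hyphens/spaces in the printed primers are line-break artifacts.

SITES = {"BamHI": "GGATCC", "BglII": "AGATCT"}


def clean(primer):
    """Remove typesetting hyphens and spaces and upper-case the sequence."""
    return primer.replace("-", "").replace(" ", "").upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(primer):
    """Map each enzyme whose site occurs in the primer to its 0-based offset."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


# Primers as printed in the text, cleaned of break characters.
FORWARD = clean("GGGATCCATGGTTAAAA AATCATTACGCC")
REVERSE = clean("GAGATCTCTATAATCCCT-CAGTGACGATT")

if __name__ == "__main__":
    print("forward:", find_sites(FORWARD), "next codon:", FORWARD[7:10])
    print("reverse:", find_sites(REVERSE),
          "next codon (coding strand):", reverse_complement(REVERSE[7:10]))
```

Under these assumptions, the forward primer places an ATG directly after the BamHI site, and the reverse primer places the complement of a TAG stop codon directly after the BglII site, which is consistent with amplification of a complete ORF.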

Conjugation and Susceptibility Test
Conjugation experiments were performed in mixed broth cultures. E. coli D41 and E. cloacae Y214 were used as the donors and rifampin-resistant EC600 (Rifr) was used as the recipient. Overnight cultures (incubated at 37°C with shaking) of the donor strain (500 µl) and the recipient strain (500 µl) were mixed together in 4 ml fresh Luria-Bertani broth and incubated for 6 hours at 37°C. The mixture was then inoculated onto a Trypticase soy agar (TSA) plate containing rifampin (Sigma; 512 mg/L) plus ceftriaxone (Roche; 2 mg/L) and incubated for 18 hours at 37°C. The colonies that grew on the selective medium were picked and identified using the Vitek-60 system. The transferred plasmid was extracted and identified by PCR with the bla KLUC screening primers.
Minimal inhibitory concentrations (MICs) of 18 antibiotics or combination antimicrobial drugs were determined by the agar dilution method for the two donors, the two transconjugants, the recipient strain BL21, BL21[pET28a::bla KLUC-3] and BL21[pET28a::bla KLUC-4], in accordance with the guidelines of the Clinical and Laboratory Standards Institute (CLSI). Antimicrobial agents were obtained from the National Institute for the Control of Pharmaceutical and Biological Products (NICPBP) and pharmaceutical companies in China. E. coli ATCC 25922 was used as a quality control for the MIC determinations.

Antibiotic Resistance Gene Distribution
To obtain relatively comprehensive reference sequences for the resistance genes, we collected the protein sequences from ARDB and our laboratory to complete a TBLASTN search against the nucleotide collection database (nt database) at NCBI with a constant E-value (1e-80). The internal fragments of nucleotide sequences which matched the proteins were then extracted. They were complete or partial Open Reading Frames (ORFs) of antibiotic resistance genes. To eliminate redundant members of this data set, we assembled these sequences and obtained 492 contigs (Table S1). They were used as ''bait'' references.
The samples E1 and E2, sequenced on an Illumina Genome Analyzer, generated 819 and 694 million nucleotides, respectively. All the reads ranged from 73 to 75 nucleotides in length. Mapping the reads onto the references identified the resistance genes present, and the number of reads mapped to a specific reference also indicated the relative abundance of that gene in the sequenced samples. The mapping showed that the two samples contained a total of 36 hits related to resistance to β-lactams, aminoglycosides, macrolides, fluoroquinolones, sulfonamides, tetracyclines, chloramphenicol and rifampin. The most abundant gene was bla TEM, with a sequencing depth of 5496-fold (E1+E2), and the average sequencing depth was 727-fold (Table 1). According to the ratio of mapped reads to the references, resistance genes related to β-lactams and aminoglycosides were the most prevalent, not only in total abundance but also in the number of corresponding genotypes. Of the 36 identified resistance genotypes, bla PSE, bla KLUC, qnrA and tetM were found only in sample E1, while the other 32 types of resistance genes appeared in both samples. The genotypes bla TEM, strB, strA, floP, aacC2 and sulI were the most abundant genes in the two samples.
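The depth values above follow the definition given in the methods: the cumulative nucleotide length of the reads mapped to a gene divided by the gene size. A minimal sketch of that calculation in Python is shown below. The function name and the example numbers are hypothetical and are not taken from Table 1.

```python
# Sketch: relative abundance (sequencing depth) of one reference gene.
# depth = cumulative length of mapped reads / gene length


def sequencing_depth(mapped_read_lengths, gene_length):
    """Return the fold coverage of a gene from the lengths of its mapped reads."""
    if gene_length <= 0:
        raise ValueError("gene_length must be positive")
    return sum(mapped_read_lengths) / gene_length


if __name__ == "__main__":
    # Hypothetical example: 1000 reads of 75 nt mapped to a 750-bp gene.
    toy_reads = [75] * 1000
    print(sequencing_depth(toy_reads, 750))
```

Because the two pools yielded different total numbers of nucleotides (819 versus 694 million), depths from E1 and E2 would presumably need to be normalized to total output before being compared directly; that step is an assumption here and is not described in the text.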

Polymorphism of the Resistance Genes
Polymorphism analyses revealed 25 and 27 SNPs distributed in 9 and 10 resistance genes in the E1 and E2 plasmid samples, respectively (Table 2). Among these 9 and 10 resistance genes, more than one half (5 and 7 genes, respectively) are associated with aminoglycoside resistance. Of the remaining 4 and 3 genes, 3 and 1 are involved in β-lactam resistance. Surprisingly, the genotype aacC2 (also known as aac(3)-IIa) carried 10 and 11 SNPs in the E1 and E2 samples, respectively, whereas the sulI gene, whose abundance was approximately identical to that of aacC2 in the corresponding samples, did not show any SNP. Nine SNPs (36.0%) in E1 and twelve (44.4%) in E2 are nonsynonymous substitutions (Table 2). To determine whether the sequencing reads were sufficient to reflect the SNP profiles, we used different fractions of the total reads to identify SNPs. As the number of sequence reads mapped onto the references increased, the number of SNPs reached a plateau (the maximal number of SNPs for each sequencing library) when 60% of the total reads were used. This suggests that the sequencing data were deep enough to reflect the majority of the genetic variation in the two pooled plasmid samples.
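The two steps behind these counts, calling a SNP and deciding whether it is nonsynonymous, can be illustrated with a small example. This is a sketch in Python and not a reimplementation of SOAPsnp. It reads the methods' rule as "at least 3 identical high-quality bases that differ from the reference base", assumes the observed bases have already been quality-filtered, and uses the standard genetic code. The function names and the toy ORF are hypothetical.

```python
# Sketch: threshold-based SNP calling and synonymous/nonsynonymous labeling.
from collections import Counter

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def call_variant_bases(ref_base, observed_bases, min_support=3):
    """Return non-reference bases seen at least min_support times at one locus."""
    counts = Counter(b for b in observed_bases if b != ref_base)
    return sorted(b for b, n in counts.items() if n >= min_support)


def classify_snp(orf, pos, alt_base):
    """Label a substitution at 0-based ORF position pos by its coding effect."""
    start = pos - pos % 3                    # first base of the affected codon
    ref_codon = orf[start:start + 3]
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "nonsynonymous"


if __name__ == "__main__":
    toy_orf = "ATGCGTTAA"                    # Met-Arg-stop, illustrative only
    print(call_variant_bases("G", "GGGGTTTA"))
    print(classify_snp(toy_orf, 4, "T"))     # CGT (Arg) to CTT (Leu)
    print(classify_snp(toy_orf, 5, "C"))     # CGT (Arg) to CGC (Arg)
    print(round(100 * 9 / 25, 1), round(100 * 12 / 27, 1))
```

The last line reproduces the reported proportions of nonsynonymous SNPs, 9 of 25 in E1 and 12 of 27 in E2.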

Identification of Two Novel bla KLUC Genes
Kluc-1 has been identified as a close relative of the CTX-M type class A ESBLs, sharing ~85% amino acid identity with CTX-M-1 group members. It was first found in Kluyvera cryocrescens chromosomal DNA [24]. Its variant, Kluc-2, which differs from Kluc-1 by one amino acid at position 118, was shown to be a plasmid-mediated ESBL hosted in E. cloacae 7506 [25]. Mapping results indicated that some reads in the E1 sample matched the bla KLUC reference with an average depth of 50.7. Further analyses showed that the potential bla KLUC product differed by 2 and 3 amino acids from (sharing 99% and 98% amino acid identity with) Kluc-1 (AAK08976) and Kluc-2 (ABM73648), respectively. PCRs were performed against all 160 strains of the E1 sample to determine which strains carried this determinant. Only strain E. coli D41 gave a positive result. The full-length ORF of the new bla KLUC gene (named bla KLUC-3, GenBank Accession No. JX185316) was cloned into the pMD18 vector and sequenced. The sequencing result is identical to the mapping result of the Solexa reads.
CTX-Ms, among the most prevalent β-lactamases with 90 known members, have been found in various bacteria of the Enterobacteriaceae family, including E. coli, K. pneumoniae, E. cloacae, S. enterica, E. aerogenes, C. freundii, S. marcescens, P. mirabilis, etc. [26].

Resistance Activities of bla KLUC-3 and bla KLUC-4
To determine the resistance activities of bla KLUC-3 and bla KLUC-4, the complete ORFs of these two novel resistance genes were cloned into the pET28a vector and transformed into E. coli BL21. The MICs of a group of antimicrobial drugs were determined for the donors, the transconjugants, the transformants and the recipient controls (Table 3). BL21[pET28a::bla KLUC-3] showed resistance to several extended-spectrum β-lactams, including ceftriaxone, cefazolin and cefotaxime, but not ceftazidime. However, the transformant BL21[pET28a::bla KLUC-4] did not show resistance to any of the antimicrobial drugs examined. This may be largely attributed to the amino acid change at position 167, a substitution of Arg by Leu (R167L) in Kluc-4, leading to loss of the resistance activity (Figure 1). β-lactamase inhibitors, for example tazobactam, strongly reduced the activity of Kluc-3. This is consistent with the previously described resistance activity of Kluc-1 [24]. The original E. coli D41 and E. cloacae Y214 showed stronger resistance and a wider resistance spectrum against the antibiotics examined. Besides β-lactams, they are also resistant to tetracyclines, kanamycin and chloramphenicol.
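Substitution labels such as R167L, and percent identities like those reported for Kluc-3 against Kluc-1 and Kluc-2, can be derived from a gap-free alignment of two protein sequences. The sketch below, in Python, shows one way to do this. The sequences are toy strings and not the real Kluc sequences, the function names are hypothetical, and the `first_position` offset only stands in for the residue numbering scheme that a real comparison would take from the alignment.

```python
# Sketch: list amino acid substitutions and percent identity for two
# equal-length, gap-free aligned protein sequences (toy data only).


def substitutions(reference, variant, first_position=1):
    """Return labels such as 'R167L' for every mismatched aligned position."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref_aa}{pos}{var_aa}"
        for pos, (ref_aa, var_aa) in enumerate(zip(reference, variant),
                                               start=first_position)
        if ref_aa != var_aa
    ]


def percent_identity(reference, variant):
    """Return the percentage of aligned positions with identical residues."""
    if not reference:
        raise ValueError("sequences must not be empty")
    mismatches = len(substitutions(reference, variant))
    return 100.0 * (len(reference) - mismatches) / len(reference)


# Toy 10-residue fragments, numbered from a hypothetical position 161.
TOY_REFERENCE = "MVKKSLRQFT"
TOY_VARIANT = "MVKKSLLQFT"

if __name__ == "__main__":
    print(substitutions(TOY_REFERENCE, TOY_VARIANT, first_position=161))
    print(percent_identity(TOY_REFERENCE, TOY_VARIANT))
```

With full-length sequences, two or three mismatches out of a few hundred residues give identities close to the 99% and 98% quoted above, although the exact figure depends on the alignment length and the rounding convention used.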

Discussion
In this work, we applied a novel approach to detect resistance gene profiles in mixed E. coli plasmids isolated in two different periods of time. To illustrate the composition of the resistome of the bacterial populations effectively, 492 contigs were established as universal "bait" references. Because of the large number of resistance genotypes and their subtypes or variants, in particular those with high identities, gene chimeras may form during assembly of the bait references and subsequently interfere with the estimation of genotypes and their SNP distributions. Therefore, once reads were mapped to a bait reference, the result was manually checked and parsed, and the explicit genotype was used for calculating resistance gene abundance and assigning SNPs, as listed in Table 1, column 5. On the other hand, the reads generated by Solexa sequencing are short, and the complete sequence of a resistance gene cannot be covered by a single read. Thus, the genotypes that harbored SNPs might be considered heterozygous genotypes in the two pooled plasmid samples. Genes of interest, such as bla KLUC-3, and their sequence characteristics were then screened and determined by PCR combined with Sanger sequencing. Owing to its strong ability to hydrolyze cefotaxime and its high amino acid sequence similarity to CTX-M family enzymes, bla KLUC was classified as one group of the CTX-M family. Other groups of this family include CTX-M-1, CTX-M-9, CTX-M-8, CTX-M-25 and CTX-M-2; members of the same group share >94% identity, whereas ≤90% identity is observed between members belonging to distinct groups [26]. Compared with many other class A ESBLs, such as the TEMs, the CTX-Ms show lower hydrolytic activity against penicillins and a narrower spectrum of activity against cephalosporins.
The much lower hydrolytic activity of the CTX-Ms against ceftazidime clearly distinguishes them from most enzymes in the TEM and SHV families [26]. In our susceptibility tests, cloned bla KLUC-3 showed resistance to several extended-spectrum cephalosporins, such as cefazolin, ceftriaxone, cefuroxime and cefotaxime, but not to cefepime, ceftazidime, meropenem or aztreonam. Unlike bla KLUC-3, bla KLUC-4 under the same conditions did not show any resistance to the corresponding extended-spectrum cephalosporins. Its MIC values for the examined β-lactams were the same as those of the recipient BL21 (Table 3). The obvious difference in susceptibility to β-lactams between the two novel variants is probably related to the replacement of a positively charged Arg residue by a neutral Leu residue at position 167, which lies within the omega-loop (Ω-loop). The Ω-loop is a nonregular secondary structure that most frequently occurs at the surface of a globular protein. It is involved in many protein functions, such as ligand, substrate or inhibitor binding, tyrosine sulfation, and prohormone cleavage [27][28][29]. More recently, a similar phenomenon has been observed for CTX-M-93, in which a single amino acid difference from CTX-M-27 at position 169 (L169Q), located in the Ω-loop, caused significantly decreased hydrolytic activity against its best substrates, such as cefotaxime, but led to enhanced resistance to ceftazidime [30].
It has been proposed that the bla KLUC genes appear to have been acquired from a chromosomal progenitor of the CTX-M-1 group [4]. The first discovered gene, bla KLUC-1, was shown to be chromosomally encoded, but the later identified variants, bla KLUC-2, bla KLUC-3 and bla KLUC-4, were all harbored on plasmids. This indicates that mobile DNA elements such as integrons, insertion elements, transposons and plasmids might play essential roles in the transposition and horizontal transfer of the bla KLUC resistance genes. It has been reported that ISEcp1 elements, the 42,266 bp insertion sequences, could be involved in mobilization of CTX-M enzyme encoding genes [31,32]. To date, at least 90 types of β-lactam resistance genes and 1100 variants have been discovered [4]. The growing number of novel variants of each type sometimes produces resistance phenotypes with decreased or increased susceptibility to substrates, relative to those of their previously characterized relatives. In this work, we found two novel bla KLUC group members and provided an example of screening for new resistance gene subtypes and investigating the resistome in pooled clinical isolates. Many mutation-prone sites were also detected, mainly in genes associated with β-lactam and aminoglycoside resistance. Because it is cost-effective and high-throughput, second-generation sequencing technology may serve as a substitute for, or at least a companion to, microarrays in large-scale antimicrobial resistance gene detection.

Supporting Information
Table S1 Available as supplementary data for this article. (RAR)