Next generation sequencing reveals the antibiotic resistant variants in the genome of Pseudomonas aeruginosa

Rapid progress in next generation sequencing and allied computational tools have aided in identification of single nucleotide variants in genomes of several organisms. In the present study, we have investigated single nucleotide polymorphism (SNP) in ten multi-antibiotic resistant Pseudomonas aeruginosa clinical isolates. All the draft genomes were submitted to Rapid Annotations using Subsystems Technology (RAST) web server and the predicted protein sequences were used for comparison. Non-synonymous single nucleotide polymorphism (nsSNP) found in the clinical isolates compared to the reference genome (PAO1), and the comparison of nsSNPs between antibiotic resistant and susceptible clinical isolates revealed insights into the genome variation. These nsSNPs identified in the multi-drug resistant clinical isolates were found to be altering a single amino acid in several antibiotic resistant genes. We found mutations in genes encoding efflux pump systems, cell wall, DNA replication and genes involved in repair mechanism. In addition, nucleotide deletions in the genome and mutations leading to generation of stop codons were also observed in the antibiotic resistant clinical isolates. Next generation sequencing is a powerful tool to compare the whole genomes and analyse the single base pair variations found within the antibiotic resistant genes. We identified specific mutations within antibiotic resistant genes compared to the susceptible strain of the same bacterial species and these findings may provide insights to understand the role of single nucleotide variants in antibiotic resistance.


Background
Pseudomonas aeruginosa is a major opportunistic pathogen causing acute and chronic infections in human community. It is a fascinating, ubiquitous, gram-negative bacterium that can PLOS

Antimicrobial susceptibility testing
The antimicrobial susceptibility of isolates was tested using the disc diffusion method (E-test1) against following nine different antibiotics; piperacillin/tazobactam (PPT), ceftazidime (CAZ), aztreonam (AZT), amikacin (AK), gentamicin (GN), ciprofloxacin (CIP), imipenem (IMP), meropenem (MPM) and colistin (CL). The reference strain P. aeruginosa ATCC # 27853 was used to validate the technique. The calibrated carrier strip and the intersection of the strip to the edge of the inhibitory zone indicates the minimum inhibitory concentration (MIC) [19]. The clinical isolates were chosen for whole genome sequencing based on the antimicrobial susceptibility levels screened against different antibiotics.

High-throughput whole genome sequencing
Genomic DNA was isolated from P. aeruginosa clinical isolates using the DNeasy DNA extraction kit (Qiagen, USA) according to the manufacturer's instructions. The samples were then quantified using Qubit. All genomic DNA were fragmented using Covaris S2 at the temperature of 5.5 to 6˚C for 40 seconds. The fragmented DNA was end repaired, added with dA base and ligated with Illumina indexed adapters. Size selections of the samples were performed using Invitrogen 2% agarose E-gels. The selected DNA fragments with adapter molecules on both ends underwent 10 cycles of PCR amplification and was sequenced using whole genome shotgun Illumina Hiseq 2000 flow cell V3 with paired-end libraries (~200 base-pair insert size).

Read pre-processing
Adapter sequences and low quality reads were removed with quality score filter of ! 30 using PRINSEQ version 0.20.3 [20]. The following types of reads were removed: reads having 'N' in more than 10% of the total bases of that read, reads with Phred quality score less than 20 and reads shorter than 50 bp.

Genome assembly
The sequenced reads were assembled using SPAdes Assembler version 3.8.1 [21]. Assembler was run using iterative kmer lengths ranging from 27 to 77. The assemblies were mapped against P. aeruginosa reference genome POA1 to evaluate the core genome average identities and completeness. The removal of probable contaminants was performed by BLAST against the assembly sequences of mitochondrial and primates'/rodents chromosome databases. The finalized draft genomes were submitted to Rapid Automated using Subsystem Technology (RAST) [22] server for gene predictions and annotations. Prokka Version 1.11 (Prokaryotic annotation) was used to perform the gene prediction [23]. The prediction is relied on the existing annotation resources such proteins and coding DNA sequences (CDS).

Analysis of single nucleotide polymorphism (SNP)
To identify genetic differences at genome level, the sequencing reads of each strain were mapped to its corresponding reference genome P. aeruginosa POA1 using Bowtie v0.12.0 software [24]. High-confidence SNP variants data sets were created by applying series of filters. The Variants were identified and extracted using Samtools with the following parameters: each variant must be supported by at least 10 reads (-d = 10), only the variants that were supported by maximum of 10,000 reads were considered for downstream analysis (-D = 10,000) and the reads must be supported by mapping quality of equal or larger than 25 (MQ = 25). Variants harbored by more than 90% of reads were included for further analysis and the depth of coverage was 25x. The effects of nsSNPs were annotated using SnpEFF.

Genome comparison
Protein Basic Local Alignment Search Tool (BLASTp) was used to match the sequence similarities between genomes of clinical isolates and the P. aeruginosa reference genome PAO1 [25]. Genome visualization of ten clinical isolates and the similarity between P. aeruginosa reference genome (NC_002516.2) was performed using BLAST Ring Image Generator (BRIG) [26].

Heat map
Heat maps were generated using Complex Heat map package from Bioconductor in R [27]. Clusters are predicted using Euclidean distance method.

Results
Antibiotic-resistance profiles of P. aeruginosa clinical isolates A total of ten P. aeruginosa clinical isolates with resistant to three or more of the nine antibiotics tested were chosen for the study. The MIC values of nine antibiotics tested against the bacterial isolates are shown in Table 2. Isolates PAS2 and PAS10 were highly resistant to all the antibiotics tested. In contrary, isolates PAS4 and PAS8 were susceptible to at least 3 antibiotics (Imipenem, Meropenem and Collistin). PAS1 and PAS7 were only susceptible to Amikacin and Piperacillin/Tazobactam respectively. All ten isolates were resistant to Ceftazidime, Aztreonam, Gentamicin and Ciprofloxacin.

Genome sequencing and assembly
The sequencing consisted of 1 lane 100 bp paired-end reads, yielded approximately 0.8 Gbp to 2.8 Gbp for each clinical isolates. The draft genome assemblies for the ten Pseudomonas clinical isolates have been submitted to NCBI bioproject under the project accession number PRJNA388357 (http://www.ncbi.nlm.nih.gov/bioproject/388357). More than 80% of the reads are above Phred quality score of 30 indicating high quality sequencing data. The sequenced data were subjected to contamination screening and the genome sizes of final assemblies were ranging from 6.6 Mbp to 7.0 Mbp. The de novo assembly of genome sequence data revealed that the number of contigs (>200 bp) varied from 185 to 492 for each genome. These contigs were aligned to the reference genome of P. aeruginosa PAO1, and manual genome finishing was performed. The maximum contig size among the genomes was 485,560 bp aligned to PAS4. Genomes with the mean identity ranged from 98.77 for PAS 7 to 98.83 for PAS 9. The GC content ranged from 65.79 for PAS 1 to 66.19 for PAS 2. A summary of the genome sequence data and assembly are shown in Table 3.

Genome architecture
Fig 1 contains the circular map of the ten newly sequenced clinical isolates compared to the reference genome PAO1. The inner most circle represents the reference genome of P. aeruginosa (NC_002516.2) and outer most circle with labels represent the protein-coding regions (CDS) with annotation on the genome. The shared identity of each isolate with the reference genome is represented in different colors, which denotes the BLASTn matches between 70% to 100% nucleotide identities. The blank spaces in the rings represent the matches less than 70% or no BLAST matches to the reference genome. The nsSNPs harbored on the genes compared to the reference genome are represented as individual gene names.

Genome polymorphism
We focused our analysis on identifying nsSNPs that results in amino acid changes within the functional gene sequences. In addition, we also screened for point mutations that leads to premature termination of codons or nucleotide deletions that create frame-shift mutations. We initially screened the nsSNPs in clinical isolates compared to the P. aeruginosa reference genome, PAO1.  Application of whole genome sequencing to identify the single nucleotide polymorphism in (Fig 2). Majority of the nsSNPs were present within the PA2018, PA2019 and PA5471 loci that code for aminoglycoside resistant genes mexAB-oprM and mexXY-oprM responsible for the regulation of multi-drug efflux pump (Table 4).
Then we compared the resistant clinical isolates with the susceptible strains to extract specific nsSNPs found only in resistant isolates. The antibiotic profiles of Ceftazidime, Aztreonam, Gentamicin and Ciprofloxacin were not included as all 10 isolates were resistant to these antibiotics. The isolates classified as resistant for 5 different antibiotics were screened further to identify the nsSNPs. For example, PAS 1 was susceptible to Amikacin and 9 isolates (PAS 2-10) were resistant to the antibiotic. Hence we compared the nsSNPs present within resistant isolates. As shown in S1 Table, the amikacin resistant isolates possess two unique single nucleotide deletions that results in a frameshift mutation that truncates the protein products. Deletion at nucleotide position 1275766 results in a frameshift mutation and truncation of protein NapA (periplasmic nitrate reductase protein). Similarly, deletion at nucleotide position 1402975 results in truncation of 3-mercaptopyruvate sulfurtransferase. The detailed list of unique nsSNPs identified in the resistant clinical isolates compared to susceptible isolates for 5 different antibiotics is shown in the S1 Table. S2 Table, S3 Table, S4 Table and S5 Table. Differential mutation screening between the isolates were performed to identify specific nsSNPs associated with the particular antibiotic resistances.
In order to investigate nsSNPs in genes annotated to be associated in antibiotic resistance [8] or virulence factors [28], we specifically extracted the nsSNPs presented in various antibiotic resistant genes of resistant and susceptible clinical isolates. The number of nsSNPs shared within a gene across all clinical isolates is shown in Fig 3. For example, 6 nsSNPs were found within the colistin resistant isolate (PAS1 and 2) when compared to colistin susceptible isolates (PAS 3 and 8) at the gene PA3292 that codes for a hypothetical protein.
Similarly, 6 nsSNPs were also seen in all the isolates except PAS 8 at the PA1923 gene that Application of whole genome sequencing to identify the single nucleotide polymorphism in Pseudomonas aeruginosa clinical isolates codes for cobaltochelatase subunit CobN. Many of the polymorphic genes among these clinical isolates were mainly associated with cell wall, multi-drug efflux pump, protein Application of whole genome sequencing to identify the single nucleotide polymorphism in Pseudomonas aeruginosa clinical isolates secretion, DNA replication and genes involved in repair mechanism. In addition, nsSNPs were found within genes responsible for resistance to β-lactams; transporter (PA4218), transcriptional regulators (PA1184, PA0032, PA5293 and PA2383) and a hypothetical protein (PA3693). These were all collectively reported to play a role in decreased antibiotic uptake and affecting cell wall permeability [29]. Notably, a SNP at position 2213177 in all antibiotic resistant isolates changed tryptophan into a stop codon. Furthermore, two frame shift mutations at the napA and PA1292 genes of Amikacin resistant genes and a mutation resulting in stop codon at amino acid position Q65 in the Meropenem resistant isolates might have implications in the drug resistant mechanism. These findings indicate that the nsSNPs identified in the clinical isolates may contribute towards antibiotic resistance between the resistant clinical isolates examined in the study. Further targeted evidence using a large cohort of clinical isolates and site directed mutational analysis of these antibiotic resistant genes will provide further insights in the antibiotic resistant mechanism of the P. aeruginosa. Application of whole genome sequencing to identify the single nucleotide polymorphism in Pseudomonas aeruginosa clinical isolates

Discussion
Whole genome sequencing promises high resolution, high-throughput genome sequencing with insights to the repertoire of genetic polymorphism [3,11]. This approach can be used to identify the single nucleotide polymorphism that either alters a single amino acid or leads to a stop codon, and frame-shift mutations that alters a gene sequence itself resulting in truncated proteins. Here, we have investigated the genetic variation in P. aeruginosa hospital isolates by high-throughput NGS technology and comparative genomics.
The clinical isolates included in this study were selected based on their multidrug resistant profiles against a total of nine antibiotics. To study the genomic variation among the isolates with varying antibiotic resistance characteristics, we analyzed the genomic sequences of the isolates using WGS technique and compared the data with P. aeruginosa reference genome PAO1. Application of whole genome sequencing to identify the single nucleotide polymorphism in Several non-synonymous SNPs that may play an important role in antibiotic resistance have been identified, with insights to enhance our current understanding of the factors that influence the antibiotic resistance regime of hospital isolates. Though we have studied the nsSNPs of entire genome of all hospital isolates, we specifically extracted nsSNPs that were in genes reported to play a major implication in antibiotic resistance [28] within the antibiotic resistant and susceptible clinical isolates. Several mechanisms of antibiotic resistance in P. aeruginosa have been reported: by altering porins or outer membrane properties, resistance through enzymatic modification of drugs, and resistance through efflux systems and chromosomal mutations within regulatory genes [30][31][32][33].
We have identified specific nsSNPs within the mexXY-oprM and mexAB-oprM genes, which are known to play a major role in antibiotic resistance through efflux pumps and both these genes part the same efflux porin OprM [34,35]. In addition, the efflux system Mex-XY-OprM is also recognized to be contributing in aminoglycoside resistance in P. aeruginosa, and MexAB-OprM is reported to confer resistance to β-lactams, aminoglycosides and fluoroquinolones resistance [31,36]. Also a polymorphism resulting in a stop codon at locus PA2020 corresponding to a mexZ transcriptional regulator may have implications in drug resistance [37]. Collectively, P. aeruginosa may use a combination of all these mechanisms to achieve multi-drug antibiotic resistance.
The unique nsSNPs identified within the genes coding for aminoglycoside modifying enzymes (AME's) and 16S rRNA methylase genes, mutations leading to colistin resistance by altering the bacterial outer membrane, and resistant to β lactam by hyper production of β-lactamase AmpC may together play a crucial role in multi-drug resistance. Our results confirm the assumption of other authors that the drug resistance may be the result of a pool of pathogenicity-related genes that interact in various combinations [29,38,39]. In addition to the unique mutations, there is high degree of sequence conservation between resistant and susceptible clinical isolates. This conservation within the virulence genes of different P. aeruginosa clinical isolates may be a factor that is challenging to device strategies for the treatment of infections. In addition, the extensive conservation of virulence genes in the genomes regardless of drug resistance may be due to the fact that the disease-causing ability of this opportunistic pathogen relies on a set of highly conserved pathogenic mechanisms [29]. Moreover, the resistance of one isolate for several antibiotics may be due to the pleiotropic effects of resistance causing gene [39].
Drug resistance to several antibiotics may be due to a combination of mutations resulting in overexpression of several multi-drug efflux pumps, alteration of expression of enzymes and structural components involved in peptidoglycan outer membrane stability, mutations affecting gyrase activity and mutations taking effect in aminoglycoside phosphotransferases [29]. In our study, the analysis of the genomes of drug resistant clinical isolates possessed non-synonymous mutations in at least one gene known to be involved in antibiotic resistance (for example, oprD, gyrA or B, mex-type efflux systems, parC, rpoB, baeS). The genes carrying these mutations are coding for proteins involved in outer membrane permeability, multi-drug efflux pumps, gyrases and drug modifying enzymes (aminoglycoside modification enzymes and/ or β -lactamases) [39]. However, a limitation of this study is that further targeted evidence with site directed mutational analysis on the reported nsSNPs harboring the resistant genes is respect to their antibiotic resistant profile is depicted in the heat-map. The red colour indicates there are 6 nsSNPs harbouring in that gene among different isolates and a blue colour represents there are no nsSNPs within the gene among isolates. https://doi.org/10.1371/journal.pone.0182524.g003 Application of whole genome sequencing to identify the single nucleotide polymorphism in Pseudomonas aeruginosa clinical isolates required. Such analysis will provide further insights to the overall phenotype alteration and synergistic effect of these nsSNPs in the antibiotic resistant mechanism.
In conclusion, this study demonstrates the potential of next generation high-throughput whole genome sequencing to compare the genome polymorphism between clinical isolates. We have identified a number of key mutations that may play a key role in altering antibiotic resistant genes leading to genetic polymorphism. This should form the basis of our future research to study the prevalence of such multidrug resistance P. aeruginosa in the Asia pacific region. This technology can be utilised in conjunction with current epidemiological studies, diagnostic assays and antimicrobial susceptibility tests to reveal the genetic variation and pathogen biology of "high-risk" pathogens including P. aeruginosa [11].
Supporting information S1