Using comparative genomics to understand molecular features of carbapenem-resistant Acinetobacter baumannii from South Korea causing invasive infections and their clinical implications

Acinetobacter baumannii is a highly potent nosocomial pathogen that is associated with increased in-hospital mortality. Here, we investigated the changes in molecular characteristics of carbapenem-resistant A. baumannii (CRAB) isolated from the blood samples of patients admitted to a tertiary hospital in South Korea from January 2009 to July 2015. Whole genome sequencing using the Illumina MiSeq platform and multi-locus sequence typing (MLST) were performed for 98 CRAB clinical isolates. In silico analyses for the prediction of antimicrobial resistance and virulence factor genes were performed. Plasmid sequences, including complete forms, were reconstructed from the sequence reads. Epidemiologic data were collected from the hospital database. MLST using the Oxford scheme revealed 10 sequence types of CRAB, of which ST191 was the dominant type (n = 59). Although blaOXA-23 was shared by most analysed strains, the compositions of antimicrobial resistance determinants differed among sequence types. ST447 and ST451/ST1809 with a few resistance genes were isolated during the later years of the study period. The number of virulence genes increased, while that of ST191 did not change significantly over the investigation period. Intriguingly MLST types, compositions of antimicrobial resistance genes, and virulence genes had no association with clinical outcomes of CRAB bacteraemia. In conclusion, active changes in or accumulations of antimicrobial resistance determinants and virulence genes in CRAB were not observed during the research period. Molecular characteristics of CRAB had no association with clinical outcomes of CRAB bacteraemia.

Introduction CRAB bacteraemia was defined as one or more positive blood cultures for carbapenem-resistant A. baumannii and the presence of clinical features compatible with systemic inflammatory response syndrome. Preliminary species identification and antimicrobial susceptibility tests were conducted with VITEK II system (bioMérieux, Marcy l'Etoile, France). Disc-diffusion testing using antimicrobial discs (Becton Dickinson, Sparks, MD, USA) on cationic-adjusted Mueller-Hinton (MH) agar (Difco Laboratories, Detroit, MI, USA) was subsequently performed as per CLSI guidelines [17]. Minimum inhibitory concentrations (MICs) to three drugs including tigecycline, minocycline and colistin were assessed using VITEK II system. The MICs of imipenem and meropenem were determined with the agar dilution method using MH agar following CLSI guidelines [17]. Additionally, modified Hodge tests and double-disk synergy tests were performed to screen carbapenemase and metallo-β-lactamase activity, respectively.
We collected clinical data, including patients' age and sex, length of ICU stay, length of hospital stay, pre-existing chronic comorbidities (diabetes, chronic heart failure, chronic liver disease, chronic renal disease, and chronic pulmonary disease), sequential organ failure assessment score, episodes of hospital admission, previous invasive procedures (central line insertion, intubation, continuous renal replacement therapy, and surgery under general anaesthesia), as well as lengths and types of antibiotic treatments. The origin of bacteraemia was determined upon identification of precedent CRAB isolation or evidence of infection before the event of bacteraemia.

Whole genome sequencing (WGS), annotation, prediction of antimicrobial resistance determinants, and identification of virulence genes
The isolates were cultured in Luria Bertani (LB) broth, and genomic DNA was extracted with a PureHelix™ Genomic DNA Prep Kit (cat. no. GCTN100, Nanohelix, Daejeon, Republic of Korea). Paired end sequencing of 98 A. baumannii clinical isolates was carried out using the Illumina MiSeq platform at ChunLab (Seoul, Republic of Korea). Specifically, TruSeq Nano DNA Library Preparation Kit (550 bp library size) and MiSeq Reagent Kit v3 (600 cycles) were used. Using CLC Genomics Workbench v11.0.1 (https://www.qiagenbioinformatics.com/), reads were trimmed (quality limit 0.01, ambiguous limit 1, min length 120 bp, discarding orphan reads) and were de novo assembled. Word and bubble sizes were 64 and 100, respectively. The assembled genome sequences were annotated using NCBI's Prokaryotic Genome Annotation Pipeline (PGAP) v4.5 [18]. GenBank files produced from PGAP were converted to GFF format, and then subjected to Roary [19] to obtain core genome sequences and their codon-aware alignments, followed by a heuristic method-based automated trimming using tri-mAl [20]. A phylogenetic tree was constructed using FastTree2 [21] and visualised using iTOL server [22].
To identify antimicrobial resistance genes, the contig sequences were queried against the ResFinder database (https://cge.cbs.dtu.dk/services/data.php; downloaded on June 4, 2018) using Find Resistance v1.01 from the Microbial Genomics Module of CLC Genomics Workbench.
A comprehensive list of virulence factors from A. baumannii was compiled as per the previous studies [23][24][25][26]. The list of 182 virulence genes is shown in S1 Table. Distribution of virulence genes across all genomes was obtained using the large-scale blast score ratio (LS-BSR) [27]. A cut-off value of blast score ratio 0.7 was arbitrarily selected to evaluate the existence of each virulence gene.
For comprehensive genomic epidemiological analysis, we used BacWGSTdb (http://bacdb. org/BacWGSTdb) [28] which allows us to find closest isolates that are currently deposited in NCBI GenBank database. Scheme core genome multi-locus sequence typing (cgMLST) was used with a 50 threshold for single nucleotide polymorphism (SNP) and a 50 threshold for MSLT.

Multi-locus sequence typing (MLST) and detection of single nucleotide polymorphism (SNP)
Allelic profiles or STs of each isolate were determined by submitting contig sequences to the PubMLST site (https://pubmlst.org/abaumannii/) according to the Institute Pasteur scheme (MLST-IP) and Oxford Database (MLST-OD).
SNPs were identified by mapping raw sequencing reads to the complete chromosome sequence of A. baumannii strain KBN10P02143 (GenBank CP013924.1), the multidrug strain that completely sequenced for the first time in South Korea, using Snippy (https://github.com/ tseemann/snippy).

Analysis of plasmid sequences
Plasmid sequences were constructed either by plasmidSPAdes [29] assembler that uses the read coverage of contigs, or by a custom three-step method consisting of (i) running Plasmid Profiler [30] that first identifies putative plasmid hits from the plasmid database, (ii) mapping of the entire reads to chosen plasmid sequences using SMALT (https://www.sanger.ac.uk/ science/tools/smalt-0) and extracting paired reads where at least one end mapped using SAM-TOOLS (http://www.htslib.org/), and finally, (iii) assembly of the extracted reads using UniCycler [31]. Assembly statistics and whole-genome assembly results are shown in S2 Table. Complete (also circularised) plasmid sequences were identified from UniCycler assemblies by visualization of assembly graphs using Bandage [32]). Putative protein coding genes were predicted using GeneMarkS [33], and rep genes were identified by hmmsearch (http://hmmer. org/) against Pfam database v31.0 (https://pfam.xfam.org/).

Identification of insertion sequences and transposons
The bla OXA-23 is a major determinant of nosocomial outbreaks of IC II CRAB. The expression of this gene is presumably regulated by several insertion sequences, including ISAba1, which is thought to play the most crucial role in gene expression [34]. ISMapper [35] was used to identify ISAba1 insertion sites on the reference genome of the strain KBN10P02143. Insertion sites of ISAba1 relative to the chromosome of KBN10P02143 are shown in S3 Table. The presence of bla OXA-23 -carrying transposons, including Tn2006, Tn2007, Tn2008, and Tn2009, was screened using Primersearch from the EMBOSS package (http://emboss.sourceforge.net/), with primer sets suggested by Chen et al. [36].

Statistical analysis
Baseline characteristics were compared using Mann-Whitney U test or independent samples t-test for continuous variables, and by χ 2 test or Fisher's exact test for categorical variables. Continuous variables were expressed as means or medians (interquartile ranges), while categorical variables were expressed as numbers with percentages for the description of baseline characteristics. Statistical analyses were performed using SPSS version 21.0 (IBM Corp., Armonk, NY, USA).

Data availability
Genome sequences were submitted to GenBank under BioProject no. PRJNA448358 (https:// www.ncbi.nlm.nih.gov/bioproject/PRJNA448358). The complete list of strain names and accession numbers are shown in S2

Overview of A. baumannii isolates
A total of 109 CRAB blood isolates were used in this study. Of these, three samples with possible contamination and eight samples with insufficient clinical data were excluded. The remaining 98 samples were consequently included in the analysis.  Table).

Core genome-based phylogeny and SNP-based inference of recombination
A core genome alignment-based phylogenetic tree was constructed to delineate the in-depth relationships between analysed strains (Fig 2). The analysis resulted in the classification of 98 analysed strains into four major clades. SNP identification relied on read mapping on the reference genome sequence of strain KBN10P02143. Pairwise SNP distances among all strains ranged from 39 to 83,304 bp (5,368 bp median) (S5 Table). Relationships between SNP-based phylogeny and STs were searched. Clade 1 was mostly composed of ST191. The majority of strains classified as clade 2 (ST368, ST357, and ST858) were isolated in the former part of our study period, while those of clade 3 (ST208/ST1806 and ST369) were isolated in the latter part. However, chronological correlation among clades was not identified overall.

Differences in patient characteristics among STs
Patients were hospitalised in different types of ICUs, and the locations of each isolated ST are shown in Table 1. Most strains were identified in MICU, and no notable relationship between isolated locations and STs was reported. The mean age of the enrolled patients was 60.62 ± 14.63 years, and male patients predominated (70.4%, n = 69). Chronic obstructive pulmonary disease was the most commonly found underlying morbidity (93.9%, n = 92), and chemotherapy was the most common cause of immunosuppression (39.8%, n = 39). Ventilator care was performed in 81.6% of patients (n = 80). The most common source of bacteraemia was the respiratory tract (79.6%, n = 78). The median number of hospitalisation days was 30 [19-64.2], and patients used antibiotics for a median treatment length of 30.5 [16-59.3] Table 2).

Differences in antimicrobial resistance determinants according to STs and year of isolation
Among 29 AMR genes investigated, an average of 16.64 genes were possessed by each strain. The gene bla OXA-23 was identified in all strains except for ABAY11010 (ST369) and ABAY12016 (ST208/ST1806). Among class D carbapenem-hydrolysing class D β-lactamase genes, bla OXA-82 and bla OXA-66 were found in ABAY11010 and ABAY12016, respectively.

Differences in virulence genes according to STs and year of isolation
From the literature survey, we compiled a list of 182 potential virulence genes [37]. Of all genes identified, 107 genes were possessed by all of the isolated strains. The most differences among STs were found in the ACICU_00075-ACICU_00087 (immune evasion), bap, csuA, casuA/B, csuB, csuC, csuD, csuE, pgaD (biofilm formation), and vgrG1, vgrG2, and vgrG4 (Type IV protein secretion system) genes (S6 Table). We separated ST191 and searched for epidemiologic differences to find that only carO, related to porin, was more frequently detected in the samples originating from the respiratory tract. No statistically significant chronological change was observed ( Table 5). The ST that possessed the most abundant virulence genes was ST208/ ST1809, which had more ACICU_00075-ACICU_00087, pgaD, bla PER-1 , and hcp genes compared to the others. The second most common type was ST451/ST1809, which possessed more ACICU_00075-ACICU_00080, ACICU_00086, ACICU_00087, pgaD, pclC1, and hcp. ST552 had the fewest virulence genes and was devoid of bla OXA-24 , ABUW_1966, ACICU_00074-A-CICU_00087, fhaC, pgaD, bla PER-1 , omp33-36, plcC1, and hcp. The ST that had the second fewest virulence genes was ST447 (S6 Table).
The number of virulence genes was compared according to the year of isolation. The number increased as research period elapsed. The strains isolated in the year 2009 carried fewer virulence genes, while those isolated in 2015 exhibited more virulence genes compared to the rest. We singled out ST191 and observed statistically insignificant results (Fig 4A and 4B).

Difference in ISAba1 and carbapenemase-encoding transposons on CRAB
ISAba1 belongs to the IS4 family and has been detected upstream of the ampC, bla OXA-23 , bla OXA-27 , and sul2 antibiotic resistance genes in Acinetobacter species [34]. Insertion sites of ISAba1 were different for each strain, and no common insertion sites were conserved aac (6'    throughout all strains. Insertion sites of ISAba1 that were shared by more than 70% of strains were designated as IS � 1, IS � 2, IS � 3, IS � 4, IS � 5, IS � 6, and IS � 7. The compositions of IS � 1 through IS � 7 differed between STs (S7 Table). ST447 contained fewer insertion sequences compared to the rest, while ST191 carried comparatively more insertion sequences; these incidences remained constant throughout the study period for each insertion sequence (S1 Fig). We could detect common bla OXA-23 -ΔATPase modules from all isolates with the exception of ABAY11010 (ST369) and ABAY12016 (ST208/ST1806), which implied that Tn2006, Tn2008, or Tn2009 was widespread among these strains. However, although antimicrobial resistance (AMR) genes were predicted using three sets of contig sequences, which include assemblies obtained by CLC Genomics Workbench (whole-genome assembly), by plasmid SPAdes, and by plasmid profiler-based process, these cannot determine the exact location of AMR genes. ABAY11010 and ABAY12016 originated in the respiratory tract, and had larger number of virulence and AMR genes than average (virulence genes, 160 and 161 copies each; AMR genes, 16 and 23 copies each). Both strains were isolated at different places (MICU D and MICU C each) and times (2011 and 2012 each).

Resistance islands
AbaR1 is a commonly reported sequence from strains isolated in South Korea [38,39]. We searched contig sequences using the 87.7-kb AbaR1 sequence from the strain AYE (CU459141.1) [40] as a query, and found 70 strains to carry a very short AbaR-like sequence. These were only 10,641 bp long after including the disrupted ATPase gene (comM') at both ends. AbaR-like islands were similar to the 14.5-kb long Tn6021 that is devoid of the multiple antibiotic resistance region found in ATCC 17978 (GenBank CP00521.1). When read mapping on the complete AbaR-like sequence extracted from ABAY09001 (GenBank QHGS01000008.1; 56,153-66,793 bp) was applied, all strains except ABAY13016, ABAY14002, ABAY14004, and ABAY14008 (all categorised as ST447) possessed AbaR-like islands. Also, 75 ABAY strains harbor class I integron carrying cat2, aadA1, sul1 and armA. It has been reported that two consecutive copies of integrons are located within a putative resistance island, but read mapping-based analysis (this study) could not reveal how integrons are situated in ABAY chromosomes. The gene organization is similar with class I integron identified from NCGM 237 [41]. Changes in genetic composition of A. baumannii

Reconstruction of plasmid sequence
Assembly-based methods produced putative plasmidic contigs from all 98 isolates: average total lengths of plasmidic contigs predicted by plasmiSPAdes and plasmid profiler-based unicycler were 129,195.1 bp and 141,235 bp, respectively. Standard deviations of their total lengths were 136,737.5 bp and 71,471.1 bp, respectively. The latter method, which relies on graphbased assembly followed by circularization, identified 115 complete plasmid sequences along with linear and complex type assemblies. We clustered plasmid sequences into groups based on sequence similarity and length, which resulted in five groups ranging from 110,967 bp    Table). To estimate the presence of all five groups of plasmids across strains, sequencing reads were re-mapped on the representative plasmid sequences (S9 Table). Unexpectedly, we could not find any antimicrobial resistance genes from the five representative plasmid sequences. However, unresolved plasmids in the plasmid profiler-based unicycler assemblies might contain not-yet-identified ones that harbor multiple AMR determinants. We identified ISAba1, which is known to promote expression of downstream genes, from four complete plasmids in group B. However, no bla OXA-23 was found in their vicinity or elsewhere in the complete plasmid sequences.

Discussion
Since the 1970s, there has been a progressive increase in the antimicrobial resistance for the majority of A. baumannii strains, which were otherwise sensitive to the commonly used antibiotics [42]. By 2007, up to 70% of isolates in certain settings had developed multidrug resistance including resistance to carbapenems, which were once considered as the mainstay against multidrug-resistant A. baumannii infections [43]. In our study, ST191 was the dominant ST during the 7 years of our study and showed stable genomic variations. However, when other STs were combined, a tendency toward increasing virulence genes was observed without additional changes in antimicrobial resistance of CRAB in restricted hospital environments.
ST191 is a predominant strain isolated in South Korea [44] known for expressing bla OXA- 23 , which is responsible for the high rate of carbapenem resistance. The trait that allows this strain to emerge as a highly successful nosocomial pathogen through frequent modification of genetic components was also manifested in our study. ST191 had higher insertion sequences compared to the other types, indicating the frequent recombination and gene rearrangement events. Also, only 60% of strains involved in our study had closed related strains identified in previous studies. However, no cumulative change in AMR genes and virulence genes was observed over time. The only gene that showed increased prevalence was strA. This gene is related to aminoglycoside resistance and is frequently found in AbaR-like islands along with other genes such as tetA(B), tetR(B), CR2, strB, and orf4b [45]. We observed no simultaneous increase in other AMR genes in this study. It is worth noting that heterogenous groups of ST were isolated simultaneously. ST208/ ST1806 was the sequence type with the most abundant AMR genes; bla PER-1 was found only in this sequence type. Although it is a relatively uncommon gene for A. baumannii, bla PER-1 is frequently reported as being implicated in various virulence mechanisms [46,47]. In our study, this strain was isolated mostly in 2012, and additional spread of this gene was not noted. ST447, which was categorised into international clone 1 (IC I), was isolated in the latter part of our study period. ST447 possessed fewer AMR genes. Since this strain was more frequently isolated from patients with shorter hospital stays, it was introduced from the outside rather than being an indigenous strain. This type was equipped with β-lactam resistance genes without any other resistance genes. As with other strains, ST447 also had bla OXA-23 -ΔATPase modules, indicating that it carries Tn2006, Tn2008, or Tn2009. While Tn2006 was mostly transferred by mobile elements, the spread of Tn2008 and Tn2009 was entirely dependent on the clonal dissemination of the bacterial host [44]. In light of ST447 containing all of these transposons, this type might have been under selective antibiotic pressure for a long period of time. It is possible that the previously susceptible strain may have gained genes only for carbapenem resistance during the study period, but the widespread use of carbapenems may have contributed to the emergence of this type. The density of virulence genes was still low, and this type of strain had relatively lower insertion sequences, which are commonly found in other STs. This defies the concern posed by a previous study in South Korea which reported the emergence of a new strain that would provoke the dissemination of more antibacterial resistance [48].
Despite sporadic studies suggesting correlation between the number of virulence genes with organisms' phenotypic differences [16], the significance of number of virulence genes in association with clinical outcomes and epidemiology has not been investigated. Therefore, the increased number of virulence genes over time shown in our study demands immediate attention. Although previous studies suggested certain associations between antimicrobial resistance and virulence [49], our study demonstrated that linear correlation between the two did not exist. It is worth noting that more than half of the highly significant virulence genes were possessed by all of these carbapenem-resistant strains. A slight increase in the number of virulence genes was reported in 2015, probably due to the emergence of ST451/ST1809. ST451/ ST1809 possessed higher number of virulence genes than average (165 [159-166] vs. 158[138-173], p<0.001). A gene specifically related to the functioning of porin (carO) [50] had decreased, which may be attributed to enhanced antimicrobial resistance. However, genes that were previously considered important in forming biofilms and iron utilization (bauA and pgaC) [49,51] decreased in number. The increased genes were ACICU_00075-ACICU_00080, ACICU_00086, and ACICU_00087, which are presumably associated with immune evasion [52]. Although further evaluation is warranted regarding the authentic effects of these putative virulence genes, the gradual accumulation of virulence genes that are transmitted by horizontal transfer emphasises the need for more intensive infection control strategies to prevent reinfections of CRAB.
We failed to observe any significant differences in patient outcomes according to STs. As more than 80% of patients died within 28 days of the first isolation of the organism, it is possible that the patients were in serious condition and hence died before the analysis reached a conclusive stage. However, previous studies also suggested that phenotypical characteristics are not in accordance with genotypical results. It is possible that individual virulence factors may not be important for A. baumannii virulence in human hosts [53], and the same virulence factor may play different roles in different habitats [54]. The expression of virulence-associated genes could be under different regulation in pathogenic and non-pathogenic species [55].
We found no discrepancies between phenotypic and genotypic results in terms of carbapenem resistance. This observation confirmed the results of a previous study, in which WGS accurately identified resistance to β-lactam antibiotics for gram-negative bacterial pathogens [56]. Whether this observation holds true for other types of antimicrobial agents warrants further studies.
The current study had a few limitations. First, only patients who were admitted to ICUs were involved, limiting further epidemiological investigation. Second, the relationships between strains, such as evolutionary linkages, were not determined. Third, detailed comparative genomics to verify mechanisms of acquisition of virulence or AMR genes acquisition were not analysed.
In conclusion, this study confirms the absence of accumulation of antimicrobial resistance determinants while the number of virulence genes increases in CRAB. Genetic transfer between subtypes cannot be ruled out; hence, it is important to be cautious about reinfections.
Supporting information S1  We would also like to thank e-World Editing (https://eworldediting.com/index.php) for providing English language editing service.