Genotypic Diversity of Staphylococcus aureus α-Hemolysin Gene (hla) and Its Association with Clonal Background: Implications for Vaccine Development

The α-hemolysin, encoded by the hla gene, is a major virulence factor in S. aureus infections. Changes in key amino acid residues of α-hemolysin can result in reduction, or even loss, of toxicity. The aim of this study was to investigate the diversity of the hla gene sequence and the relationship of hla variants to the clonal background of S. aureus isolates. A total of 47 clinical isolates from China were used in this study, supplemented with in silico analysis of 318 well-characterized whole genome sequences from globally distributed isolates. A total of 28 hla genotypes were found, including three unique to isolates from China, 20 found only in the global genomes and five found in both. The hla genotype generally correlated with the clonal background, particularly the multilocus sequence type, but was not related to geographic origin, host source or methicillin-resistance phenotype. In addition, the hla gene showed greater diversity than the seven loci utilized in the MLST scheme for S. aureus. Our investigation has provided genetic data which may be useful for future studies of toxicity, immunogenicity and vaccine development.


Introduction
Staphylococcus aureus is one of the leading human pathogens worldwide. It causes a broad range of diseases, from superficial infections to life-threatening invasive diseases. Antimicrobial therapy is sometimes ineffective, owing to the development of antimicrobial-resistant strains, such as methicillin-resistant S. aureus (MRSA) [1]. S. aureus expresses various virulence factors, including a broad range of exotoxins. Some of these toxins, such as most of the phenol-soluble modulins and α-hemolysin (also known as α-toxin), are encoded in the core genome, while others, such as Panton-Valentine leucocidin, are encoded by acquired mobile genetic elements [2,3].
The α-hemolysin, which belongs to a class of small β-barrel pore-forming cytotoxins, is a major virulence factor in S. aureus infections [2,4]. It is encoded by the 960-bp hla gene and is initially produced as a 319-residue precursor, which is then processed to a 293-residue (approximately 33 kDa) mature toxin [4,5]. Previous studies have shown that changes in key amino acid residues of α-hemolysin, such as a substitution of the histidine at amino acid 35, can result in reduction or even loss of virulence [6,7].
In this study, we determined the genotypes of hla in 47 S. aureus isolates collected in China, and compared these with the clonal background of the isolates. For comparison, hla genotype and clonal background were also determined by in silico analysis of well-characterized, published whole genome sequences of 318 global strains.

Methods
Ethics
The study was approved by the Human Research Ethics Committee of Peking Union Medical College Hospital (No. S-263). Written consent was obtained from patients, and the study was carried out in accordance with approved guidelines.

Molecular identification of isolates and mecA gene detection
DNA extraction of S. aureus isolates was performed as previously described [8]. A multiplex PCR was used for simultaneous amplification of 16S rRNA, femA and mecA genes for identification and differentiation of methicillin-susceptible S. aureus (MSSA) and MRSA isolates [9]. and hlaR1 (5'-CATAATTAATACCCTTTTTCTC-3') on the DNA analyzer ABI 3730XL system (Applied Biosystems, Foster City, CA).
The obtained 960-bp hla gene sequences were compared to a wild-type reference sequence from S. aureus strain WOOD 46 (GenBank accession no. X01645) [10] and aligned using CLC sequence viewer (version 7, QIAGEN Aarhus, Denmark) to detect single nucleotide polymorphisms (SNPs) and to designate genotypes (the hla DNA sequence was identical for all isolates belonging to a given genotype). Further, the α-hemolysin peptide sequences of the isolates were deduced and aligned to determine the presence of amino acid substitutions.
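The comparison and genotype designation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CLC workflow used in the study: the sequences are assumed to be pre-aligned to the reference and of equal length, the function names `find_snps` and `assign_genotypes` are hypothetical, and the genotype numbers produced are arbitrary labels that do not correspond to genotypes 1 to 8 reported here.

```python
def find_snps(reference, query):
    """Return (1-based position, reference base, query base) for each mismatch.

    Assumes both sequences are already aligned and of equal length;
    a '-' in the query would represent a deletion.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), query.upper())
    return [(i + 1, r, q) for i, (r, q) in enumerate(pairs) if r != q]


def assign_genotypes(sequences):
    """Give isolates with identical DNA sequences the same genotype label.

    `sequences` maps isolate name -> aligned sequence. Labels are integers
    assigned in order of first appearance (arbitrary, illustrative only).
    """
    label_by_sequence = {}
    genotype_by_isolate = {}
    for isolate, seq in sequences.items():
        key = seq.upper()
        if key not in label_by_sequence:
            label_by_sequence[key] = len(label_by_sequence) + 1
        genotype_by_isolate[isolate] = label_by_sequence[key]
    return genotype_by_isolate
```

Defining a genotype as an exact sequence match mirrors the statement that the hla DNA sequence was identical for all isolates of a given genotype.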

Assignment of clonal background
All isolates were analyzed by multilocus sequence typing (MLST) and spa typing using previously established methods [11,12]. Assignment of related sequence types (STs) into clonal complexes (CCs) was conducted using eBURST [13]. In addition, all MRSA isolates were characterized by staphylococcal cassette chromosome mec (SCCmec) typing as described by Chen et al. [14]. S. aureus clones were named in the format of ST-spa type or ST-SCCmec type-spa type (e.g. ST5-t002 for a MSSA clone, or ST239-III-t030 for a MRSA clone).
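As a small illustration of the naming format above, the following Python sketch builds a clone name from its typing results. The function name `clone_name` and its parameters are hypothetical and introduced only for this example.

```python
def clone_name(st, spa_type, sccmec_type=None):
    """Build a clone name as ST-spa type (MSSA) or ST-SCCmec type-spa type (MRSA).

    `sccmec_type` is None for MSSA isolates, which carry no SCCmec element.
    """
    parts = ["ST{}".format(st)]
    if sccmec_type:
        parts.append(sccmec_type)
    parts.append(spa_type)
    return "-".join(parts)


mssa = clone_name(5, "t002")           # "ST5-t002"
mrsa = clone_name(239, "t030", "III")  # "ST239-III-t030"
```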

In silico analysis of published whole genome sequences
Because only a limited number of S. aureus isolates were included in the present study, 318 well-characterized published genomes from other geographic regions were also analyzed, to better define hla gene diversity and its association with clonal background across a wider range of genetic backgrounds. These comprised i) all S. aureus isolates with complete assembled whole genome sequences available in the NCBI Genome database as of July 8, 2015 (70 isolates, S1 Table), and ii) S. aureus genome sequences from four previous publications (248 isolates, S2 Table) [15-18]. These genomes included examples from the major worldwide lineages of MRSA, e.g. CC8 (including USA300), CC1 (including USA400), CC22 (including EMRSA-15), CC93 (including Queensland CA-MRSA), CC30 (including EMRSA-16) and CC121 isolates.
The MLST STs and hla genotypes of these isolates were determined in silico by mapping the paired-end reads of the 318 isolates to the seven MLST locus sequences of S. aureus ST1 and to the wild-type hla reference sequence (GenBank accession no. X01645), respectively, using the Burrows-Wheeler Aligner [19]. Base coverage at each position of the genes was assessed using SAMtools mpileup (http://samtools.sourceforge.net/mpileup.shtml). The numbers of reads supporting the reference base (ref) and the alternative base (alt) were examined at each position in each strain. High-quality SNPs were defined as those satisfying the criterion alt/(alt+ref) > 0.8. The ST and hla genotype of each isolate were identified from these SNPs.
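The SNP quality criterion can be expressed as a short filter. The sketch below is a minimal illustration, assuming that per-position read support has already been extracted from the pileup into tuples; the tuple layout, the function name `call_high_quality_snps` and the example counts are hypothetical and are not taken from the study's pipeline.

```python
def call_high_quality_snps(pileup_counts, min_alt_fraction=0.8):
    """Keep positions where alt / (alt + ref) exceeds the threshold.

    `pileup_counts` is an iterable of tuples:
    (position, ref_base, alt_base, ref_read_count, alt_read_count).
    Positions with no supporting reads are skipped.
    """
    calls = []
    for position, ref_base, alt_base, ref_n, alt_n in pileup_counts:
        total = ref_n + alt_n
        if total == 0:
            continue  # no coverage, nothing to call
        if alt_n / total > min_alt_fraction:
            calls.append((position, ref_base, alt_base))
    return calls


# Hypothetical read counts for three positions.
example = [
    (259, "C", "T", 1, 49),   # 0.98 alt fraction, retained
    (100, "A", "G", 10, 10),  # 0.50 alt fraction, rejected
    (300, "G", "A", 0, 0),    # no coverage, skipped
]
high_quality = call_high_quality_snps(example)
```

The strict inequality follows the criterion as stated in the text, so a position with an alt fraction of exactly 0.8 would not be called.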
MLST STs and hla genotypes were entered into BioNumerics software v7.5 (Applied Maths, Austin, TX) for minimum-spanning-tree analysis. The diversity of the sequences of the hla gene and seven genes utilized in the MLST scheme for S. aureus was analyzed by DnaSP (version 5.1, University of Barcelona, Spain).
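For readers unfamiliar with the diversity statistics, the sketch below computes haplotype diversity and nucleotide diversity using the standard definitions of Nei: Hd = n/(n-1) × (1 - Σ p_i²), and π as the mean number of pairwise differences per site. It assumes gap-free aligned sequences of equal length and toy input; DnaSP treats gaps and missing data in its own way, so this is not a reimplementation of that software.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """Mean number of pairwise nucleotide differences per site (pi)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_differences / (len(pairs) * length)


# Toy alignment of four sequences (three distinct haplotypes).
toy = ["AAAA", "AAAA", "AAAT", "AATT"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```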

Results
Molecular identification and detection of the mecA gene
All 47 isolates were confirmed as S. aureus by molecular methods. Thirty-three isolates (70.2%) were determined to be MRSA by detection of the mecA gene.

Genotypes of the 960-bp hla gene
Amongst the 47 isolates from China, a total of eight hla genotypes (genotype 1 to 8) and six peptide sequence types were identified (Tables 1 and 2). Of the eight hla genotypes, genotype 1 was predominant (n = 27, 57% of 47 isolates), followed by genotype 3 (n = 7, 15%) and genotype 5 (n = 3, 6%). The remaining four genotypes were rare, with one or two isolates belonging to each (Table 1).
In comparison, the distribution of MSSA clones was more diverse, with no clone comprising more than three isolates. ST5-t002 was represented by three MRSA and three MSSA isolates in this study, and ST59 was represented by one MRSA and one MSSA isolate with differing spa types. No other ST or CC was common to both MRSA and MSSA isolates (Table 2).

Relationship between hla genotypes and clonal background
Both the hla genotypes and the α-hemolysin peptide sequence types correlated strongly with the clonal background of isolates from China. Of the eight hla genotypes, six were restricted to either MRSA (genotypes 1 and 2) or MSSA (genotypes 5, 6, 7 and 8) strains. The predominant MRSA clone, ST239-MRSA-III-t030, was represented by hla genotype 1 (13/15 isolates, 86.7%) or genotype 2 (2/15 isolates, 13.3%). All of the remaining ST239-MRSA-III isolates possessed hla genotype 1. Both genotype 1 and genotype 2 hla genes encoded α-hemolysin of peptide sequence type 1 (Table 2). hla genotypes 5 and 6 were found in one and four MSSA clones, respectively. Of the two ST188-MSSA-t189 isolates, one possessed hla genotype 7 and the other genotype 8, which differed by one deletion mutation at nucleotide position 479 (Table 1).

The in silico analysis of 318 well-characterized genomes
Amongst the 318 well-characterized S. aureus genomes, 25 hla genotypes were identified, including 20 genotypes not found in the 47 isolates from China (S1 and S2 Tables), for a total of 28 hla genotypes identified in this study. The SNPs identified for all clinical isolates and published genomes are summarized in S3 Table. Substantial diversity was found in the S. aureus hla gene. Compared to the seven loci utilized in MLST, the hla gene had higher nucleotide diversity (0.0256 vs. 0.0042-0.0119), more haplotypes (28 vs. 10-19), greater haplotype diversity (0.899 vs. 0.517-0.804) and a higher non-synonymous polymorphisms/synonymous sites ratio (3.598 vs. 3.059-3.569) (Table 3). Of note, 79 of 107 ST22 S. aureus isolates and one ST188 isolate (hla genotype 8) had one to 16 deletion mutations in their hla sequences (S3 Table). In addition, all 20 ST36 isolates had a C→T SNP at sequence position 259, which resulted in a premature stop codon (S3 Table). These mutations would presumably inhibit production of the toxin protein.
When the 47 isolates from China and the 318 globally distributed genomes were combined for composite analysis, it was again noted that hla genotype was closely related to the clonal background of the isolate, in particular the ST, with little association with geographic origin, host source or methicillin-resistance phenotype (Table 2, S1 and S2 Tables and Fig 1). The minimum-spanning-tree analysis of MLST data showed that for 30 of the 33 STs identified in the present study, isolates belonging to the same ST shared a single hla genotype, and ST22 was the only sequence type that comprised more than three (seven in all) hla genotypes (Fig 1). In addition, STs belonging to the same CC frequently shared the same or a closely related hla genotype, e.g. ST5, ST105, ST225 and ST228 of CC5 comprised 24 isolates belonging to hla genotype 3, ST8 and ST250 of CC8 comprised 18 isolates belonging to hla genotype 7, and ST95, ST121 and ST123 of CC121 comprised seven isolates belonging to hla genotype 26 (Fig 1).
Although the same hla genotype could be shared by unrelated STs, this was uncommon. For instance, one isolate each of ST72 and ST25 carried hla genotype 3, which was mostly associated with CC5 S. aureus isolates. Likewise, one ST188 isolate was identified as hla genotype 25, which was mostly associated with non-ST239 CC8 isolates.
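To make the effect of a nonsense SNP concrete, the following Python sketch checks a coding sequence for an in-frame stop codon before its final codon. It uses a short toy sequence rather than the real hla gene, the helper names are hypothetical, and it assumes nucleotide numbering starts at the first base of the start codon, under which position 259 would fall in codon 87.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_of_position(nucleotide_position):
    """Return the 1-based codon index containing a 1-based nucleotide position."""
    return (nucleotide_position - 1) // 3 + 1


def first_stop_codon(cds):
    """Return the 1-based index of the first in-frame stop codon, or None."""
    cds = cds.upper()
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for start in range(0, usable_length, 3):
        if cds[start:start + 3] in STOP_CODONS:
            return start // 3 + 1
    return None


def has_premature_stop(cds):
    """True if an in-frame stop codon occurs before the final codon."""
    stop = first_stop_codon(cds)
    return stop is not None and stop < len(cds) // 3


# Toy example only: a C->T change at position 4 turns CAA (Gln) into TAA (stop).
wild_type = "ATGCAAGCTTAA"
mutant = "ATGTAAGCTTAA"
```

In this toy case `has_premature_stop(wild_type)` is false because the only stop is the terminal codon, whereas the mutant stops at codon 2.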
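The concordance between ST and hla genotype described in this section amounts to counting distinct genotypes per ST. A minimal Python sketch follows; the function names are hypothetical and the records are a small illustrative subset, not the study's dataset.

```python
from collections import defaultdict


def genotypes_per_st(records):
    """Map each ST to the set of hla genotypes seen among its isolates.

    `records` is an iterable of (sequence_type, hla_genotype) pairs,
    one pair per isolate.
    """
    by_st = defaultdict(set)
    for st, genotype in records:
        by_st[st].add(genotype)
    return dict(by_st)


def single_genotype_sts(records):
    """Return the STs whose isolates all share one hla genotype."""
    grouped = genotypes_per_st(records)
    return sorted(st for st, genotypes in grouped.items() if len(genotypes) == 1)


# Illustrative records only.
records = [
    ("ST239", 1), ("ST239", 2),
    ("ST5", 3), ("ST5", 3),
    ("ST59", 4),
]
concordant = single_genotype_sts(records)
```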

Discussion
Infections due to antimicrobial-resistant pathogens are a growing problem all over the world. In developing countries like India and China, antimicrobial resistance is particularly prevalent, owing to previous unregulated overuse of antimicrobials [20,21]. S. aureus is one of the commonest Gram-positive bacterial pathogens, and in many places, the majority of S. aureus infections are now caused by multidrug-resistant strains, including MRSA and vancomycin-resistant S. aureus (VRSA). Immunotherapies are now being investigated as an alternative therapeutic option for staphylococcal infections, in the hope that these may avoid the selection pressure associated with the use of antimicrobials [3,22].
The S. aureus α-hemolysin was the first described of a family of bacterial pore-forming β-barrel toxins, which play an important role in the pathogenesis of staphylococcal disease [4,23]. As such, it was chosen as a potential target for the development of vaccines to combat S. aureus infections, and positive results have been obtained in some preclinical trials targeting pneumonia and skin and soft tissue infections [23-26]. It has been noted that substitutions in amino acid residues may reduce the activity of α-hemolysin. For instance, an α-hemolysin mutant with an H35L substitution was found to have no hemolytic or lethal activity, despite retaining the ability to bind to target cells [6,7]. The EMRSA-16 CC30 S. aureus isolates are another example. As observed in the present study, and as reported elsewhere, CC30-ST36 isolates had a C→T SNP at nucleotide sequence position 259, which resulted in a premature stop codon [27,28]. It has been shown that CC30 isolates possessing this SNP had significantly reduced toxin production and decreased lethality in a mouse model [27]. These α-hemolysin mutants could be considered as candidate immunogens in prototypic S. aureus vaccines [23-26,29].
Despite this work, little has been described regarding the genetic polymorphism of the hla gene in S. aureus. This is an important consideration, since variation in α-hemolysin peptide sequences could potentially lead to failure in antigen-antibody binding and thus compromise vaccine efficacy. In this study, we have illustrated the diversity of the hla gene in S. aureus and the relationship of hla sequence with clonal background, using 47 S. aureus clinical isolates from China supplemented with 318 well-characterized and globally distributed isolates with published whole genome sequences.
All ST239 isolates from China were MRSA, and carried either genotype 1 or genotype 2 hla. These two genotypes differed by just one synonymous nucleotide mutation, and thus both encoded peptide sequence type 1. Amongst the 70 global S. aureus isolates with published genomes, seven isolates were ST239-MRSA-III, all of which also carried genotype 1 hla, regardless of the isolates' geographic origins. The ST239-MRSA-III clone has been reported to be largely hospital-acquired and widely disseminated in Brazil, Australia, New Zealand and many Asian countries in the past decade, although the prevalence of different spa types within this clone (e.g. spa types t030 and t037) varies between regions [8,30,31]. In a previous genome-based phylogeographic analysis, it was determined that human movement played an important role in the global dissemination of ST239-MRSA-III [32]. Therefore, the consistent hla genotype of this clone across different regions is not surprising.
Interestingly, all of the 14 ST5 S. aureus isolates (six clinical isolates from China and eight global strains), including nine ST5-MRSA-II and five ST5-MSSA strains, possessed genotype 3 hla. Genotype 3 hla was also found in other CC5 S. aureus clones, including ST228-MRSA-I-t041 (n = 8), ST105-MRSA-II-t002 (n = 1) and ST225-MRSA-II-t003 (n = 1). Likewise, all six ST59 isolates analyzed in this study, despite diverse methicillin-resistance phenotypes, SCCmec types and spa types, carried genotype 4 hla. The ST59 lineage is primarily a community-acquired MRSA clone predominant in China and several other Asian countries [33]. These results again indicate that the hla genotype correlated closely with the ST.
Meanwhile, ST22 isolates showed markedly higher hla genotype diversity (comprising seven hla genotypes in all) than other S. aureus clones. Only occasional discrepancies between hla genotype and ST were observed. Future vaccine development will need to account for the influence of this diversity on vaccine efficacy.

Conclusion
We have found substantial diversity in the S. aureus hla gene and its deduced amino acid sequences. Strong correlations between hla genotypes and clonal background were found in S. aureus, regardless of the isolates' geographic origins and methicillin-resistance phenotype. Although the relative virulence of different hla genotypes remains undetermined, our investigation has provided some preliminary epidemiologic data which will be essential for future vaccine development.

Supporting Information
S1 Table. In silico analysis of hla genotype and clonal background of 70 complete assembled whole genome sequences of S. aureus. (DOCX)
S2 Table. In silico analysis of hla genotype and clonal background of 248 S. aureus genomes from four previous publications.