Figures
Abstract
Bacillus cereus is a food contaminant with widely varying enterotoxic potential due to its virulence proteins. In this article, phylogenetic analysis of the amino acid sequences from the whole-genomes of 41 strains, evolutionary distance calculation of the amino acid sequences of the virulence genes, and functional and structural predictions of the virulence proteins were performed to reveal the taxonomically diverse distribution of virulence factors. The genome evolution of the strains showed a clustering trend based on the protein-coding virulence genes. The strains of B. cereus have evolved into non-toxic risk and toxic risk clusters with medium-high- and medium-low-risk subclusters. The evolutionary transfer distances of incomplete virulence genes relative to housekeeping genes were greater than those of complete virulence genes, and the distance values of HblACD were higher than those of nheABC and CytK among the complete virulence genes. Cytoplasmic localization was impossible for all the virulence proteins, and NheB, NheC, Hbl-B, and Hbl-L1 were predicted to be extracellular. Nhe and Hbl proteins except CytK had similar spatial structures. The predicted structures of Nhe and Hbl mainly showed ‘head’ and ‘tail’ domains. The ‘head’ of NheA and Hbl-B, including two α-helices separated by β-tongue strands, might play a special role in the formation of Nhe trimers and Hbl trimers, respectively. The ‘cap’ of CytK, which includes two ‘latches’ with many β-sheets, formed a β-barrel structure with pores, and a ‘rim’ balanced the structure. The evolution of B. cereus strains showed a clustering tendency based on the protein-coding virulence genes, and the complete virulence-gene operon combination had higher relative genetic stability. The beta-tongue or latch associated with β-sheet folding might play an important role in the binding of virulence structures and pore-forming toxins in B. cereus.
Citation: Zhang M, Liu J, Yin Z, Zhang L (2022) Phylogenetic and protein prediction analysis reveals the taxonomically diverse distribution of virulence factors in Bacillus cereus strains. PLoS ONE 17(5): e0262974. https://doi.org/10.1371/journal.pone.0262974
Editor: Fatih Oz, Ataturk Universitesi, TURKEY
Received: January 7, 2022; Accepted: April 21, 2022; Published: May 19, 2022
Copyright: © 2022 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All files are available from the NCBI database: https://www.ncbi.nlm.nih.gov/genome/?term=Bacillus+cereus.
Funding: The authors received “Doctor foundation project of nanfang college guangzhou (2020BQ23)” and “Science and technology projects in guangzhou (2022-1980)” fundings for this work.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: ANI, average nucleotide identity; MLSA, multilocus sequence analysis; Nhe, nonhemolytic enterotoxin; Hbl, hemolysin BL; CytK, cytotoxin K; DV, distance value; CV, composition vector; GMQE, global model quality estimate; NCBI, National Center for Biotechnology Information; adk, adenylate kinase; ccpA, catabolite control protein A; glpF, glycerol uptake facilitator protein; glpT, glycerol-3-phosphate transporter; panC, pantoate-beta-alanine ligase; pta, phosphate acetyltransferase; pyc, pyruvate carboxylase
Introduction
Bacillus cereus (B. cereus), which is one of twelve closely related species in the Bacillus cereus group [1], is a Gram-positive bacterium occurring ubiquitously in nature with widely varying pathogenic potential [2]. Cells are rod-shaped, with some in chains or occasionally long filaments, and are aerobic or facultative anaerobic. Most species will grow on common media such as nutrient agar and blood agar. They are characteristically large (2–7 mm in diameter) and vary in shape from circular to irregular, with matt or granular textures, while smooth and moist colonies are also common. B. cereus, the spores of which can survive at high temperatures and germinated vegetative cells of which can multiply and produce toxins under favorable conditions, is recognized as the most frequent cause of food-borne disease [3]. Its toxins cause two distinct forms of food poisoning, the emetic type (uncommon) and the diarrheal type (common). Diarrheal strains produce three enterotoxins, which belong to the family of pore-forming toxins: nonhemolytic enterotoxin (Nhe), hemolysin BL (Hbl), and cytotoxin K (CytK). Nhe comprises three proteins, NheA, NheB, and NheC, encoded by one operon containing one of three genes, namely, nheA, nheB, and nheC, respectively. Hbl consists of a single B-component (encoded by hblA) and two L-components, L1 (hblC) and L2 (hblD), all of which are essential for activity, with no individual or pairwise activity [4]. CytK (cytK) is a single-component toxin [5]. The genes that encode NheABC can be detected in nearly all enteropathogenic B. cereus strains, hblACD can be detected in approximately 45% to 65% of such strains, and cytK is less prevalent [6,7]. B. cereus species, which were compared on the basis of 16S rRNA (identity values >98%), were closely homologous to each other [8]. Nevertheless, the suitability of this marker for the classification of B. cereus might be limited, as it is unable to effectively distinguish between the closely related species [9]. Some papers have reported that the species affiliation of B. cereus group, which could lead to an exchange of virulence plasmids between species, often does not match patterns of phylogenetic relatedness [10,11]. While the enterotoxins of B. cereus are chromosome-coded, the unique characteristics are observed for plasmids and are thus present throughout the B. cereus group [12]. Lapidus et al. reported a large plasmid with an operon encoding all three Nhe components in a B. cereus strain [13]. There is evidence that extensive gene exchange occurs between plasmids and the chromosome during the evolution of the B. cereus group [14]. Therefore, some genes encoded on plasmids can spread via horizontal gene transfer among B. cereus and the transfer of a single plasmid from one species to another [15]. Didelot et al. detected three phylogenetic groups (clades) in a study on the evolution of pathogenicity in the B. cereus group [16]. B. cereus, as a genomospecies, could be mainly found in clade two based MLSA and multiple comparative analysis of ANI values [17]. Later, seven major phylogenetic groups with ecological differences were identified in the B. cereus group [10]. A recent study suggested that nine phylogenetic clades of isolates may be better for assessing the risk of diarrheal foodborne disease caused by B. cereus group isolates [18]. These studies of virulence factors of B. cereus concern the evolutionary classification of virulence genes, and there have been few comparative analyses of the relative evolutionary distance of virulence genes and the prediction of virulence protein function and structure. In this study, the genome, virulence gene sequences and predicted virulence proteins of 41 B. cereus strains were comparatively analyzed. This work aims to examine the species diversity of B. cereus strains and the phylogenetic relationships among virulence factors, to systematically evaluate the distribution of virulence genes, and to comparatively analyze the structures and functions of virulence proteins.
Materials and methods
Characterization of B. cereus strains
Forty-one strains of B. cereus with complete- and chromosome-level assemblies in the National Center for Biotechnology Information (NCBI) database were selected for comparative analysis. Datasets that passed the completeness test (acceptable level is >85%) and contamination test (acceptable level is <5%) were composed of sequences submitted on deadline March 5, 2021. Thirty-one pathogenic and ten nonpathogenic strains that were isolated from food, patients, the environment, and unknown sources, were eligible, along with a control strain, Sporolactobacillus terrae 70–3 (S. terrae 70–3), that belonged to a different genus. In terms of evolution, S. terrae which has a defined taxonomic and phylogenetic status, is closely related to B cereus in Bacillales. The sequences and annotation information for the stains were downloaded from the NCBI (details in Table 1).
Quality assessment of genomic sequences
The contamination and completeness of the metagenomic sequences were evaluated by CheckM software version v1.1.3 [19].
Phylogenetic and average nucleotide identity (ANI) analysis
The first phylogenetic tree, based on whole-genome amino acid sequences of each strain, was constructed by using the CVTree4 webserver (http://cvtree.online/v4/prok/index.html), which constructs whole-genome-based phylogenetic trees without sequence alignment by using a composition vector (CV) approach, and the K-tuple length was 6 [20]. Every genome sequence was represented by a composition vector, which was calculated as the difference between the frequencies of k-strings and the prediction frequencies by the Markov model [21]. The shape and text content of the phylogenetic tree were modified by Molecular Evolutionary Genetics Analysis (MEGA-X version 10.2.2) [22]. In this study, the same genome sequence data were subjected to ANI analysis to verify the significance of the first phylogenetic tree. ANI analysis was performed using JSpeciesWS Online Service (http://jspecies.ribohost.com/jspeciesws/) as described by Richter et al. [23]. The distance matrix, which was calculated by the distance value (DV) using the formula DV = 1-[ANIb value], was used to construct the second phylogenetic tree, which was generated from the resulting Newick format file using Njplot [24]. The formula was balanced using the mean value method and was subjected to calculation using DrawGram in the PHYLIP package version 3.695 [25]. To determine the associations between each protein-coding gene and the different clusters, statistical enrichment analyses were conducted with PhyloGLM V2.6 [17].
Multilocus sequence analysis (MLSA)
A total of forty-one strains containing gene sequences, which were downloaded from the NCBI, were found and further analyzed for the presence of seven housekeeping and three enterotoxin genes. The housekeeping genes adenylate kinase (adk), catabolite control protein A (ccpA), glycerol uptake facilitator protein (glpF), glycerol-3-phosphate transporter (glpT), pantoate-beta-alanine ligase (panC), phosphate acetyltransferase (pta), and pyruvate carboxylase (pyc) were chosen to calculate the basic evolutionary distances of the species. These housekeeping genes, scattered across the entire chromosome, are suitable for MLSA [26]. The types of enterotoxin genes (nhe, hbl, and cytK) were divided into different groups, and the base sequences were concatenated for further MLSA. Thus, rearrangement of genes was unnecessary because the order of the genes within the operons was conserved in all strains. The distances of concatenated genes were calculated in MEGA X using the maximum likelihood (ML) algorithms, which are based on the Tamura-Nei model with a discrete gamma distribution [22]. The model applied for MLSA of DVs was the ideal substitution model according to the ‘find best DNA/Protein models’ function [27]. The housekeeping genes were of the same length in all strains, as were the different virulence genes. The same settings for the calculation of all phylogenetic DVs were used to ensure comparability of the results. We calculated the relative changes in genetic DVs between the virulence genes, which were concatenated housekeeping genes minus the simple housekeeping gene, representing the change in virulence gene transfer.
Prediction of virulence protein function and structure
SMART software (http://smart.embl-heidelberg.de/), which is a simple modular architecture research tool, was used to predict the domain architecture of the virulence proteins in this study [28]. PSORT (http://www.psort.org/psortb2) and TMHMM (http://www.cbs.dtu.dk/services/TMHMM) software were employed to predict the subcellular location and transmembrane helices of virulence proteins, respectively [29,30]. The SIGNALP-5.0 (http://www.cbs.dtu.dk/services/SignalP/), SWISS-MODEL (http://swissmodel.expasy.org) and AlphaFold v2.1.1 (https://github.com/deepmind/alphafold) servers were used to predict the signal peptide cleavage and three-dimensional (3-D) structures of the enterotoxin proteins, respectively [31–34]. The amino acid sequences of the virulence proteins analyzed were submitted in FASTA format. To predict structure, we performed homology modeling to generate 3-D virulence protein structures.
Results
General genome characteristics and quality assessment of sequences
A summary of the features of the forty-one genomes of B. cereus and the control genome of the closely related species S. terrae is provided in Table 1. The genome sizes of B. cereus strains varied from 5.20 to 6.47 MB. The G+C contents of the forty-one genomes ranged from 34.70% to 35.75%. Compared with the control genome from S. terrae 70–3 (3.31 MB and 45.30%), the genomes of B. cereus were much larger and had lower G+C contents. The contamination and completeness of the sequences were 0–2.02% and 89.81%-98.99%, respectively (shown in Table 2). These results suggested that these sequences are of high quality, have low contamination (values <2.02%, acceptable level is <5%), and have high completeness (values >89.81%, acceptable level is >85%); thus, they were appropriate for analysis. In this study, the strains originated from food (5/41), the clinic (12/41), the environment (17/41), and undetermined sources (7/41). The enterotoxic risk potential based on the virulence genes of forty-one B. cereus strains is listed in Table 2. Enterotoxicity, which was reflected by virulence gene numbers, was categorized into levels of three types (10/41), two types (14/41), one type (7/41), and no types (10/41) levels. The genes detected as enterotoxic were nheABC (29/41), hblACD (19/41), cytk (17/41), nheAB (10/41), hblCD (6/41), hblAD (2/41), and hblD (2/41) (shown in Table 2).
Phylogenetic analysis based on whole amino acid sequences
Two whole-genome-based methods were used to construct phylogenetic trees. The first phylogenetic tree was constructed with the CV method using the whole amino acid sequences of forty-one B. cereus strains and the outgroup species S. terrae 70–3 [35]. To ensure the accuracy of the results, we added the inbuilt sequence AH1273 and the sequence of S. terrae 70–3 (control) from the webserver database (Fig 1). According to enterotoxic risk potential, the forty-one strains of B. cereus had evolved into five distinct clusters, which were likely risk regions I, IV and V and nonrisk regions II and III. However, there were individual nonconformities, such as nontoxicity of BDRD ST196 in region V. Region I was dominated by medium- and high-risk strains (15/17) but also included two low-risk strains (BHU1 and CO1-1), region II and III included only nonrisk strains (9/9), and regions IV and V were dominated by medium- and low-risk strains (13/15) but also included AH820 (high-risk strain) and BDRD ST196 (nonrisk strain), respectively.
Two additional inbuilt strains from the software were used as internal controls. H, M, L, and "-” symbols indicate high (three types of virulence genes), middle (two types of virulence genes), low (one type of virulence gene), and no enterotoxic risk potential.
To verify the above results and obtain more accurate molecular evolutionary relationships, we established a second phylogenetic tree based on ANI analysis (Fig 2). According to enterotoxic risk potential, the forty-one strains of B. cereus had evolved into six distinct regions: likely risk regions A, C, E, and D2 and nonrisk regions B and D1. The two phylogenetic trees were similar in terms of the regions where enterotoxic risk was likely. The only difference between the two phylogenetic trees was a change in the evolutionary cluster of two strains. The ATCC 10987 and ATCC 4342 strains, which belonged region IV in the first tree, were assigned to region D in the second tree. Region A, which was dominated by medium-high-risk strains (15/17) but also included two low-risk strains (BHU1 and CO1-1), was the same as region I. Regions B and D1, which included only nonrisk strains (9/9), were the same as regions II and III. Regions C, D2 and E, which were also dominated by medium- and low-risk strains (13/15) but included AH820 (high-risk strain in C) and BDRD ST196 (nonrisk strain in E), were the same as regions IV and V. By taking advantage of the updated enterotoxic risk regions found in the current B. cereus strains, we decided to use a statistical approach to evaluate whether the occurrence of virulence factor-encoding genes (detailed in Table 3) correlate with a particular region. We observed that nheABC was significantly present in region C (p < 0.05), and hblA and nheC were significantly present in region A and C (p < 0.05), respectively. The results showed that nheABC and nheC were significantly enriched in the medium-low-risk region, and hblA was significantly enriched in the medium-high-risk region.
H, M, L, and "-” have the same meanings as in Table 2.
Phylogenetic distance analysis based on concatenated housekeeping and virulence genes
To analyze the evolution and phylogenetic relationships of virulence gene transfer in relation to DVs, the nheABC, hblACD, and cytK genes of the forty-one strains, which need to be compared to the housekeeping genes of the strains, were studied. To this end, we concatenated the sequences of virulence proteins from the strains and seven housekeeping proteins (Adk-CcpA-GlpF-GlpT-PanC-Pta-Pyc) from the B. cereus core genome. The genetic DV of virulence gene transfer was evaluated by calculating the average difference in the phylogenetic DV of the ATCC14579 strain compared with forty other strains. The hblD and hblAD virulence proteins, which were observed in only two strains, were excluded. As shown in Table 2, the relative genetic DVs were calculated for nheAB (10/41), hblCD (6/41), nheABC (29/41), hblACD (19/41), and CytK (17/41). As shown in Fig 3, the average evolutionary DVs of virulence gene transfer from high to low were 0.015 (nheAB), 0.012 (hblCD), 0.005 (hblACD), 0.003 (nheABC) and 0.001 (cytK). The DVs of incomplete virulence genes (nheAB and hblCD) were higher than those of complete virulence genes (nheABC, hblACD, and CytK). The average evolutionary DV of nheAB was higher than the DV of hblCD among the incomplete virulence genes; the DV of hblACD was the highest and that of cytK was the lowest among the complete virulence genes.
Comparative prediction analysis of the function and structure of virulence proteins
As shown in Table 4 and Fig 4, we obtained the scores of the seven virulence proteins for subcellular localization prediction. The scores of NheB, NheC, Hbl-B, and Hbl-L1 were all 9.73, and that of CytK was 9.98, all consistent with extracellular localization. The localization of NheA and Hbl-L2 was unknown because the scores were all lower in the cytoplasmic membrane (3.33/4.6), cell wall (3.33/2.48), and extracellular space (3.33/2.92), making it impossible for the virulence proteins to appear in the cytoplasm. NheB and Hbl-L1 had two helices, which were transmembrane region sequences 235-257/267-286 and 239-261/268-290, and NheC had only one helix, of which the transmembrane region was 228–250, but the others had none. The virulence protein cleavage sites of all strains were in the sequence 30–32 with 0.93–0.99 likelihood levels, except NheA, for which the site was in the 26–27 sequence (0.81 likelihood level). The amino acid sequences of Nhe, Hbl and CytK contained N-terminal signal peptides for secretion (< 31 amino acids). The signal peptide start-end was between 1 and 31 sequences but not found for Hbl-B and Hbl-L1 were not found, and the domain start-end was between 35 and 329 sequences.
Examination of the phylogenetic tree constructed using the Hbl, Nhe, and CytK sequences of ATCC 14579 (Fig 5) showed that NheA and Hbl-L2, as well as NheBC and Hbl-L1, were more closely related to one another than to the other components, and CytK was the least evolutionarily related. This result was also reflected in the evaluation parameters of the 3-D enterotoxin protein structures. As shown in Table 5, the closest template for NheB and NheC was Hbl-L1 (sequence identity of 40.82% and 36.83%, respectively), and that for Hbl-L2 was NheA (24.85%). The closest template for CytK was alpha-hemolysis (30.39%), as expected, with considerable amino acid sequence homology to S. aureus leukocidin [36]. The templates of NheA, and Hbl-L1 were included in the SWISS-MODEL server with high sequence identity (97.22% and 99.73%, respectively), and Hbl-B had acceptable sequence identity (71.99%). The sequence coverage and range of all structures were 0.71–0.93 and 33–439, respectively, with GMQE evaluation values (0.54–0.88) above 0.5, which indicated reliable model construction. Each residue is allotted a reliability score between 0 and 1, indicating the expected resemblance to the native structure. Higher numbers represent higher reliability of the residues [37]. In general, a sequence identity of >30% for each template was acceptable based on the SWISS-MODEL server. To verify the above results (especially the sequence identity of Hbl-L2, which was 24.85%) and obtain the predicted Nhe-trimer and Hbl-trimer structures, we used AlphaFold software for secondary structural prediction. The results were acceptable, the predicted local-distance difference test (plDDt) values of monomers were 81.90–94.14, and the scores of Nhe trimers and Hbl trimers were 0.68 and 0.36 (ipTM+pTM), respectively [33,34].
Due to the sequence similarity of NheB and NheC with Hbl-B, homology models based on the Hbl-B structure were established. As shown in Fig 6, NheA and Hbl-B had highly similar structures (Fig 6A), and the NheA, NheB, NheC, HBl-B, HBl-L1, and HBl-L2 structures showed that there were two main domains, a ‘head’ and ‘tail’ (Figs 6B–6D, 6F–6H, and 7A–7F). The main body of the structure was formed by the ‘tail’ domain, which consisted of five major helices, and the ‘head’ domain of NheA included two long α-helices separated by β-tongue strands (Figs 6B and 7A). Multiple β-tongue strands were detected in Hbl-B (Fig 7D) but are not shown in Fig 6F because of the prediction method. Another difference was the ‘head’ of Hbl-L2, possibly related to the low sequence identity (Figs 6H and 7F). The ‘latch’ with many β-sheets of CytK folded the ‘cap’ domain, which was the toxic area (Figs 6E and 7G). The amino ‘latch’, which included a short helix in all known pore structures, was observed on the top of the conformation, which extended into the pore to form a β-barrel and was folded into a stranded antiparallel β-sheet in the monomer. Although the amino ‘latch’ protrudes and interacts with the adjacent protomer in the pore, it is located at the edge of the β-sheet of the ‘cap’ region [39]. The ‘rim’ domain, which was composed of three strands of short β-sheets, formed the main body of the balanced structure. The trimers of Nhe and Hbl were horizontally arranged. The Hbl trimer (arranged in the sequence B, -L1, -L2) was more similar than the Nhe trimer (arranged in the sequence A, B, C) based on the structural features. The β-tongue strands of Hbl-B and NheA might play an important structural and functional role in the formation of trimers.
a shows the superposition of the structures of Hbl-B (green) and NheA (burgundy) [40]. b, c, d, f, g, and h show the structures of NheA, NheB, NheC, Hbl-B, Hbl-L1 and Hbl-L2, which are annotated with the ‘head’ and ‘tail’, respectively. b shows a beta-tongue in the ‘head’ region. e shows the structure of CytK, which is annotated with ‘latch’, ‘cap’ and ‘rim’.
A, B, C, D, E, and F are the structures of NheA, NheB, NheC, Hbl-B, Hbl-L1, and Hbl-L2, which are annotated with the ‘head’ and ‘tail’, respectively. A and D show beta-tongues in the ‘head’ region. G is the structure of CytK, which is annotated with ‘latch’, ‘cap’ and ‘rim’. H and I show the trimers of the structures of NheA and Hbl-B (green), NheB and Hbl-L2 (wathet), and NheC and Hbl-L1 (pink).
Discussion
In this study, forty-one strains of B. cereus were subjected to phylogenetic analyses based on whole amino acid sequences. Enterotoxicity, which was evaluated on the basis of nheABC, hblACD, and cytK gene expression, was classified into levels of three types, two types, one type, and no types. In terms of evolutionary relationships, clusters of virulence and nonvirulence gene strains were evident, and the regional distribution of the number of types of virulence genes was also presented, further confirmed by ANI-based phylogenetic analyses. We found that the two phylogenetic trees were similar. All non-toxic-risk strains were concentrated in two clusters, and all but two of the medium-high- and medium-low-toxic-risk strains formed clusters. The results suggest the possibility of virulence gene transfer, which may be related to frequent exchange of pathogenicity factors during B. cereus virulence evolution, including so-called probiotic or nonpathogenic species [15]. Previous taxonomic results for the B. cereus group are largely based on inadequate criteria such as virulence characteristics, which residing on virulence plasmids [9]. Due to rampant horizontal gene transfer in bacterial ecosystems, increasing numbers of “core” genes should be found and defined based on refined species classification [41]. Recently, phylogeny-aware methods based on linear regression models were applied at the whole-genome scale to study the genomes of bacteria [17]. By using this statistical approach, we hereby observed that the virulence genes nheABC and nheC positively correlated with enrichment in the medium-low-risk cluster, and hblA was found in the medium-high-risk cluster. The inconsistent evolutionary distribution of individual virulence genes may be due to other factors, which needs further study.
The Bacillus hemolytic and nonhemolytic enterotoxin family of proteins consists of several Bacillus enterotoxins, which can cause food poisoning in humans [42]. Hemolytic BL and cytotoxin K (encoded by hblACD and cytK) and nonhemolytic enterotoxin (encoded by nheABC) represent the significant enterotoxins produced by B. cereus. Cardazzo et al. detected horizontal gene transfer in the evolution of enterotoxins within B. cereus strains [43]. Our MLSA results showed that in the process of toxin molecular evolution, there were differences between the results for complete and incomplete virulence proteins, and two toxic-type genes had a more significant effect in relation to DVs than three toxic-type genes. The results suggested that the complete virulence-gene operon combination has higher relative genetic stability. The DV of hemolysin Bl was greater than that of nonhemolytic cytotoxin K. nheABC, which was responsible for most of the cytotoxic activity of B. cereus isolates, showed stable, strictly vertical inheritance [44]. In contrast to hbl, duplication or deletion of nhe, which was almost exclusively transmitted vertically, was rarely observed, and cytK, a one-type gene, had the highest relative genetic stability [15].
Bazinet revealed significant associations of particular genes with phenotypic traits shared by groups of taxa [41]. Currently, it is commonly accepted that the toxicity potential of B. cereus is not driven by enterotoxin gene types because the expression of enterotoxin genes is highly complex and probably strain-specifically affected by transcription, posttranscriptional and posttranslational modification [45–47]. Carroll et al. suggested that further classification and descriptions of phenotypes should be added on the basis of genotype classification in B. cereus [2]. In this study, both Nhe and Hbl are three-component cytotoxins composed of binding components A and B and two lytic components B, C and -L1, -L2, with all three subunits acting synergically to cause illness. The amino acid sequences of all Nhe, Hbl and Cytk components containing N-terminal signal peptides indicated toxin secretion via the secretory translocation pathway. The final positions of Hbl-B, Hbl-L1, NheB, NheC, and CytK were all extracellular and did not appear in the cytoplasm, and the two transmembrane regions of NheB and Hbl-L1 might be responsible for transporting the assembled three-component cytotoxins across the membrane to complete the toxic effect. Dietrich et al. found that the factor triggering enterotoxin production under simulated intestinal conditions by various cell lines from different organisms and compartments was independent of cell differentiation [48]. Similarities were found when predicted transmembrane helices were compared. NheA and Hbl-B had no such helices, and NheB and Hbl-L1 had two that may play an important role in molecular docking and transmembrane activities. Furthermore, the Nhe components seem to be additionally processed in the extracellular space after separation from the signal peptide for secretion [48]. The difference is that NheC had one such component and Hbl B had none, which may strengthen the secretion of the Nhe protein.
The Nhe and Hbl proteins share sequence similarities, both between the three components of each complex and between the two enterotoxin complexes [12]. The structural and functional properties were consistent with those of the superfamily of pore-forming cytotoxins of Hbl and Nhe [49–51]. The NheA, NheB, NheC, HBl-B, HBl-L1, and HBl-L2 structures showed two main domains, a ‘head’ and ‘tail’. The ‘heads’ of NheA and Hbl-B, including two α-helices separated by β-tongue strands, play a special role in Nhe trimers and Hbl trimers, respectively. Upon contact with lipids, cell membranes or detergents, the protein oligomerizes and forms ring-shaped structures acting as transmembrane pores [52,53], and the hydrophobic β-tongue is assumed to be inserted into the membrane first [54]. It is worth noting that NheB, NheC, Hbl-L1, and Hbl-L2 had few or no β-strands in the ‘head’, which were either responsible for conformational changes of NheA and Hbl-B or for the stabilization of the ‘head’ domain [50] or might lead to reduced toxicity or only ligand function [41]. The difference was reflected in the triplet prediction results, i.e., a significant difference in structural arrangement compactness between Nhe trimers (noncompact type) and Hbl trimers (compact type). Ganash et al. speculated that the Nhe trimer requires interaction with unknown proteins of an additional function [41]. A specific binding order of the three Nhe and Hbl components is also necessary for pore formation [55,56]. We found that NheB and Hbl-L1 were close to NheA and Hbl-B in the predicted trimer structure. Didier et al. found that NheA is important for attaching to cell-bound NheB and NheC and that NheB is the main interaction partner of NheA [57], and a further correlation was found for the amounts of Hbl B and Hbl L1 [46]. Cytotoxin K is a single protein with β-barrel pore-forming toxin in contrast to the tripartite toxin complexes Hbl and Nhe. The CytK structure, which exhibits two ‘latches’ with many β-sheets folded beside the ‘cap’ domain forming a β-barrel, was the pore structure on top of the conformation. The ‘rim’ region, which was folded into a three-stranded antiparallel β-sheet, balanced the structure in the monomer. The predicted structure revealed that CytK was likely to belong to the leukocyte toxin family. These monomers diffuse to target cells and are attached to them by specific receivers [58], which are lipids and proteins that cause lysis of red blood cells by destroying their cell membrane [59].
Conclusion
In this study, we describe the molecular evolution, function and structural diversity of virulence factors in B. cereus strains. The evolution of B. cereus strains showed a clustering trend based on the coding virulence genes. The complete virulence gene operon combination had higher relative genetic stability than the incomplete operon. The two α-helices in the ‘head’ of the NheA and Hbl-B structures, which are separated by β-tongue strands, and two ‘latches’ with many β-sheets folded beside the ‘cap’ of the CytK structure might play a special role in the binding of virulence structures and pore-forming toxins in B. cereus. Overall, the exact mechanism by which B. cereus causes diarrhea remains unknown, but our results provide helpful information for better understanding the taxonomically diverse distribution of virulence factors in B. cereus strains.
References
- 1. Liu Y, Du J, Lai Q, Zeng RY, Ye DZ, Xu J, et al. Proposal of nine novel species of the Bacillus cereus group. Int J syst Evol Microbiol. 2017; 67: 2499–2508. pmid:28792367
- 2. Carroll LM, Wiedmann M, Kovac J. Proposal of a taxonomic nomenclature for the Bacillus cereus group which reconciles genomic definitions of bacterial species with clinical and industrial phenotypes. mBio. 2020; 11(1): e00034–20. pmid:32098810
- 3. Webb MD, Barker GC, Goodburn KE, Peck MW. Risk presented to minimally processed chilled foods by psychrotrophic Bacillus cereus. Trends Food Sci Technol. 2019; 93: 94–105. pmid:31764911
- 4.
Bhunia AK. Foodborne microbial pathogens: Mechanisms and pathogenesis. Springer, New York; 2008.
- 5. Rajkovic A, Jovanovic J, Monteiro S, Decleer M, Andjelkovic M, Foubert A, et al. Detection of toxins involved in foodborne diseases caused by Gram-positive bacteria. Compr Rev Food Sci Food Saf. 2020; 19(4):1605–1657. pmid:33337102
- 6. Jessberger N, Dietrich R, Schwemmer S, Tausch F, Schwenk V, Didier A, et al. Binding to the target cell surface is the crucial step in pore formation of hemolysin BL from Bacillus cereus. Toxins. 2019; 11(5): 281–297. pmid:31137585
- 7. Koné KM, Douamba Z, Halleux MD, Bougoudogo F, Mahillon J. Prevalence and diversity of the thermotolerant bacterium Bacillus cytotoxicus among dried food products. J Food Protec. 2019; 82(7): 1210–1216. pmid:31233363
- 8. Thompson CC, Chimetto L, Edwards RA, Swings J, Stackebrandt E, Thompson FL. Microbial genomic taxonomy. BMC Genomics. 2013; 14: 913. pmid:24365132
- 9. Liu Y, Lai QL, Göker M, Meier-Kolthoff JP, Wang M, Sun YM, et al. Genomic insights into the taxonomic status of Bacillus cereus group. Sci Rep 2015; 5: 14082. pmid:26373441
- 10. Guinebretière MH, Thompson FL, Sorokin A, Normand P, Dawyndt P, Ehling-Schulz M, et al. Ecological diversification in the Bacillus cereus Group. Environ Microbiol. 2008; 10(4): 851–865. pmid:18036180
- 11. Helgason E, Økstad OA, Caugant DA, Johansen HA, Fouet A, Mock M, et al. Bacillus anthracis, Bacillus cereus, and Bacillus thuringiensis-one species on the basis of genetic evidence. Appl Environ Microbiol. 2000; 66(6): 2627–2630. pmid:10831447
- 12. Dietrich R, Jessberger N, Ehling-Schulz M, Märtlbauer E, Granum PE. The food poisoning toxins of Bacillus cereus. Toxins. 2021; 13(2): 98. pmid:33525722
- 13. Lapidus A, Goltsman E, Auger S, Galleron N, Ségurens B, Dossat C, et al. Extending the Bacillus cereus group genomics to putative food-borne pathogens of different toxicity. Chem. Interact. 2008; 171(2): 236–249. pmid:17434157
- 14. Zheng J, Guan Z, Cao S, Peng D, Ruan L, Jiang D, Sun M. Plasmids are vectors for redundant chromosomal genes in the Bacillus cereus group. BMC Genom. 2015; 16(1): 1–10. pmid:25608745
- 15. Böhm ME, Huptas C, Krey VM, Scherer S. Massive horizontal gene transfer, strictly vertical inheritance and ancient duplications differentially shape the evolution of Bacillus cereus enterotoxin operons hbl, cytK and nhe. BMC Evol Biol. 2015; 15(1): 246. pmid:26555390
- 16. Didelot X, Barker M, Falush D, Priest FG. Evolution of pathogenicity in the Bacillus cereus group. Syst Appl Microbiol. 2009; 32(2): 81–90. pmid:19200684
- 17. Torres Manno MA, Repizo GD, Magni C. Dunlap CA, Espariz M. The assessment of leading traits in the taxonomy of the Bacillus cereus group. Antonie van Leeuwenhoek. 2020; 113: 2223–2242. pmid:33179199
- 18. Kovac J, Miller RA, Carroll LM, Kent DJ, Jian JH, Beno SM, et al. Production of hemolysin BL by Bacillus cereus group isolates of dairy origin is associated with whole-genome phylogenetic clade. BMC genomics. 2016; 17: 581. pmid:27507015
- 19. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015; 25(7):1043–1055. pmid:25977477
- 20. Zuo G, Hao B. CVTree3 web server for whole-genome-based and alignment-free prokaryotic phylogeny and taxonomy. Genom Proteom Bioinf. 2015; 13: 321–331. pmid:26563468
- 21. Zuo G. CVTree: a parallel alignment-free phylogeny and taxonomy tool based on composition vectors of genomes. Genom Proteom Bioinf. 2021; 19: 1–6. pmid:34119695
- 22. Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018; 35(6): 1547–1549. pmid:29722887
- 23. Richter M, Rosselló-Móra R, Glöckner FO, Peplies J. JSpeciesWS: a web server for prokaryotic species circumscription based on pairwise genome comparison. Bioinformatics. 2016; 32(6): 929–931. pmid:26576653
- 24. Perrière G, Gouy M. WWW-query: an on-line retrieval system for biological sequence banks. Biochimie. 1996; 78(5): 364–369. pmid:8905155
- 25. Baum BR. PHYLIP: phylogeny inference package. version 3.2. Joel Felsenstein. Biology. 1989; 64(4): 539–541.
- 26. Tourasse NJ, Helgason E, Økstad OA, Hegna IK, Kolst AB. The Bacillus cereus group: novel aspects of population structure and genome dynamics. J Appl Microbiol. 2006; 101(3): 579–593. pmid:16907808
- 27. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013; 30(12): 2725–2729. pmid:24132122
- 28. Letunic I, Khedkar S, Bork P. SMART: recent updates, new developments and status in 2020. Nucleic Acids Res. 2021; 49(D1): D458–D460. pmid:33104802
- 29. Peabody MA, Laird MR, Vlasschaert C, Lo R, Brinkman FSL. PSORTdb: expanding the bacteria and archaea protein subcellular localization database to better reflect diversity in cell envelope structures. Nucleic Acids Res. 2016; 44(D1): D663–D668. pmid:26602691
- 30. Möller S, Croning MDR, Apweiler R. Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics. 2001; 17(7): 646–653. pmid:11448883
- 31. Almagro Armenteros JJ, Tsirigos KD, Sønderby CK, Petersen TN, Winther O, Brunak S, et al. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat Biotechnol. 2019; 37(4): 420–423. pmid:30778233
- 32. Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, et al. SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res. 2014; 42: 252–258. pmid:24782522
- 33. Jumper J, Evans R, Pritzel A, Green T, Flgurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021; 596(7873): 583–589. pmid:34265844
- 34. Evans R, O’Neill M, Pritzel A, Antropova N, Senior A, Green T, et al. Protein complex prediction with AlphaFold-Multimer. bioRxiv preprint. 2021 October 4.
- 35. Thitiprasert S, Piluk J, Tolieng V, Tanaka N, Shiwa Y, Fujita N, et al. Draft genome sequencing of Sporolactobacillus terrae SBT-1, an efficient bacterium to ferment concentrated sugar to d-lactic acid. Arch Microbiol. 2021; 203: 3577–3590. pmid:33961074
- 36. Sugawara T, Yamashita D, Kato K, Zhao P, Ueda J, Kaneko J, et al. Structural basis for pore-forming mechanism of staphylococcal α-hemolysin. Toxicon. 2015; 108: 226–231. pmid:26428390
- 37. Sharma A, Ponmariappan S, Sarita R, Alam SI, kamboj DV, Shukla S. Identification of cross reactive antigens of C. botulinum types A, B, E & F by immunoproteomic approach. Curr Microbiol. 2018; 75(5): 531–540. pmid:29332140
- 38. Mariani V, Biasini M, Barbato A, Schwede T. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics. 2013; 29(21): 2722–2728. pmid:23986568
- 39. Takaki S, Daichi Y, Koji K, Zhao P, Junki U, Jun K, et al. Structural basis for pore-forming mechanism of staphylococcal α-hemolysin. Toxicon. 2015; 108: 226–231. pmid:26428390
- 40. Ganash M, Phung D, Sedelnikova SE, Lindbäck T, Granum PE, Artymiuk PJ. Structure of the NheA component of the Nhe toxin from Bacillus cereus: Implications for function. PLoS ONE. 2013; 8(9): e74748. pmid:24040335
- 41. Bazinet AL. Pan-genome and phylogeny of Bacillus cereus sensu lato. BMC Evol Biol. 2017; 17:176. pmid:28768476
- 42. Phelps RJ, McKillip JL. Enterotoxin production in natural isolates of Bacillaceae outside the Bacillus cereus group. Appl Environ Microbiol. 2002; 68(6): 3147–3151. pmid:12039781
- 43. Cardazzo B, Negrisolo E, Carraro L, Alberghini L, Patarnello T, Giaccone V. Multiple-locus sequence typing and analysis oftoxin genes in Bacillus cereus food-borne isolates. Appl Environ Microbiol. 2008; 74(3): 850–860. pmid:18083872
- 44. Moravek M, Dietrich R, Buerk C, Broussolle V, Guinebretière MH, Granum PE, et al. Determination of the toxic potential of Bacillus cereus isolates by quantitative enterotoxin analyses. FEMS Microbiol Lett. 2006; 257(2): 293–298. pmid:16553866
- 45. Dietrich R, Moravek M, Bürk C, Granum PE, Märtlbauer E. Production and characterization of antibodies against each of the three subunits of the Bacillus cereus nonhemolytic enterotoxin complex. Appl Environ Microbiol. 2005; 71(12): 8214–8220. pmid:16332805
- 46. Jeßberger N, Dietrich R, Bock S, Didier A, Märtlbauer E. Bacillus cereus enterotoxins act as major virulence factors and exhibit distinct cytotoxicity to different human cell lines. Toxicon. 2014; 77(1): 49–57. pmid:24211313
- 47. Jeßberger N, Krey VM, Rademacher C, Böhm ME, Mohr AK, Ehling-Schulz M, et al. From genometo toxicity: a combinatory approach highlights the complexity of enterotoxin production in Bacillus cereus. Front. Microbiol. 2015; 6(6): 560. pmid:26113843
- 48. Dietrich R, Jeßberger N, Ehling-Schulz M, Märtlbauer E, Granum PE. The food poisoning toxins of Bacillus cereus. Toxins. 2021; 13(2): 98. pmid:33525722
- 49. Fagerlund A, Lindbäck T, Storset AK, Granum PE, Hardy SP. Bacillus cereus Nhe is a pore-forming toxin with structural and functional properties similar to the ClyA (HlyE, SheA) family of haemolysins, able to induce osmotic lysis in epithelia. Microbiol. 2008; 154 (Pt 3): 693–704. pmid:18310016
- 50. Madegowda M, Eswaramoorthy S, Burley SK, Swaminathan S. X-ray crystal structure of the B component of Hemolysin BL from Bacillus cereus. Proteins Struct Funct & Bioinform. 2008: 71(2): 534–540. pmid:18175317
- 51. Phung D, Ganash M, Sedelnikova SE, Lindbäck T, Granum PE, Artymiuk PJ. Crystallization and preliminary crystallographic analysis of the NheA component of the Nhe toxin from Bacillus cereus. Acta Cryst. 2012; 68 (Pt 9): 1073–1076. pmid:22949198
- 52. Wallace AJ, Stillman TJ, Atkins A, Jamieson SJ, Bullough PA, Green J, et al. E. coli Hemolysin E (HlyE, ClyA, SheA). Cell. 2000; 100 (2): 265–276. pmid:10660049
- 53. Eifler N, Vetsch M, Gregorini M, Ringler P, Chami M, Philippsen A, et al. Cytotoxin ClyA from Escherichia coli assembles to a 13-meric pore independent of its redox-state. EMBO J. 2006; 25 (11): 2652–2661. pmid:16688219
- 54. Mueller M, Grauschopf U, Maier T, Glockshuber R, Ban N. The structure of a cytolytic α-helical toxin pore reveals its assembly mechanism. Nature. 2009: 459: 726–730. pmid:19421192
- 55. Worthy HL, Williamson LJ, Auhim HS, Leppla SH, Sastalla I, Jones DD, et al. The crystal structure of Bacillus cereus HblL1. Toxins. 2021; 13(4): 253–266. pmid:33807365
- 56. Lindbäck T, Fagerlund A, Rødland MS, Granum PE. Characterization of the Bacillus cereus Nhe enterotoxin. Microbi. 2004; 150(Pt 12): 3959–3967. pmid:15583149
- 57. Didier A, Dietrich R, Märtlbauer E. Antibody binding studies reveal conformational flexibility of the Bacillus cereus nonhemolytic enterotoxin (Nhe) A-component. PLoS One. 2016; 11(10): e0165135. pmid:27768742
- 58. Thompson JR, Cronin B, Bayley H, Wallace MI. Rapid assembly of a multimeric membrane protein pore. Biophys J. 2011; 101(11): 2679–2683. pmid:22261056
- 59. Stipcevic T, Piljac T, Isseroff RR. Di-rhamnolipid from Pseudomonas aeruginosa displays differential effects on human keratinocyte and fibroblast cultures. J Dermatol Sci. 2005; 40(2): 141–143. pmid:16199139