Genome Sequence of Bacillus endophyticus and Analysis of Its Companion Mechanism in the Ketogulonigenium vulgare-Bacillus Strain Consortium

  • Nan Jia,

    Affiliations Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China, SynBio Research Platform, Collaborative Innovation Centre of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China

  • Jin Du,

    Affiliations Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China, SynBio Research Platform, Collaborative Innovation Centre of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China

  • Ming-Zhu Ding,

    Affiliations Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China, SynBio Research Platform, Collaborative Innovation Centre of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China

  • Feng Gao , (FG); (YJY)

    Affiliation Department of Physics, Tianjin University, Tianjin, 300072, PR China

  • Ying-Jin Yuan (FG); (YJY)

    Affiliations Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China, SynBio Research Platform, Collaborative Innovation Centre of Chemical Science and Engineering (Tianjin), School of Chemical Engineering and Technology, Tianjin University, Tianjin, 300072, PR China


Bacillus strains have been widely used as the companion strain of Ketogulonigenium vulgare in the process of vitamin C fermentation. Different Bacillus strains have different effects on the growth of K. vulgare and ultimately influence the productivity. First, we identified that Bacillus endophyticus Hbe603 was an appropriate strain to cooperate with K. vulgare and that the product conversion rate exceeded 90% in industrial vitamin C fermentation. Here, we report the genome sequencing of the B. endophyticus Hbe603 industrial companion strain and speculate on its possible advantage in the consortium. The circular chromosome of B. endophyticus Hbe603 has a size of 4.87 Mb with a GC content of 36.64% and has the highest similarity with that of Bacillus megaterium among all the bacteria with complete genomes. A comparison of the distribution of COGs with that of Bacillus thuringiensis, Bacillus cereus and B. megaterium shows that B. endophyticus has fewer genes related to cell envelope biogenesis and signal transduction mechanisms, and more genes related to carbohydrate transport and metabolism, energy production and conversion, as well as lipid transport and metabolism. Genome-based functional studies revealed the specific capability of B. endophyticus in sporulation, transcription regulation, environmental resistance, membrane transportation, extracellular proteins and nutrient synthesis, which would be beneficial for K. vulgare. In particular, B. endophyticus lacks the Rap-Phr signal cascade system and, in part, spore coat related proteins. In addition, it has specific pathways for vitamin B12 synthesis and sorbitol metabolism. The genome analysis of the industrial B. endophyticus will help us understand its cooperative mechanism in the K. vulgare-Bacillus strain consortium to improve the fermentation of vitamin C.


The microbial ecosystem of Ketogulonigenium vulgare and Bacillus strains has been widely used in the two-step vitamin C fermentation process [1]. In this bacterial community, K. vulgare is responsible for the conversion of sorbose to 2-keto-L-gulonic acid (2-KLG), the precursor of vitamin C. Bacillus strains (e.g., B. megaterium, B. cereus and B. thuringiensis) are co-cultured to stimulate the growth of K. vulgare [2]. Moreover, different 2-KLG yields and productivities were observed in the consortium with different companion strains [3]. Clearly, the varied growth characteristics of the different companion strains might produce different effects on the fermentation process. Researchers are always looking for the best strains to cooperate with K. vulgare, and we identified B. endophyticus Hbe603 as an appropriate strain because the product conversion rate exceeded 90% in industrial vitamin C fermentation. B. endophyticus is an aerobic, Gram-positive, non-motile, rod-shaped, endospore-forming bacterium, which was first isolated from the inner tissues of cotton plants [4]. It has been extensively applied for promoting plant growth [5] and decolorizing textile effluents [6]. Knowledge of industrial strains will help us further understand the natural variation and the possible differences among Bacillus strains and their communication with K. vulgare.

The interaction and communication between Bacillus strains and K. vulgare have been investigated by metabolomic and proteomic analysis [7–10]. Further analysis of the genetic makeup and complementation is needed to understand the consortium. Genome sequence analysis could provide further information to distinguish the differences between strains and determine the symbiotic relationship between the microorganisms at the gene level. For example, the genome analysis of the UCYN-A cyanobacteria found the absence of numerous major metabolic pathways and of the electron transport capacity necessary to generate energy, which suggests that this strain must depend on other organisms to obtain critical nutrients [11]. The genome analysis of Syntrophus aciditrophicus provided a glimpse of its composition and identified that the electron transfer and energy transducing systems were used for the syntrophic life [12].

Currently, genome-wide research on B. endophyticus is still scarce and only one draft genome sequence, that of B. endophyticus 2102, has been published [13]. Here, we report a 4.87 Mb circular chromosome of B. endophyticus Hbe603, which is used as the companion strain in the vitamin C industrial fermentation process. Through the comparative genome analysis of B. endophyticus with other species, we found evidence of its special features in sporulation, transcription regulation, environmental resistance, membrane transportation, extracellular protein release and nutrient synthesis. On this basis, we speculate on its companion mechanism in the K. vulgare-Bacillus strain consortium.

Materials and Methods

Strains and cultivation conditions

The B. endophyticus Hbe603 strain was cultured in 250 mL flasks with 50 mL of seed medium (30°C, 250 rpm) supplied with D-sorbitol (2%) for 35 h to determine the sporulation and growth curve. The seed medium contains 3 g/L beef extract, 3 g/L yeast powder, 3 g/L corn steep liquor, 0.2 g/L MgSO4, 1 g/L KH2PO4, 1 g/L urea and 10 g/L peptone.

Measurement of cell density and analysis of D-sorbitol

The cell density was measured as optical density at 600 nm (OD600) with a spectrophotometer, and cells were observed under a phase contrast microscope. D-sorbitol in the broth was analyzed by HPLC (Waters Corp., Massachusetts, USA) with a refractive index detector. In addition, 5 mM H2SO4 was used as the eluent on the Aminex HPX-87H column (BioRad, CA) at 65°C with a flow rate of 0.6 mL/min.

DNA extraction and quality control

A genome sample was extracted using a Bacteria DNA Kit (QIAGEN) according to the manufacturer’s instructions. Briefly, cells were lysed with lysozyme and treated with proteinase K. The lysate was then treated with 20% sodium dodecyl sulfate and cetyltrimethylammonium bromide. Afterwards, the DNA was extracted with phenol/chloroform, precipitated with ethanol and sodium acetate, and washed twice with 70% ethanol. Each sample was treated with RNase A at 37°C for 30 min to degrade RNA. The quality of the DNA was assessed by spectrophotometry and gel electrophoresis. DNA samples with a 260/280 nm absorbance ratio of 1.8–2.0 and a 260/230 nm absorbance ratio of 2.0–2.2 were considered pure. Only pure, high molecular weight DNA samples were used for the construction of the library and sequencing.
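The purity criterion above can be expressed as a simple check. The following is a minimal sketch, not the procedure used in this study: the function name and the example absorbance readings are hypothetical, and only the two ratio windows (1.8–2.0 and 2.0–2.2) come from the text.

```python
def is_pure_dna(a260: float, a280: float, a230: float) -> bool:
    """Return True when both absorbance ratios fall inside the stated windows.

    The windows (A260/A280 in 1.8-2.0, A260/A230 in 2.0-2.2) are taken from
    the text; the function itself is an illustrative assumption.
    """
    if a280 <= 0 or a230 <= 0:
        raise ValueError("absorbance readings must be positive")
    ratio_260_280 = a260 / a280
    ratio_260_230 = a260 / a230
    return 1.8 <= ratio_260_280 <= 2.0 and 2.0 <= ratio_260_230 <= 2.2


# Hypothetical readings, for illustration only.
print(is_pure_dna(a260=1.9, a280=1.0, a230=0.9))   # ratios 1.90 and 2.11
print(is_pure_dna(a260=1.5, a280=1.0, a230=0.9))   # A260/A280 too low
```

A sample passing this check would still need to be confirmed as high molecular weight by gel electrophoresis, which the sketch does not model.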

Sequencing and assembly

Each SMRTbell sequencing library was constructed using 500 ng of size-selected DNA with the Pacific Biosciences DNA Template Prep Kit 2.0. The binding of SMRTbell templates to polymerases was conducted using the DNA/Polymerase Binding Kit P5 and v2 primers. Sequencing was carried out on the Pacific Biosciences RS II platform using C3 reagents with 120 min movies. The .h5 files resulting from the PacBio sequencing were used directly for the assembly process. The raw reads were processed into subreads by removing the adaptors and filtered using SMRT Analysis 2.2 with minSubReadLength = 500 and readScore > 0.75. The filtered subreads were used in the HGAP assembly process. An in-house Perl script was used to calculate the distribution of subread lengths and identify the range of lengths that would give a coverage of around 10. These length values were chosen as the seed lengths in the HGAP assembly process [14]. For B. endophyticus Hbe603, seed lengths of 6K-14K were chosen, and a separate assembly was done for each seed length. The HGAP assembly process was done as follows: 1) Reads shorter than the seed length were aligned to the longer reads using BLASR [15], and the errors on the long reads were corrected using the aligned reads; 2) The high quality corrected reads were assembled based on overlapping sequences to obtain a draft assembly; 3) All the reads were mapped to the draft assembly, which polished the assembly to obtain the final genomic sequence. The HGAP parameters used were genomeSize = 5000000, xCoverage = 15, defaultFrgMinLen = 500, ovlErrorRate = 0.06, ovlMinLen = 40, merSize = 14. The seed length that gave the fewest contigs was chosen for the final assembly. The assembled sequences were compared by BLAST against the NCBI database to check whether the contigs showed similarity to known genomes or plasmids. For the circular chromosome, we ran BLAST of the sequence against itself to identify the redundant sequences at the ends. The redundant sequences from the 3’ end were clipped and the connected part was examined by PCR.
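The subread filtering and seed-length selection described above can be sketched as follows. The in-house Perl script is not published with the paper, so this Python version is an assumption about how such a calculation might work: the function names and the toy read set are hypothetical, and only the thresholds (minimum subread length 500, read score above 0.75) and the idea of choosing a length cutoff that yields a target coverage come from the text.

```python
from typing import Iterable, List, Optional, Tuple


def filter_subreads(subreads: Iterable[Tuple[int, float]],
                    min_length: int = 500,
                    min_score: float = 0.75) -> List[int]:
    """Keep the lengths of subreads that pass both thresholds.

    Each subread is given as (length, read_score).
    """
    return [length for length, score in subreads
            if length >= min_length and score > min_score]


def seed_length_for_coverage(lengths: Iterable[int],
                             genome_size: int,
                             target_coverage: float) -> Optional[int]:
    """Largest cutoff L such that reads of length >= L reach the target coverage.

    Returns None when even the full read set cannot reach the target.
    """
    target_bases = target_coverage * genome_size
    total = 0
    for length in sorted(lengths, reverse=True):
        total += length
        if total >= target_bases:
            return length
    return None


# Toy data (hypothetical): (length, read_score) pairs.
raw = [(400, 0.90), (600, 0.70), (700, 0.80), (500, 0.76)]
print(filter_subreads(raw))  # [700, 500]

# Toy genome of 1,000 bp and a target coverage of 20x.
print(seed_length_for_coverage([10000, 8000, 6000, 4000, 2000], 1000, 20))  # 6000
```

Running the second function for several target coverages around 10 would give a range of candidate seed lengths, each of which could then be assembled separately as described in the text.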

Genome annotation and bioinformatics analysis

The de novo gene prediction of the genome sequence was performed by GeneMarkS [16]. Gene functions were annotated by using BLAST [17] against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [18], SWISS-PROT [19] and the Clusters of Orthologous Groups of proteins (COG) database [20]. The tRNAs and rRNAs were predicted by tRNAscan-SE [21] and RNAmmer [22], respectively. The essential genes were predicted by ZCURVE 3.0 [23] and DEG 10 [24]. The subcellular location of proteins and the signal peptides were predicted by PSORT [25] and SignalP 4.0 [26], respectively. The origin of replication (oriC) and putative DnaA boxes were identified using Ori-Finder [27]. CVTree, a whole genome-based, alignment-free composition vector (CV) method, was used for the phylogenetic analysis [28], and a phylogenetic tree was generated using the MEGA program [29]. GC-Profile was used to compute the GC content variation in DNA sequences and predict the genomic islands [30]. The circular chromosome map was created using the program CGView [31]. The sequence similarity was analyzed using ACT (the Artemis Comparison Tool) [32].

Nucleotide sequence accession numbers

The sequence of the B. endophyticus Hbe603 chromosome has been deposited in GenBank under the accession number CP011974.

Results and Discussion

General genomic properties

The B. endophyticus Hbe603 chromosome is 4.87 Mb with a GC content of 36.64% and contains 5,038 annotated genes (Fig 1, Table 1 and S1 Table). We detected four prophages in B. endophyticus Hbe603 using the PHAge Search Tool (PHAST) [33] (S1 Fig). In the four prophages, most of the small proteins are annotated as hypothetical proteins that may play important roles in response to specific environmental stresses and host adaptation [34]. The other functional genes encode 59 phage-like proteins, two phage integrases and two transposases. Apart from the prophage regions, the complete chromosome sequence of B. endophyticus Hbe603 is highly consistent with the draft sequence of B. endophyticus 2102 (S2 Fig). In addition to the published companion strains B. thuringiensis [35], B. cereus [36] and B. megaterium [37], we identified B. endophyticus Hbe603 as an appropriate strain to cooperate with K. vulgare, with a product conversion rate exceeding 90% in industrial vitamin C fermentation. Through a whole genome-based phylogenetic analysis, we conclude that B. endophyticus is more closely related to B. megaterium QM B1551 [38] than to B. cereus ATCC 14579 [39] and B. thuringiensis Al Hakam [40] (Fig 2). By comparing the distribution of COG classification among the four strains, we could assess their gene function distributions and their genetic relationships (Fig 3). In the B. endophyticus Hbe603 genome, the number of genes related to cell envelope biogenesis (M) and signal transduction mechanisms (T) is lower than that in the other three strains, while the number of genes related to carbohydrate transport and metabolism (G), energy production and conversion (C) and lipid transport and metabolism (I) is similar to that in B. megaterium and higher than those in the other two strains (S2 Table). Overall, B. endophyticus Hbe603 has unique properties with regard to protein function and is more similar to B. megaterium than to the other strains. Interestingly, B. megaterium has been used for industrial vitamin C production in Jiangshan Pharmaceutical Co. Ltd., China [41]. Since both strains can serve as industrial companion strains, they presumably share characteristics that allow a better interaction with K. vulgare.

Fig 1. Circular genome visualization of B. endophyticus Hbe603.

Circles from the outside to the inside show the positions of protein-coding genes (blue), tRNA genes (red) and rRNA genes (pink) on the positive (circle 1), and negative (circle 2) strands. Circles 3–5 show the positions of BLAST hits detected through blastx comparisons of B. endophyticus Hbe603 against B. megaterium QM B1551 (circle 3), B. megaterium DSM 319 (circle 4) and B. megaterium WSH-002 (circle 5). The height of the shading in the BLAST results rings is proportional to the percentage of identity of the hit. Circles 6 and 7 show plots of GC content and GC skew plotted as the deviation from the average for the entire sequence.
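The GC content and GC skew tracks in circles 6 and 7 can be illustrated with a short calculation. This is a minimal sketch rather than the CGView implementation: it assumes GC skew is defined as (G − C)/(G + C), uses linear (non-wrapping) windows, and the function names, window size and toy sequence are hypothetical.

```python
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq: str) -> float:
    """GC skew, assumed here to be (G - C) / (G + C)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_deviation(seq: str, window: int, step: int) -> List[Tuple[int, float, float]]:
    """Per-window (start, GC content deviation, GC skew deviation).

    Deviations are taken relative to the value for the entire sequence.
    """
    mean_gc, mean_skew = gc_content(seq), gc_skew(seq)
    rows = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        rows.append((start, gc_content(chunk) - mean_gc, gc_skew(chunk) - mean_skew))
    return rows


# Toy sequence (hypothetical), four windows of 4 bp.
for row in windowed_deviation("GGGGCCCCAAAATTTT", window=4, step=4):
    print(row)
```

For a real circular chromosome, windows would normally wrap around the origin; that detail is omitted here for brevity.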

Fig 2. Phylogenetic analysis of B. endophyticus Hbe603 with other species.

The phylogenetic tree of B. endophyticus Hbe603 was constructed using CVTree with parameters K = 6 and Type = aa. The neighbor-joining tree was constructed using the MEGA5 program based on the CVTree results. Note that Geobacillus kaustophilus HTA426 was included as an outgroup.
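The composition vector idea behind this tree can be sketched in a few lines. This is a simplified illustration, assuming the commonly described CV formulation: the frequency of each K-string is compared with the value expected from its (K−1)- and (K−2)-substrings, and the distance between two genomes is (1 − C)/2, where C is the cosine correlation of their vectors. It is not the CVTree code itself; the function names and toy protein sequences are hypothetical, and K = 3 is used instead of K = 6 only to keep the example small.

```python
from collections import Counter
from math import sqrt


def kmer_freqs(proteins, k):
    """Relative frequency of every length-k string across all proteins."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def composition_vector(proteins, k):
    """Map each K-string to (observed - expected) / expected.

    The expected frequency is f(left) * f(right) / f(middle), where left and
    right are the two overlapping (K-1)-strings and middle is their shared
    (K-2)-string. Strings with zero expected frequency are omitted.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    f_k = kmer_freqs(proteins, k)
    f_k1 = kmer_freqs(proteins, k - 1)
    f_k2 = kmer_freqs(proteins, k - 2)
    by_prefix = {}
    for word in f_k1:
        by_prefix.setdefault(word[:-1], []).append(word)
    vector = {}
    for left, p_left in f_k1.items():
        for right in by_prefix.get(left[1:], ()):
            word = left + right[-1]
            expected = p_left * f_k1[right] / f_k2[left[1:]]
            vector[word] = (f_k.get(word, 0.0) - expected) / expected
    return vector


def cv_distance(a, b):
    """Distance (1 - C) / 2, with C the cosine correlation of two vectors."""
    dot = sum(x * b.get(word, 0.0) for word, x in a.items())
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("composition vector has zero length")
    return (1 - dot / (norm_a * norm_b)) / 2


# Toy proteomes (hypothetical sequences).
proteome_a = ["MKVLAAGIVGLLLAQ", "MSTNPKPQRKTKRNT"]
proteome_b = ["MKKLLPTAAAGLLLLAAQPAMA"]
cv_a = composition_vector(proteome_a, 3)
cv_b = composition_vector(proteome_b, 3)
print(cv_distance(cv_a, cv_b))
```

A matrix of such pairwise distances is the kind of input from which a neighbor-joining tree can be built.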

Fig 3. COG analysis of B. endophyticus Hbe603 with other Bacillus species.

The information presented here corresponds to the original annotation and may change with future updates. The information is in accordance with the genome information provided in the corresponding NCBI .gbk files. COG designations are described as follows: C, Energy production and conversion; D, Cell division and chromosome partitioning; E, Amino acid transport and metabolism; F, Nucleotide transport and metabolism; G, Carbohydrate transport and metabolism; H, Coenzyme metabolism; I, Lipid metabolism; J, Translation, ribosomal structure and biogenesis; K, Transcription; L, DNA replication, recombination, and repair; M, Cell envelope biogenesis, outer membrane; N, Cell motility and secretion; O, Posttranslational modification, protein turnover, chaperones; P, Inorganic ion transport and metabolism; Q, Secondary metabolite biosynthesis, transport and catabolism; R, General function prediction only; S, Function unknown; T, Signal transduction mechanisms; U, Intracellular trafficking and secretion; V, Defense mechanisms.

Table 1. General features of the genome sequence of B. endophyticus Hbe603 and 2102.

Genetic analysis of B. endophyticus’ companion effect on K. vulgare

Genes related to the sporulation process.

Several researchers have indicated that the spore stability of Bacillus strains plays an important role in stimulating the propagation of K. vulgare and the accumulation of 2-KLG [9,42]. During spore formation, cells burst and release intracellular metabolites that significantly promote the growth of K. vulgare. Thus, we analyzed the genes related to the different sporulation stages to understand the sporulation process and the regulation mechanism of B. endophyticus Hbe603 (S3 Table). Current research on the process and mechanism of sporulation mainly focuses on the model strain B. subtilis. The lifecycle of B. subtilis is generally summarized in seven steps: vegetative growth (stages 0 and I), stage II, stage III, stage IV, stage V, spore maturation (stages VI and VII) and spore germination [43,44]. About 140 genes related to the sporulation cycle were identified by the genome annotation of B. endophyticus Hbe603, and most of them have a high similarity to those in B. subtilis. These data support the complete sporulation ability of B. endophyticus Hbe603. At the initial stage of spore formation, spo0H and spo0A encode related regulatory factors, which are capable of regulating cell growth and initiating spore formation [45]. The histidine kinases KinA, KinD and KinE [46,47] respond to environmental stimulation and then phosphorylate Spo0A to form a two-component sensing system until the spore formation process begins. In addition, the genes related to spore coat formation in B. endophyticus Hbe603 were compared with those in other Bacillus strains to analyze the properties of the spores. Among the genes related to the outer spore coat, B. endophyticus Hbe603 only has cotA and cotE, and lacks cotB, cotC, cotG, cotM, cotO, cotY and ytxO, which are annotated in B. subtilis. B. megaterium only has cotB and cotE, and many similarities exist between the two species with regard to the structure of the outer spore coat. Among the inner spore coat genes, B. endophyticus Hbe603 has cotD, cotJA, cotJB, cotJC, cotF, yutH, yaaH, yheC and yheD, and lacks cotH, ymaG, cotT, yxeE, yeeK and ysnD, which are annotated in B. subtilis. In addition, there are three operons, cgeAB, cgeCDE and spsABCDEFGHIJKL, which encode a glycosyl transferase in B. subtilis and participate in spore coat glycosylation [48]. B. thuringiensis lacks spsD, and B. cereus only has spsI, spsJ and spsK [49]. B. endophyticus Hbe603 and B. megaterium completely lack these three operons, and that deficiency may improve the hydrophobicity of spores and their gathering ability, thus enhancing the affinity between spores and nonspecific surfaces [50]. Because both B. endophyticus Hbe603 and B. megaterium lack these genes, the resulting spore characteristics may be beneficial in their synergistic interaction with K. vulgare.

Genes related to the regulation of transcription.

Compared to K. vulgare, companion Bacillus strains have a stronger ability to respond and adapt to environmental changes, in which the transcriptional regulation system plays an important role. B. endophyticus Hbe603 has nearly 300 genes related to regulation, including 17 sigma factor encoding genes (Table 2). As a general regulatory factor, sigma-B controls a large number of stress-responsive proteins. Previous research has reported two types of regulation mechanisms of sigma-B in Bacillus strains, i.e., that of B. subtilis [51] and that of B. cereus [49]. The genes related to the regulation of sigma-B in B. endophyticus Hbe603 are similar to those in B. subtilis. Under unstressed conditions, the anti-sigma factor RsbW binds directly to sigma-B, while the anti-anti-sigma factor RsbV is in the phosphorylated state and is unable to bind RsbW [52]. In addition, RsbU dephosphorylates RsbV and releases sigma-B to initiate its transcriptional activity at ambient state. Likewise, a series of cascade factors can regulate the activity of RsbU phosphorylation, such as RsbX, RsbT, RsbS and the RsbR family of proteins (RsbRA, RsbRB and RsbRD). However, we could not find in B. endophyticus Hbe603 the regulatory factor RsbP, which is responsible for the response to energy stress in B. subtilis [53]. The ECF (extracytoplasmic function) sigma factors can sense extracellular environmental stress and regulate the signal response. A total of seven related genes were detected in B. endophyticus Hbe603. Similarly, B. subtilis has seven genes, B. cereus has ten and B. thuringiensis has thirteen genes [49]. Among the seven sigma factors, we found two sigma-M factors, which can respond to high salt concentration and regulate the strain to adapt to high osmotic pressures in the environment [54]. Sigma-C, Sigma-V, Sigma-X and Sigma-W respond to temperature, lysozyme, iron and bacteriocin toxins, respectively. In addition to being important regulation factors, the Rap family proteins commonly exist in Bacillus strains and combine with the signal peptide Phr to form the Rap-Phr signal cascade system [55]. This signal cascade system responds to cell density and regulates the initiation of sporulation. B. subtilis contains eleven Rap-encoding genes and seven Phr-encoding genes, and the number of related genes is slightly lower in B. cereus and B. thuringiensis. Nonetheless, only one related protein, PhrA, was detected in B. endophyticus Hbe603, and it has a high similarity with that of Agrobacterium tumefaciens. Hence, B. endophyticus Hbe603 may contain other pathways to respond to cell density and to initiate spore formation. These characteristics might contribute to its specific communication pattern and its better companion ability.

Genes related to environmental resistance.

Previous research identified that reduced glutathione could significantly improve the growth of K. vulgare [56], and a proteomic analysis revealed its high demand for antioxidant protection [10]. B. endophyticus Hbe603 has a strong environmental resistance and relieves the stress of K. vulgare [9]. B. endophyticus Hbe603 contains a complete heat shock system, Clp, which is associated with high temperature tolerance. That system contains the chaperone ClpB, ATPase subunit ClpE [57], ClpP, ClpX [58], protein degradation subunits ClpY and ClpQ, and the CtsR global response protein [59]. Moreover, B. endophyticus Hbe603 has eight Na+/H+ antiporter related genes, the cluster mrpABCDEFG and nhaC. The mrp complex contains seven Na+/H+ antiporter subunits, which are associated with cell tolerance in alkaline environments. This complex responds to proton motive force in the cell membrane, where H+ is transported to the inside of the cells, and Na+ is pumped out [60]. The NhaC protein plays an important role in maintaining a stable pH environment, and it has a high similarity with that of Bacillus pseudofirmus OF4. This strain is an alkali resistant microorganism that can grow at pH values ranging from 7.5 to 11.4 [61]. In addition, the yhaU/khtT gene clusters were detected in B. endophyticus Hbe603; they encode K+/H+ antiporters that pump out K+ to maintain a stable pH in alkaline environments [61]. The ability of B. endophyticus Hbe603 to adapt to the alkaline environment of the industrial fermentation process might be related to the above mechanisms. Microorganisms also need to absorb large quantities of K+ to maintain an osmotic balance in an environment of high osmotic pressure. B. endophyticus Hbe603 has the complete Ktr system to perform this function, which includes the ktrAB, ktrC and ktrD operon [62]. Several studies have shown that B. megaterium increases the proline synthesis pathway in high salt conditions [63]. Accordingly, the proHJA gene cluster is present in the B. endophyticus Hbe603 genome and is able to complete the synthesis of proline. In addition, glycine betaine is an effective protective agent against osmotic pressure. Interestingly, B. endophyticus Hbe603 contains two copies of the glycine betaine synthetic enzymes GbsA and GbsB, two copies of the glycine betaine transporter OpuD, and two operons encoding the glycine betaine/choline ABC transporter. Based on this complex system, B. endophyticus Hbe603 could adapt to highly variable environments.

Genes related to the membrane transport system.

The metabolic cooperation in the K. vulgare-B. megaterium consortium has been investigated by cultivating them on the same soft agar plate [64]. We found that B. megaterium swarmed along the trace of K. vulgare on the agar plate. A metabolomics analysis has detected a number of metabolites exchanged between K. vulgare and the Bacillus strain [8], in which the membrane transport system plays an important role [65]. B. endophyticus Hbe603 contains 31 phosphotransferase system (PTS) related genes, which are used for carbohydrate transportation. That number of genes is greater than those in B. subtilis (25 genes), B. cereus (18 genes) and B. thuringiensis (20 genes) [49]. The phosphotransferase system of B. endophyticus Hbe603 includes three copies of the Crh catabolite repression protein (HPr-like protein) [66], the HPr protein PtsH [67] and the HPr kinase HprK [68]. Other proteins are included in the Glc, Lac, Fru, Man and other families (Table 3). Notably, B. endophyticus Hbe603 grows well on seed medium supplied with D-sorbitol (2%) as the sole source of carbon and energy (Fig 4). We annotated the D-sorbitol dehydrogenases and a glucitol/sorbitol-specific transport protein adjacent to them. Furthermore, a sorbose reductase is also annotated and has a high similarity with that of Candida albicans. We speculate that the reductase may react with D-sorbitol as well. As the substrate of vitamin C fermentation, D-sorbitol can be consumed by B. endophyticus, which may have an important influence on the final conversion rate. Hence, further research on these enzymes will be important to facilitate molecular modifications. Moreover, B. endophyticus Hbe603 contains almost 130 ABC transporter related proteins that are mainly used for transportation of peptides (15 proteins), amino acids (15 proteins), ions (35 proteins) and phosphate (8 proteins). In addition, we found 30 uncharacterized ABC transporters, which probably contribute to bacterial drug or antibiotic resistance [69].

Fig 4. Growth features of the B. endophyticus Hbe603 strain.

A) Growth curve of the B. endophyticus Hbe603 strain grown in seed medium with D-sorbitol (2%). The Y axis represents the average OD600 of triplicate bacterial cultures at each time point. B) Extracellular concentration of D-sorbitol. Data are averages of three independent experiments.

Table 3. Predicted genes related to PTS system in B. endophyticus Hbe603.

Proteins released into the extracellular environment.

A previous study found that two extracellular proteins of B. megaterium can promote cell growth and acid production of K. vulgare. Their molecular weights are 30–50 kDa and more than 100 kDa, respectively [70]. With the help of protein localization analysis, the proteins that B. endophyticus Hbe603 releases into the extracellular environment were predicted. In addition to the sporulation and flagellar related proteins, we found extracellular esterase, aminopeptidase and polysaccharide deacetylase, which can digest large molecular substances in the environment of K. vulgare. Additionally, two copies of superoxide dismutase were annotated, which can remove superoxide and protect K. vulgare from oxidative injury.

Genes related to nutrient synthesis.

Previously, the metabolic model of K. vulgare was constructed on a genome scale [71]. K. vulgare lacks genes for several pathways in central metabolism, amino acid metabolism, fatty acid metabolism and vitamin biosynthesis, which might impede its growth. Previous studies showed that the addition of L-cysteine to a flask culture of K. vulgare increased cell growth, 2-KLG titer and the intracellular level of coenzyme A by 25.6%, 35.8%, and 44.7%, respectively [72]. Moreover, the addition of L-glycine, L-proline, L-threonine, L-isoleucine and gelatine increased the 2-KLG productivity by 20.4%, 17.2%, 7.2%, 11.8% and 23.4%, respectively [73]. B. endophyticus Hbe603 has a relatively complete metabolic capacity for the supply of amino acids to K. vulgare, especially L-glycine, L-cysteine, L-methionine and L-tryptophan, which K. vulgare cannot synthesize by itself [74]. In addition, a previous study has shown that K. vulgare cannot synthesize many B vitamins by itself [74]. We found that B. endophyticus Hbe603 has vitamin synthesis pathways for B1, B2, B3, B5, B6, B7, B9 and B12, which could be supplied to K. vulgare. As one of the first biotechnological vitamin B12 producers, B. megaterium has two distinct operons and an isolated cbiP gene that together make up the whole vitamin B12 synthetic pathway [38,75]. B. endophyticus Hbe603 also has these two distinct operons, but they differ in where cbiP (also called cobQ) is inserted. The schematic of genes related to the synthesis of vitamin B12 in B. endophyticus Hbe603 was drawn with Easyfig [76] (Fig 5). Further studies will examine the effect of this genetic difference on vitamin B12 production, and B. endophyticus is expected to be a suitable engineered strain for the production of vitamin B12. Several cofactors are also supplied by Bacillus strains to K. vulgare in co-culture conditions [71], and we found numerous oxidoreductase-like proteins in B. endophyticus for the transfer of electrons generated in the cytoplasm. Five putative ferredoxins, two flavodoxins, ten thioredoxins, nine putative nitroreductases, four NADH:flavin oxidoreductases, and 16 quinol/ubiquinol oxidases were annotated in B. endophyticus. Overall, B. endophyticus Hbe603 has a relatively complete metabolic capacity for the supply of amino acids, vitamins and cofactors to K. vulgare.

Fig 5. Schematic representation of the genes related to vitamin B12 synthesis in B. endophyticus Hbe603 by Easyfig.

Blue bars represent the forward and reverse strands of DNA with CDSs marked as arrows. The scale is marked in base pairs. The red bars represent normal tblastx matches. Inverted matches are colored in blue, and the depth of shading is indicative of the percentage blast match.

The schematic of the companion mechanism of B. endophyticus in the K. vulgare-Bacillus strain consortium is presented in Fig 6. B. endophyticus Hbe603 has complex transcriptional regulation systems combined with its ability for spore formation and stress resistance. In addition, B. endophyticus Hbe603 has abundant ABC transporters and PTS related proteins for specific substrate transportation and communication with K. vulgare at a metabolic level. Likewise, the proteins that B. endophyticus Hbe603 releases into the extracellular environment may digest large molecular substances and remove superoxide for K. vulgare. During the sporulation process, B. endophyticus Hbe603 further releases abundant nutrients (amino acids, vitamins and cofactors) for the growth and the 2-KLG production of K. vulgare. B. endophyticus Hbe603 lacks the Rap-Phr signal cascade system and some spore coat related proteins. In contrast, B. endophyticus Hbe603 has specific pathways for vitamin B12 synthesis and sorbitol metabolism. Overall, B. endophyticus provides essential functions that K. vulgare lacks to reach its maximum growth rate and acts as an alternative source of environmental nutrients in the consortium.

Fig 6. Schematic representation of the interaction and communication between K. vulgare Y25 and B. endophyticus Hbe603 at a genome-wide scale.

The red dashed line represents the defective pathways in K. vulgare.


In summary, we report the chromosome sequence of B. endophyticus Hbe603 and its annotation, which provide a better-defined genetic background for studies of gene expression and regulation mechanisms, and especially for the construction of a genome-scale metabolic network. The comparative genome analysis identifies the species-specific characteristics of B. endophyticus Hbe603 with respect to other Bacillus strains. This genome analysis should substantially improve our understanding of the K. vulgare-Bacillus strain consortium and help identify more suitable companion strains in the future.

Supporting Information

S1 Fig. Schematic representing the prophages of B. endophyticus Hbe603.


S2 Fig. Comparisons of the sequence similarity between B. endophyticus Hbe603 and B. endophyticus 2102 with the Artemis Comparison Tool.


S1 Table. Genome annotation of B. endophyticus Hbe603.


S2 Table. COG category distribution of B. endophyticus Hbe603.


S3 Table. Predicted genes related to sporulation in B. endophyticus Hbe603.


Author Contributions

Conceived and designed the experiments: YJY NJ FG. Performed the experiments: NJ JD. Analyzed the data: FG NJ MZD. Contributed reagents/materials/analysis tools: FG YJY. Wrote the paper: NJ FG MZD YJY.


  1. 1. Yin GL, He JM, Ren SX, Song Q, Ye Q, Lin YH, et al. Production of vitamin C precursor-2-keto-L-gulonic acid from L-sorbose by a novel bacterial component system of SCB329-SCB933. J Ind Microbiol. 1997; 27: 1–7.
  2. 2. Takagi Y, Sugisawa T, Hoshino T. Continuous 2-keto-L-gulonic acid fermentation from L-sorbose by Ketogulonigenium vulgare DSM 4025. Appl Microbiol Biot. 2009; 82: 1049–1056.
  3. 3. Yang WC, Han LT, Wang ZY, Xu H. Two-helper-strain co-culture system: a novel method for enhancement of 2-keto-L-gulonic acid production. Biotechnol Lett. 2013; 35: 1853–1857. pmid:23881329
  4. 4. Reva ON, Smirnov VV, Pettersson B, Priest FG. Bacillus endophyticus sp. nov., isolated from the inner tissues of cotton plants (Gossypium sp.). IJSEM. 2002; 52: 101–107. pmid:11837291
  5. 5. Figueiredo MVB, Martinez CR, Burity HA, Chanway CP. Plant growth-promoting rhizobacteria for improving nodulation and nitrogen fixation in the common bean (Phaseolus vulgaris L.). World J Microbiol Biotechnol. 2008; 24: 1187–1193.
  6. 6. Prasad A, Rao KVB. Physicochemical analysis of textile effluent and decolorization of textile azo dye by Bacillus Endophyticus strain VITABR13. Environ Biotechnol. 2011; 2: 55–62.
  7. 7. Du J, Zhou J, Xue J, Song H, Yuan YJ. Metabolomic profiling elucidates community dynamics of the Ketogulonicigenium vulgare-Bacillus megaterium consortium. Metabolomics. 2012; 8: 960–973.
  8. 8. Ding MZ, Zou Y, Song H, Yuan YJ. Metabolomic analysis of cooperative adaptation between co-cultured Bacillus cereus and Ketogulonicigenium vulgare. PLOS ONE. 2014; 9(4): e94889. pmid:24728527
  9. 9. Ma Q, Zhou J, Zhang WW, Meng XX, Sun JW, Yuan YJ. Integrated proteomic and metabolomic analysis of an artificial microbial community for two-step production of vitamin C. PLOS ONE. 2011; 6(10): e26108. pmid:22016820
  10. 10. Ma Q, Zhang WW, Zhang L, Qiao B, Pan CS, Yi H, et al. Proteomic analysis of Ketogulonicigenium vulgare under glutathione reveals high demand for thiamin transport and antioxidant protection. PLOS ONE. 2012; 7(2): e32156. pmid:22384164
  11. 11. Tripp HJ, Bench SR, Turk KA, Foster RA, Desany BA, Niazi F, et al. Metabolic streamlining in an open-ocean nitrogen-fixing cyanobacterium. Nature. 2010; 464: 90–94. pmid:20173737
  12. 12. McInerney MJ, Rohlin L, Mouttaki H, Kim U, Krupp RS, Rios-Hernandez L, et al. The genome of Syntrophus aciditrophicus: life at the thermodynamic limit of microbial growth. PNAS. 2007; 104: 7600–7605. pmid:17442750
  13. 13. Lee Y-J, Lee S-J, Kim SH, Lee SJ, Kim B-C, Lee H-S, et al. Draft genome sequence of Bacillus endophyticus 2102. J Bacteriol. 2012; 194: 5705–5706. pmid:23012284
  14. 14. Chin CS, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods. 2013; 10(6): 563–569. pmid:23644548
  15. 15. Chaisson MJ, Tesler G. Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory. BMC Bioinformatics. 2012; 13: 238. pmid:22988817
  16. 16. Besemer J, Lomsadze A, Borodovsky M. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions. Nucleic Acids Res. 2001; 29(12): 2607–2618. pmid:11410670
  17. 17. Altschul SF, Gish W. Local alignment statistics. Methods Enzymol. 1996; 266: 460–480. pmid:8743700
  18. 18. Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, et al. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 2006; 34: D354–D357. pmid:16381885
  19. 19. Bairoch A, Apweiler R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 2000; 28: 45–48. pmid:10592178
  20. 20. Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003; 4: 41. pmid:12969510
  21. 21. Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997; 25: 0955–0964.
  22. 22. Lagesen K, Hallin P, Rødland EA, Stærfeldt HH, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007; 35(9): 3100–3108. pmid:17452365
  23. 23. Hua Z-G, Lin Y, Yuan Y-Z, Yang D-C, Wei W, Guo F-B. ZCURVE 3.0: identify prokaryotic genes with higher accuracy as well as automatically and accurately select essential genes. Nucl. Acids Res. 2015; 43 (W1): W85–W90. pmid:25977299
  24. 24. Luo H, Lin Y, Gao F, Zhang CT, Zhang R. DEG 10, an update of the Database of Essential Genes that includes both protein-coding genes and non-coding genomic elements. Nucleic Acids Res. 2014; 42, 574–580.
  25. 25. Nakai K, Horton P. PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem Sci. 1999; 24: 34–35. pmid:10087920
  26. 26. Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods. 2011; 8: 785–786. pmid:21959131
  27. 27. Gao F, Zhang CT. Ori-Finder: a web-based system for finding oriCs in unannotated bacterial genomes. BMC Bioinformatics. 2008; 9: 79. pmid:18237442
  28. 28. Xu Z, Hao BL. CVTree update: a newly designed phylogenetic study platform using composition vectors and whole genomes. Nucleic Acids Res. 2009; 37: W174–W178. pmid:19398429
  29. 29. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011; 28: 2731–2739. pmid:21546353
  30. 30. Gao F, Zhang CT. GC-Profile: a web-based tool for visualizing and analyzing the variation of GC content in genomic sequences. Nucleic Acids Res. 2006; 34: W686–W691. pmid:16845098
  31. 31. Stothard P, Wishart DS. Circular genome visualization and exploration using CGView. Bioinformatics. 2005; 21: 537–539. pmid:15479716
  32. 32. Carver TJ, Rutherford KM, Berriman M, Rajandream MA, Barrell BG, Parkhill J. ACT: the Artemis Comparison Tool. Bioinformatics. 2005; 21(16): 3422–3423. pmid:15976072
  33. 33. Zhou Y, Liang YJ, Lynch KH, Dennis JJ, Wishart DS. PHAST: A Fast Phage Search Tool. Nucleic Acids Res. 2011; 39(suppl 2): W347–W352.
  34. 34. Wang FY, Xiao JF, Pan LL, Yang M, Zhang GQ, Jin SG, et al. A systematic survey of mini-proteins in bacteria and archaea. PLOS ONE. 2008; 3(12): e4027. pmid:19107199
  35. 35. Li Y, Zhou B, Liu YP, Ceng S, Chen HQ, Chen Y, et al. Study on new strains in fermentation of Vitamin C. Wei sheng wu xue za zhi. 2001; 22: 26–27, 32.
  36. 36. Yuan ZY, Wei DZ, YlN GL, Yuan WK. Coimmobilization of Gluconobacter oxydans and Bacillus cereus for bioconversion of 2-keto-L-gulonic acid. Ann N Y Acad Sci. 1992; 672: 628–633.
  37. 37. Xu A, Yao J, Yu L, Lv S, Wang J, Yan B, et al. Mutation of Gluconobacter oxydans and Bacillus megaterium in a two-step process of L-ascorbic acid manufacture by ion beam. J Appl Microbiol. 2004; 96: 1317–1323. pmid:15139924
  38. 38. Eppinger M, Bunk B, Johns MA, Edirisinghe JN, Kutumbaka KK, Koenig SSK, et al. Genome sequences of the biotechnologically important Bacillus megaterium strains QM B1551 and DSM319. J Bacteriol. 2011; 193: 4199–4213. pmid:21705586
  39. 39. Ivanova N, Sorokin A, Anderson I, Galleron N, Candelon B, Kapatral V, et al. Genome sequence of Bacillus cereus and comparative analysis with Bacillus anthracis. Nature. 2003; 423: 87–91. pmid:12721630
  40. 40. Challacombe JF, Altherr MR, Xie G, Bhotika SS, Brown N, Bruce D, et al. The complete genome sequence of Bacillus thuringiensis Al Hakam. J Bacteriol. 2007; 189: 3680–3681. pmid:17337577
  41. 41. Liu LM, Li Y, Zhang J, Zou W, Zhou ZM, Liu J, et al. Complete genome sequence of the industrial strain Bacillus megaterium WSH-002. J Bacteriol. 2011; 193: 6389–6390. pmid:22038958
  42. 42. Zhu YB, Liu J, Du GC, Zhou JW, Chen J. Sporulation and spore stability of Bacillus megaterium enhance Ketogulonigenium vulgare propagation and 2-keto-L-gulonic acid biosynthesis. Bioresour Technol. 2012; 107: 399–404. pmid:22257860
  43. 43. Eichenberger P, Fujita M, Jensen ST, Conlon EM, Rudner DZ, Wang ST, et al. The program of gene transcription for a single differentiating cell type during sporulation in Bacillus subtilis. PLOS Biol. 2004; 2(10): e328. pmid:15383836
  44. 44. Paredes CJ, Alsaker KV, Papoutsakis ET. A comparative genomic view of clostridial sporulation and physiology. Nat Rev Microbiol. 2005; 3: 969–978. pmid:16261177
  45. 45. Britton RA, Eichenberger P, Gonzalez-Pastor JE, Fawcett P, Monson R, Losick R, et al. Genome-wide analysis of the stationary-phase sigma factor (sigma-H) regulon of Bacillus subtilis. J Bacteriol. 2002; 184: 4881–4890. pmid:12169614
  46. 46. Jiang M, Shao WL, Perego M, Hoch JA. Multiple histidine kinases regulate entry into stationary phase and sporulation in Bacillus subtilis. Mol Microbiol. 2000; 38: 535–542. pmid:11069677
  47. 47. LeDeaux JR, Yu N, Grossman AD. Different roles for KinA, KinB, and KinC in the initiation of sporulation in Bacillus subtilis. J Bacteriol. 1995; 177: 861–863. pmid:7836330
  48. 48. Driks A. Bacillus subtilis spore coat. Microbiol Mol Biol Rev. 1999; 63: 1–20. pmid:10066829
  49. 49. Anderson I, Sorokin A, Kapatral V, Reznik G, Bhattacharya A, Mikhailova N, et al. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis. FEMS Microbiol Lett. 2005; 250: 175–184. pmid:16099605
  50. 50. Rönner U, Husmark U, Henriksson A. Adhesion of bacillus spores in relation to hydrophobicity. J Appl Bacteriol. 1990; 69: 550–556. pmid:2292519
  51. 51. Voelker U, Voelker A, Maul B, Hecker M, Dufour A, Haldenwang WG. Separate mechanisms activate sigma B of Bacillus subtilis in response to environmental and metabolic stresses. J Bacteriol. 1995; 177: 3771–3780. pmid:7601843
  52. 52. Dufour A, Haldenwang WG. Interactions between a Bacillus subtilis anti-sigma factor (RsbW) and its antagonist (RsbV). J Bacteriol. 1994; 176: 1813–1820. pmid:8144446
  53. 53. Vijay K, Brody MS, Fredlund E, Price CW. A PP2C53. Vijay K, Brody MS, Fredlund E, Price CW. A PP2C phosphatase containing a PAS domain is required to convey signals of energy stress to the σB transcription factor of Bacillus subtilis. Mol Microbiol. 2000; 35: 180–188. pmid:10632888
  54. 54. Horsburgh MJ, Moir A. σM, an ECF RNA polymerase sigma factor of Bacillus subtilis 168, is essential for growth and survival in high concentrations of salt. Mol Microbiol. 1999; 32: 41–50. pmid:10216858
  55. 55. Pottathil M, Lazazzera BA. The extracellular Phr peptide-Rap phosphatase signaling circuit of Bacillus subtilis. Front Biosci. 2003; 8: d32–45. pmid:12456319
  56. 56. Zhou J, Yi H, Wang LL, Zhang WW, Yuan YJ. Metabolomic analysis of the positive effects on Ketogulonigenium vulgare growth and 2-keto-L-gulonic acid production by reduced glutathione. Omics. 2012; 16: 387–396. pmid:22734896
  57. 57. Derré I, Rapoport G, Devine K, Rose M, Msadek T. ClpE, a novel type of HSP100 ATPase, is part of the CtsR heat shock regulon of Bacillus subtilis. Mol Microbiol. 1999; 32: 581–593. pmid:10320580
  58. 58. Gerth U, Krüger E, Derré I, Msadek T, Hecker M. Stress induction of the Bacillus subtilis clpP gene encoding a homologue of the proteolytic component of the Clp protease and the involvement of ClpP and ClpX in stress tolerance. Mol Microbiol. 1998; 28: 787–802. pmid:9643546
  59. 59. Krüger E, Zühlke D, Witt E, Ludwig H, Hecker M. Clp-mediated proteolysis in Gram-positive bacteria is autoregulated by the stability of a repressor. EMBO. 2001; 20: 852–863.
  60. 60. Ito M, Guffanti AA, Oudega B, Krulwich TA. mrp, a multigene, multifunctional locus in Bacillus subtilis with roles in resistance to cholate and to Na+ and in pH homeostasis. J Bacteriol. 1999; 181: 2394–2402. pmid:10198001
  61. 61. Janto B, Ahmed A, Ito M, Liu J, Hicks DB, Pagni S, et al. Genome of alkaliphilic Bacillus pseudofirmus OF4 reveals adaptations that support the ability to grow in an external pH range from 7.5 to 11.4. Environ Microbiol. 2011; 13: 3289–3309. pmid:21951522
  62. 62. Holtmann G, Bakker EP, Uozumi N, Bremer E. KtrAB and KtrCD: two K+ uptake systems in Bacillus subtilis and their role in adaptation to hypertonicity. J Bacteriol. 2003; 185: 1289–1298. pmid:12562800
  63. 63. Bursy J, Pierik AJ, Pica N, Bremer E. Osmotically induced synthesis of the compatible solute hydroxyectoine is mediated by an evolutionarily conserved ectoine hydroxylase. J Biol Chem. 2007; 282: 31147–31155. pmid:17636255
  64. 64. Zhou J, Ma Q, Yi H, Wang LL, Song H, Yuan YJ. Metabolome profiling reveals metabolic cooperation between Bacillus megaterium and Ketogulonicigenium vulgare during induced swarm motility. Appl Environ Microbiol. 2011; 77: 7023–7030. pmid:21803889
  65. 65. Ma Q, Zou Y, Lv YJ, Song H, Yuan YJ. Comparative proteomic analysis of experimental evolution of the Bacillus cereus-Ketogulonicigenium vulgare co-culture. PlOS ONE. 2014; 9(3): e91789. pmid:24619085
  66. 66. Galinier A, Haiech J, Kilhoffer M-C, Jaquinod M, Stülke J, Deutscher J, et al. The Bacillus subtilis crh gene encodes a HPr-like protein involved in carbon catabolite repression. PNAS. 1997; 94: 8439–8444. pmid:9237995
  67. 67. Deutscher J, Reizer J, Fischer C, Galinier A, Saier MH, Steinmetz M. Loss of protein kinase-catalyzed phosphorylation of HPr, a phosphocarrier protein of the phosphotransferase system, by mutation of the ptsH gene confers catabolite repression resistance to several catabolic genes of Bacillus subtilis. J Bacteriol. 1994; 176: 3336–3344. pmid:8195089
  68. 68. Martin-Verstraete I, Deutscher J, Galinier A. Phosphorylation of HPr and Crh by HprK, early steps in the catabolite repression signalling pathway for the Bacillus subtilis levanase operon. J Bacteriol. 1999; 181: 2966–2969. pmid:10217795
  69. 69. van Veen HW, Callaghan R, Soceneantu L, Sardini A, Konings WN, Higgins CF. A bacterial antibiotic-resistance gene that complements the human multidrug-resistance P-glycoprotein gene. Nature. 1998; 391: 291–295. pmid:9440694
  70. 70. Feng S, Zhang Z, Zhang C. Effect of Bacillus megaterium on Gluconobacter oxydans in mixed culture. Ying Yong Sheng Tai Xue Bao. 2000; 11: 119–122. pmid:11766567
  71. 71. Ye C, Zou W, Xu N, Liu LM. Metabolic model reconstruction and analysis of an artificial microbial ecosystem for vitamin C production. J Biotechnol. 2014; 182: 61–67. pmid:24815194
  72. 72. Huang Z, Zou W, Liu J, Liu LM. Glutathione enhances 2-keto-L-gulonic acid production based on Ketogulonicigenium vulgare model iWZ663. J. Biotechnol. 2013; 164: 454–460. pmid:23376843
  73. 73. Liu LM, Chen KJ, Zhang J, Liu J, Chen J. Gelatin enhances 2-keto-L-gulonic acid production based on Ketogulonigenium vulgare genome annotation. J Biotechnol. 2011; 156: 182–187. pmid:21924300
  74. 74. Zhang J, Zhou JW, Liu J, Chen KJ, Liu LM, Chen J. Development of chemically defined media supporting high cell density growth of Ketogulonicigenium vulgare and Bacillus megaterium. Bioresour Technol. 2011; 102: 4807–4814. pmid:21296571
  75. 75. Wolf JB, Brey RN. Isolation and genetic characterizations of Bacillus megaterium cobalamin biosynthesis-deficient mutants. J Bacteriol. 1986; 166: 51–58. pmid:3082859
  76. 76. Sullivan MJ, Petty NK, Beatson SA. Easyfig: a genome comparison visualizer. Bioinformatics. 2011; 27(7): 1009–1010. pmid:21278367