Draft genome sequence of Actinotignum schaalii DSM 15541T: Genetic insights into the lifestyle, cell fitness and virulence

The permanent draft genome sequence of Actinotignum schaalii DSM 15541T is presented. The annotated genome includes 2,130,987 bp, with 1777 protein-coding and 58 RNA-coding genes. Genome sequence analysis revealed the absence of genes encoding components of the PTS systems and enzymes of the TCA cycle, glyoxylate shunt and gluconeogenesis. Genomic data revealed that A. schaalii is able to oxidize carbohydrates via glycolysis, the nonoxidative pentose phosphate and the Entner-Doudoroff pathways. In addition, the genome harbors genes encoding enzymes involved in the conversion of pyruvate to lactate, acetate and ethanol, which are found to be the end products of carbohydrate fermentation. The genome contains the gene encoding the Type I fatty acid synthase (FAS) required for de novo fatty acid biosynthesis. The plsY and plsX genes encoding the acyltransferases necessary for phosphatidic acid biosynthesis were absent from the genome. The genome harbors genes encoding enzymes responsible for isoprene biosynthesis via the mevalonate (MVA) pathway. Genes encoding enzymes that confer resistance to reactive oxygen species (ROS) were identified. In addition, A. schaalii harbors genes that protect the genome against viral infections. These include restriction-modification (RM) systems, type II toxin-antitoxin (TA), CRISPR-Cas and abortive infection systems. The A. schaalii genome also encodes several virulence factors that contribute to adhesion and internalization of this pathogen, such as the tad genes encoding proteins required for pili assembly, the nanI gene encoding exo-alpha-sialidase, genes encoding heat shock proteins and genes encoding the type VII secretion system. These features are consistent with anaerobic and pathogenic lifestyles. Finally, resistance to ciprofloxacin occurs by mutation in chromosomal genes that encode the subunits of DNA gyrase (GyrA) and topoisomerase IV (ParC) enzymes, while resistance to metronidazole was due to the frxA gene, which encodes NADPH-flavin oxidoreductase.


Genome annotation
Genes were identified using Prodigal [24], followed by a round of manual curation using GenePRIMP [25] for finished genomes and draft genomes in fewer than 10 scaffolds. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, KEGG, COG, and InterPro databases. The tRNAScanSE tool [26] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [27]. Other non-coding RNAs such as the RNA components of the protein secretion complex and the RNase P were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [28]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes (IMG) platform [29] developed by the Joint Genome Institute, Walnut Creek, CA, USA [30].

Phylogenetic analyses
Phylogenetic analyses of 16S rRNA gene sequences were performed using the ARB package [31]. Evolutionary distances were calculated using the method [32]. Phylogenetic trees were reconstructed using the neighbour-joining [33], maximum-likelihood (RAxML; [34]) and maximum-parsimony (ARB_PARS) methods as implemented in the ARB package. The topology of the neighbour-joining tree was evaluated using bootstrap analyses [35] based on 1000 resamplings. The sequence of the single 16S rRNA gene copy (1349 nucleotides) in the genome of A. schaalii DSM 15541T was added to the ARB database [31] and compared with the 16S rRNA gene sequences of the type strains of Actinotignum species obtained from the NCBI database. This sequence is identical to the previously published 16S rRNA sequence (AM922112).

Results and discussion

General genome features
The genome sequences of A. schaalii strains DSM 15541T (= CCUG 27420T) and FB123-CAN-2 are relatively similar in size. The genome of A. schaalii DSM 15541T is 2,130,987 bp in length, with an average G+C content of 62.25 mol% (Fig 1).
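The G+C content quoted above and the GC skew track shown in Fig 1 are simple base-composition statistics. The following is a minimal sketch of how such values can be computed from a nucleotide string; the function names, the window scheme and the toy sequence are illustrative assumptions and do not reproduce the IMG or OmicCircos calculations used for the genome.

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a DNA sequence as a percentage (mol%)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")  # ambiguous bases are ignored
    return 100.0 * gc / total if total else 0.0


def gc_skew(seq: str) -> float:
    """Return the GC skew, (G - C) / (G + C), of a DNA sequence."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed(seq: str, size: int, func):
    """Apply func to consecutive non-overlapping windows of the given size."""
    return [func(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


if __name__ == "__main__":
    toy = "GGCCGGATGCGCGTTAACCGG"  # hypothetical sequence, not genome data
    print(f"G+C content: {gc_content(toy):.2f} mol%")
    print("GC skew per 7-nt window:", windowed(toy, 7, gc_skew))
```

For a real genome the same functions would be applied to the concatenated scaffold sequences, with a window size chosen to suit the plot resolution.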
The origin of replication (oriC) was identified in a region (710 bp) located between the dnaA and the dnaN genes. The predicted oriC is flanked on one side by the dnaA (G444DRAFT_01293), dnaN (G444DRAFT_01294), recF (G444DRAFT_01295), gyrB (G444DRAFT_01298) and gyrA (G444DRAFT_01299) genes and on the other side by the parB (G444DRAFT_01288), parA (G444DRAFT_01289), gidB (G444DRAFT_01290), spoIIIJ (G444DRAFT_01291) and yidC (G444DRAFT_01292) genes. The genome contains 1835 predicted genes, of which 1777 (96.84%) are protein-coding genes and 58 (3.16%) are RNA-coding genes. Among the protein-coding genes, 1371 genes (74.71%) were assigned putative biological functions and the remaining 406 genes (22.13%) encode hypothetical proteins of unknown function. Most of the protein-coding genes, 1177 (64.14%), were assigned to Clusters of Orthologous Groups (COG). Of the 58 RNA genes, eight are rRNA genes (three 5S rRNA genes, four 23S rRNA genes and one 16S rRNA gene), forty-six are tRNA genes and four are other RNA genes. The genome contains two clustered regularly interspaced short palindromic repeat (CRISPR) loci. A summary of the genome properties and statistics is given in Table 1.

Fig 1. Features from the outer circle to the center are: genes on the forward strand (colored by COG categories), genes on the reverse strand (colored by COG categories), RNA genes (tRNA green, rRNA red, other RNAs black), % G+C content, GC skew (purple/olive) and codon adaptation index. The figure was obtained using OmicCircos [36].
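The percentages in the preceding paragraph are all expressed relative to the 1835 predicted genes. The short sketch below recomputes them from the counts given in the text and checks that the sub-counts add up; the variable and function names are illustrative, and only the numbers come from the paragraph above.

```python
TOTAL_GENES = 1835  # predicted genes reported in the text

# Gene counts as stated in the text.
counts = {
    "protein-coding": 1777,
    "RNA": 58,
    "putative function assigned": 1371,
    "hypothetical": 406,
    "assigned to COG": 1177,
}

# RNA gene breakdown as stated in the text.
rna_breakdown = {"rRNA": 8, "tRNA": 46, "other RNA": 4}
rrna_breakdown = {"5S": 3, "23S": 4, "16S": 1}


def percent_of_total(count: int, total: int = TOTAL_GENES) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


def counts_are_consistent() -> bool:
    """Check that the reported sub-counts sum to their parent categories."""
    return (
        counts["protein-coding"] + counts["RNA"] == TOTAL_GENES
        and counts["putative function assigned"] + counts["hypothetical"]
        == counts["protein-coding"]
        and sum(rna_breakdown.values()) == counts["RNA"]
        and sum(rrna_breakdown.values()) == rna_breakdown["rRNA"]
    )


if __name__ == "__main__":
    for name, n in counts.items():
        print(f"{name}: {n} ({percent_of_total(n)}%)")
    print("sub-counts consistent:", counts_are_consistent())
```

Note that the functional-annotation percentages (74.71% and 22.13%) use all 1835 genes as the denominator rather than the 1777 protein-coding genes.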

Taxonomy and phylogeny
A tree constructed using the neighbour-joining method and depicting the phylogenetic position of A. schaalii is shown in Fig 2. The topology of the resultant tree was similar to the topology of the trees constructed by the maximum-likelihood and maximum-parsimony algorithms. All the dendrograms were similar in that Actinotignum species formed a well separated clade (100% bootstrap value) within the radiation of the family Actinomycetaceae. In each of the dendrograms calculated, A. schaalii and A. sanguinis are phylogenetic neighbours. On the basis of 16S rRNA gene sequence similarity, A. schaalii was most closely related to A. sanguinis (98.5% sequence similarity). The phylogenetic relatedness and the high degree of 16S rRNA gene sequence similarity between A. schaalii and A. sanguinis are not unexpected, as they are ecologically similar. They reside as commensal microbiota of the human urinary tract and have the potential to cause a wide spectrum of diseases in humans such as urinary tract infections and bacteremia.
It should be noted that in all trees Actinotignum appeared as a sister clade of the Actinobaculum lineage, a situation already depicted in the original description of the genus Actinotignum [1]. Apart from the lower 16S rRNA gene sequence similarities (less than 93.2%) and differences in the chemotaxonomic properties, members of the two genera are ecologically distinct. Actinobaculum suis is a resident of the urinary tract of sows and has never been isolated from human specimens. Although A. massiliense and A. suis share the same clade in the phylogenetic tree, the former species is restricted to the human urinary tract and still awaits taxonomic revision. However, both species are distinguishable (at this time) on the basis of the G+C content (A. massiliense is 60.1 mol% compared with 57.8 mol% for A. suis) and habitat.
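The 16S rRNA gene sequence similarity values quoted in this section are pairwise comparisons of aligned sequences. As a minimal sketch, a percent-identity value can be computed from two pre-aligned sequences as shown below. Skipping gap columns and the function name are assumptions made for illustration; the ARB package applies its own alignment filters and distance corrections, which are not reproduced here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap ('-') are skipped; this is a
    simplifying assumption, not the ARB procedure.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable (non-gap) columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not real 16S rRNA data.
    print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))
```

For scale, if the 98.5% similarity were calculated over all 1349 nucleotides of the 16S rRNA gene (the text does not state the compared length), it would correspond to roughly 20 differing positions.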

Carbohydrate transport and metabolism
Sugar transport systems. A bioinformatic reconstruction of the central carbon metabolism of A. schaalii DSM 15541T revealed the absence of genes coding for components of the phosphoenolpyruvate:phosphotransferase systems (PTSs). A single homologue of a β-glucosidic-specific IIC permease (G444DRAFT_01249) was found. This IIC permease cannot function via a PTS-dependent mechanism since the organism lacks all other PTS protein homologues, including EI and HPr. A similar case was reported in Bacteroides thetaiotaomicron, which encodes a single homologue of a galactitol IIC protein and lacks PTS proteins completely [37]. A. schaalii also lacks a complete dihydroxyacetone PTS (DHA PTS). The genome harbors the dhaK and dhaL genes encoding the DhaK (G444DRAFT_00672) and DhaL (G444DRAFT_00673) proteins, respectively, but lacks the dhaM gene encoding a DhaM homologue. Like Methylococcus capsulatus, which has DhaK and DhaL but no other PTS protein [37], it seems likely that A. schaalii cannot phosphorylate DHA with either PEP or ATP. On the other hand, genome sequence analysis revealed the presence of genes predicted to encode primary sugar transporters of the ATP-binding cassette (ABC) superfamily and secondary transporters of the major facilitator superfamily (MFS). A characteristic feature of ABC transporters is that the genes for the three components (the ATP-binding protein, the membrane protein and the substrate-binding protein) frequently form an operon [38]. Two families within the ABC superfamily concerned exclusively with carbohydrate uptake are present in the genome of A. schaalii. The carbohydrate uptake transporter-1 (CUT1) and -2 (CUT2) families exhibit specificity for mono-/di-/oligosaccharides and monosaccharides, respectively [39]. Three CUT1 (TC 3.A.1.1.-) transporters were found in the A. schaalii genome. These include homologues of the structural genes malEFGK encoding the four essential constituents of the maltose transporter MalEFGK involved in maltose transport (S1 Fig) and homologues of the structural genes msmEFGK encoding subunits of the multiple-sugar metabolism transporter MsmEFGK involved in the uptake and metabolism of disaccharides and/or oligosaccharides (S2 Fig). Although the malEFGK and the msmEFGK operons lack the genes for the ATP-binding domain (malK and msmK), the A. schaalii genome contains two paralogues of the msiK gene, which encode the ATP-binding components MsiK (G444DRAFT_01432 and G444DRAFT_01700) that share 60% identity with MsiK from Streptomyces species. Previous studies showed that MsiK functions as a universal ATPase that assists several ABC transporters [40]. The third CUT1 transporter consists of three open reading frames encoding components of a putative multiple sugar transporter (G444DRAFT_01772 to G444DRAFT_01774). It is not known if this transporter has specificity for a particular carbohydrate. The substrate-binding protein (G444DRAFT_01774) of this transporter shares 33% and 26% identities with homologues spr1534, Rv2041c and SCO6601 from Streptococcus pneumoniae R6, Mycobacterium tuberculosis H37Rv and Streptomyces coelicolor A3(2), respectively, which are annotated as periplasmic-binding components of ABC transport systems specific for trehalose/maltose and similar oligosaccharides. However, the genes (G444DRAFT_01772, G444DRAFT_01773 and G444DRAFT_01774) lie upstream of the gal operon, which encodes enzymes of the Leloir pathway for galactose metabolism (S3 Fig), and immediately downstream of the aga gene encoding α-galactosidase GalA (G444DRAFT_01775), suggesting that this A. schaalii ABC transporter likely transports α-galactosides and/or other galactose-containing oligosaccharides.
The three CUT2 (TC 3.A.1.2.-) transporters present in A. schaalii include homologues of the ribose ABC transporter RbsBAC (S4 Fig) and two transporters [(G444DRAFT_01074 to G444DRAFT_01076) and (G444DRAFT_00668 to G444DRAFT_00671)] annotated as multiple sugar transport and simple sugar transport systems, respectively. The CUT2 permease and substrate-binding proteins encoded by the genes (G444DRAFT_01074 to G444DRAFT_01076; corresponding to cheV, gguA, gguB) are found associated with the xylBAF genes, whose functions are related to the catabolism of xylose, suggesting a possible candidate for a xylose ABC transporter (S5 Fig). The third CUT2 permease and substrate-binding proteins encoded by the genes (G444DRAFT_00668 to G444DRAFT_00671) are found associated with the araBDA genes, whose functions are related to the catabolism of L-arabinose, suggesting a possible candidate for an L-arabinose ABC transporter (S6 Fig).
Secondary transporters of the major facilitator superfamily (MFS) catalyze the active uptake of solutes in response to chemiosmotic gradients [41]. MFS transporters are integral membrane proteins and usually consist of a single polypeptide chain. Among the MFS transporters rec- . This observation suggests that A. schaalii contains two kinetically distinguishable systems for L-arabinose import: the AraE L-arabinose:H+ symporter (MFS transporter) and the ATP-driven system encoded by the previously mentioned genes (G444DRAFT_00668 to G444DRAFT_00671).
The two genes (G444DRAFT_00703) and (G444DRAFT_01089) encode sugar porters which display 32% and 41%-44% identities with the glucose transporter GlcP (SCO5578) from Streptomyces coelicolor A3(2) and (sll0771) from Synechocystis sp. PCC 6803, respectively. The gene (G444DRAFT_01316), which is annotated as a minor myo-inositol:H+ symporter (IolF), is clustered with the rhamnose utilization genes rhaDBAMR (S7 Fig) and therefore appears likely to function as an L-rhamnose-proton symporter (RhaT). The A. schaalii genome also contains two paralogous genes encoding the sn-glycerol-3-phosphate transporter GlpT (G444DRAFT_00967 and G444DRAFT_00968), a member of the OPA family. GlpT mediates the translocation of glycerol-3-phosphate, which serves both as a carbon and energy source and as a precursor for phospholipid biosynthesis. Furthermore, A. schaalii has one togT gene, which encodes a protein (G444DRAFT_00165) homologous to TogT (TC 2.A.2.5.1), a member of the GPH transporter family, which mediates the uptake of oligogalacturonides. Moreover, the A. schaalii genome contains two paralogues (G444DRAFT_00245 and G444DRAFT_01505) encoding the oxalate/formate antiporter (OxlT), a member of the OFA family of proteins. OxlT functions as an anion exchange carrier during the one-for-one antiport of divalent oxalate and monovalent formate [42]. This observation points to a dependency of A. schaalii on organic acids as sources of carbon.
Central carbohydrate metabolic pathways. A. schaalii has several features of anaerobic bacteria, as predicted from the genome sequence. The organism can potentially metabolize glucose to triose via three different pathways: glycolysis [the Embden-Meyerhof-Parnas (EMP) pathway], the Entner-Doudoroff (ED) pathway and the pentose phosphate pathway (S8 Fig). The A. schaalii genome contains all genes encoding the enzymes necessary for the oxidation of carbohydrates to pyruvate via the EMP pathway. The genes coding for the key enzymes fructose-1,6-bisphosphatase (FBP, EC 3.1.3.11) and phosphoenolpyruvate carboxykinase (PEPCK, EC 4.1.1.49), necessary to direct carbon through gluconeogenesis, are absent from the genome, suggesting that A. schaalii cannot perform gluconeogenesis. The oxidative branch of the pentose phosphate pathway (OPPP) seems to be incomplete since the pgl gene encoding 6-phosphogluconolactonase 6PGL (EC 3.1.1.31) is missing from the genome. In contrast, orthologous genes encoding enzymes for all steps of the nonoxidative branch of the pentose phosphate pathway are present in the A. schaalii genome.
A search for genes encoding the key enzymes of the ED pathway revealed the presence of the eda gene encoding 2-keto-3-deoxygluconate-6-phosphate aldolase (EDA) but the absence of the edd gene encoding 6-phosphogluconate dehydratase (EDD), thus disabling the ED pathway. Close inspection of the genome, however, revealed the presence of two genes, ilvD and (G444DRAFT_01081), predicted to encode dihydroxyacid dehydratase (DHAD) (G444DRAFT_01605; EC 4.2.1.9) and a putative hydratase (G444DRAFT_01081) of the YjhG/YagF family, respectively. Both dehydratases belong to the ILVD_EDD protein superfamily (pfam00920), members of which have been shown to be evolutionarily related [45,46]. Moreover, Kim and Lee [47] reported that DHAD exhibits substrate promiscuity, with a high activity toward D-gluconate and some other pentonic and hexonic sugar acids, and can catalyze the conversion of these sugar acids to 2-keto-3-deoxy analogs through a similar dehydration reaction. Therefore, we assume that the putative DHAD (G444DRAFT_01081), which is located adjacent to a gluconate:H+ symporter (G444DRAFT_01083), will compensate for the missing EDD enzyme in A. schaalii. Furthermore, analysis of genomic data revealed the presence of one copy of the gnl gene encoding gluconolactonase (EC 3.1.1.17), which catalyzes the conversion of gluconolactone to gluconic acid, and three copies of the kdgK gene encoding 2-dehydro-3-deoxygluconokinase (EC 2.7.1.45), which catalyzes the phosphorylation of 2-keto-3-deoxygluconate (KDG) to 2-keto-3-deoxy-6-phosphogluconate (KDPG). The presence of the eda, (G444DRAFT_01081), gnl and kdgK genes in addition to gdh provides convincing evidence for the operation of a semiphosphorylative ED pathway in A. schaalii. A semiphosphorylative ED pathway was reported in halophilic archaea [48]. The schematic reactions of this pathway from glucose to pyruvate are presented in Fig 3.
Further experimental approaches with isotopically labeled tracers such as 13C-labeled carbon sources are needed to elucidate pathway operation in A. schaalii.
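The inference drawn above is essentially a presence/absence check: a route is considered potentially operable only when a gene is found for every step. The sketch below illustrates that reasoning using the genes and enzymes named in the text. The step lists are a simplified paraphrase of the text, the label "ILVD_EDD-family dehydratase" and all identifiers are illustrative assumptions, and gene presence does not establish that a pathway is active.

```python
# Genes or enzymes reported as present in the text (labels follow the text).
DETECTED = {"eda", "gdh", "gnl", "kdgK", "ILVD_EDD-family dehydratase"}

# Steps required for each route, paraphrased from the text. These sets are
# illustrative and are not authoritative pathway definitions.
PATHWAYS = {
    "classical ED": {"edd", "eda"},
    "semiphosphorylative ED": {
        "gdh",
        "gnl",
        "ILVD_EDD-family dehydratase",
        "kdgK",
        "eda",
    },
    "gluconeogenesis (key enzymes)": {"FBP", "PEPCK"},
}


def missing_steps(pathway: str, detected=DETECTED) -> set:
    """Return the required steps of a pathway with no detected gene."""
    return PATHWAYS[pathway] - set(detected)


def is_complete(pathway: str, detected=DETECTED) -> bool:
    """A pathway counts as complete when no required step is missing."""
    return not missing_steps(pathway, detected)


if __name__ == "__main__":
    for name in PATHWAYS:
        gaps = sorted(missing_steps(name))
        status = "complete" if not gaps else "missing " + ", ".join(gaps)
        print(f"{name}: {status}")
```

Under these assumptions the classical ED route lacks only edd, whereas every listed step of the semiphosphorylative route has a candidate gene, which mirrors the argument made in the text.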
Fate of pyruvate. Pyruvate generated by the EMP and the ED pathways may be fermented to lactate, acetate, formate and ethanol. The genes found in the A. schaalii genome suggest a pathway for pyruvate fermentation as shown in Fig 4. In this predicted pathway, pyruvate is oxidatively decarboxylated by pyruvate dehydrogenase PDH (EC 1.2.4.1) and/or cleaved by pyruvate formate lyase PFL (EC 2.3.1.54), resulting in the formation of acetyl-CoA and CO2 or formate, respectively. Acetyl-CoA is then either converted to acetaldehyde by the acetaldehyde dehydrogenase domain of AdhE (EC 1.2.1.10/1.1.1.1), which is then reduced to ethanol by the alcohol dehydrogenase domain of AdhE, or alternatively converted to acetate with the generation of ATP in a two-stage reaction catalyzed by phosphate acetyltransferase PTA (EC 2.3.1.8) and acetate kinase AK (EC 2.7.2.1). Formate is converted to H2 and CO2 by the action of formate dehydrogenase FDH (EC 1.2.1.2). In agreement with the predicted pathway, we identified the following putative genes and their enzyme products: two copies of the pflD gene encoding pyruvate-formate lyase PFL (G444DRAFT_00375 and G444DRAFT_01725), the pta gene encoding phosphate acetyltransferase PTA (G444DRAFT_00377), the ackA gene encoding acetate kinase AK (G444DRAFT_00378), the adhE gene encoding the bifunctional acetaldehyde-CoA/alcohol dehydrogenase AdhE (G444DRAFT_01460) and the fdh gene encoding formate dehydrogenase (G444DRAFT_01038). The predicted pathway should be experimentally elucidated, e.g., by using targeted mutagenesis.
Finally, the conversion of pyruvate to lactate is coupled to NADH oxidation and is catalyzed by lactate dehydrogenase LDH (EC 1.1.1.27) encoded by the ldh gene (G444DRAFT_01384). In addition, we identified a gene cluster (G444DRAFT_01666 to G444DRAFT_01669) encoding components of the lactate utilization machinery. This includes an lctP gene encoding lactate permease (G444DRAFT_01666) and three genes, lldEFG, encoding components of a predicted L-lactate dehydrogenase complex LldEFG (G444DRAFT_01667, G444DRAFT_01668 and G444DRAFT_01669). These data suggest that A. schaalii can use L-lactate as a sole source of carbon and energy.
Fermentation and energy conservation. Genome sequence analysis indicated that A. schaalii is incapable of respiratory metabolism, either aerobically or anaerobically. Except for genes encoding cytochrome bd-type quinol oxidase and genes encoding the subunits of the proton-driven F0F1-ATPase, we did not find the genes required for a complete electron transport chain that might be associated with aerobic or anaerobic respiration. The genome harbors the cyd operon encoding proteins required for the production of intact cytochrome bd-type quinol oxidase. The cyd operon consists of four genes: the cydAB genes encode the structural proteins CydA (G444DRAFT_00230) and CydB (G444DRAFT_00231), and the cydCD genes encode an ABC transporter, CydD (G444DRAFT_00232) and CydC (G444DRAFT_00233), essential for the assembly of cytochrome bd. The expression of the cydAB operon is controlled by the two global transcriptional regulators ArcA/ArcB (G444DRAFT_00487/G444DRAFT_00486) and FNR (G444DRAFT_01491) encoded by the arcA, arcB and fnr genes, respectively [51]. Although anaerobic bacteria have no need of a respiratory cytochrome oxidase, a functional cytochrome bd was found in, e.g., Bacteroides fragilis, Desulfovibrio gigas and Moorella thermoacetica [52,53,54]. Cytochrome bd generates a PMF by transmembrane charge separation, but does so without being a "proton pump" [55]. Apart from PMF generation, cytochrome bd endows bacteria with a number of important physiological functions. Cytochrome bd enables both pathogenic and commensal bacteria to colonize O2-poor environments [52] and serves as an O2 scavenger to inhibit degradation of O2-sensitive enzymes [56,57]. The bd-type cytochrome, with its high oxygen affinity, is able to scavenge oxygen to such an extent that it provides resistance against oxidative stress [58].
Fig 3. Schematic reactions of the semiphosphorylative ED pathway from glucose to pyruvate in A. schaalii. This pathway involves either oxidation of glucose by the membrane-bound glucose dehydrogenase (gdh) to form glucono-1,5-lactone, which is then converted to gluconate by gluconolactonase, or uptake of gluconate by the cell via a putative gluconate permease (GntP). Gluconate is then converted to 2-keto-3-deoxygluconate (KDG) by a specific gluconate dehydratase (ILVD_EDD). Further metabolism of KDG involves its phosphorylation by KDG kinase to form KDPG, followed by cleavage by EDA to pyruvate and glyceraldehyde-3-phosphate. Glyceraldehyde-3-phosphate is further converted to form another pyruvate molecule via the common reactions of the EMP pathway. A similar modified ED pathway has been shown to occur in several Clostridium species, e.g. Clostridium aceticum [49], and halophilic archaea, e.g. Halobacterium saccharovorum [50]. Abbreviations: gdh, glucose-1-dehydrogenase; gnl, gluconolactonase; ilvD/EDD, dihydroxyacid dehydratase; KDGK, 2-dehydro-3-deoxygluconokinase; EDA, 2-dehydro-3-deoxyphosphogluconate aldolase; GAP, glyceraldehyde 3-phosphate dehydrogenase; PGM, phosphoglycerate mutase; ENO, enolase; PYK, pyruvate kinase.
https://doi.org/10.1371/journal.pone.0188914.g003

Another essential component of respiratory chains is the menaquinones, which deliver electrons and protons between dehydrogenases and cytochromes. Genomic data concerning the biosynthesis of menaquinones (MK) in A. schaalii are controversial. Despite clear chemotaxonomic evidence for the absence of menaquinones in all members of the genus Actinotignum including A. schaalii [1], bioinformatic analyses revealed the presence of a complete menaquinone biosynthesis pathway in the genome of A. schaalii DSM 15541T, enabled by a pentacistronic operon menBCDEF (G444DRAFT_00407 to G444DRAFT_00411) and three separately located genes, menA (G444DRAFT_00984), ubiE (G444DRAFT_00986) and a possible menH (G444DRAFT_01490). Paradoxically, however, the lack of menDEF and menBCDEF orthologs in the genome sequences of A. schaalii CCUG 27420T and A. schaalii strain FB123-CAN-2, respectively, indicates an incomplete menaquinone biosynthesis pathway in these two strains.
Given the lack of a TCA cycle, subunits of the reduced form of NADH dehydrogenase, and most other electron-transport chain complexes including menaquinone, we infer a strictly anaerobic fermentation-based lifestyle. A. schaalii is predicted to produce lactate, acetate and ethanol as fermentation end products. It catabolizes glucose via the EMP pathway to pyruvate and converts part of the pyruvate to lactate by lactate dehydrogenase, thereby reoxidizing NADH produced during glycolysis to NAD+, while the other part of the pyruvate is cleaved to acetyl-CoA and formate by pyruvate-formate lyase (PFL). Half of the acetyl-CoA produced is converted to acetate via acetyl phosphate by two enzymes (phosphate acetyltransferase and acetate kinase), generating ATP by substrate-level phosphorylation (SLP), while the second half is reduced in two steps to ethanol with the oxidation of two NADH molecules to NAD+. The overall energy yield is three molecules of ATP per glucose molecule. This ATP is used for the synthesis of cellular macromolecules and other energy-requiring processes in the cell and for the generation and maintenance of a proton-motive force (PMF) by the membrane-bound F0F1-ATPase. The genome of A. schaalii contains an atp operon encoding the subunits of the proton-driven F0F1-ATPase. This operon consists of eight genes (G444DRAFT_01552 to G444DRAFT_01559) encoding components of the membrane-intrinsic F0 proton-channeling part (a, b, c subunits) and the membrane-extrinsic F1 catalytic part (α, β, γ, δ, ε). The membrane-bound H+ F0F1-ATPase serves as a major regulator of intracellular pH by extruding protons from the cell at the expense of ATP [59,60]. However, the physiological role of the H+ F0F1-ATPase in A. schaalii should be established by genetic and biochemical means.
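The ATP yield stated above can be followed with a small stoichiometric bookkeeping sketch. It assumes standard textbook values that are not given in the text: the EMP pathway yields a net 2 ATP and 2 NADH per glucose, acetyl-CoA is formed by PFL without NADH production, each acetate formed yields 1 ATP, each lactate consumes 1 NADH and each ethanol consumes 2 NADH. The PDH route and the fate of formate are ignored, and the function and class names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Balance:
    atp: float   # net ATP per glucose from substrate-level phosphorylation
    nadh: float  # NADH left unoxidized per glucose; 0 means redox-balanced


def fermentation_balance(lactate: float, acetate: float, ethanol: float) -> Balance:
    """ATP and NADH balance per glucose for a given split of pyruvate.

    Arguments are moles of pyruvate (per mole of glucose) routed to each end
    product, so they must sum to 2. Stoichiometry follows the textbook
    assumptions listed in the lead-in, not measurements on A. schaalii.
    """
    if abs(lactate + acetate + ethanol - 2.0) > 1e-9:
        raise ValueError("the three fluxes must sum to 2 pyruvate per glucose")
    atp = 2.0 + acetate                     # glycolysis + acetate kinase step
    nadh = 2.0 - lactate - 2.0 * ethanol    # produced minus reoxidized
    return Balance(atp=atp, nadh=nadh)


if __name__ == "__main__":
    # All pyruvate via PFL, acetyl-CoA split equally between acetate and ethanol.
    print(fermentation_balance(lactate=0.0, acetate=1.0, ethanol=1.0))
    # All pyruvate reduced to lactate.
    print(fermentation_balance(lactate=2.0, acetate=0.0, ethanol=0.0))
```

Under these assumptions, the figure of three ATP per glucose corresponds to the case in which both pyruvate molecules pass through PFL and the acetyl-CoA is divided equally between acetate and ethanol; any pyruvate diverted to lactate lowers the yield toward two ATP per glucose while keeping the redox balance closed.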
Correlations between genotype and phenotype. As previously mentioned, A. schaalii is capable of utilizing a wide range of carbon sources, including pentose sugars (arabinose, ribose and xylose), hexose sugars (glucose), and disaccharides (maltose and sucrose). Genes corresponding to these features were found not only in the genome of A. schaalii DSM 15541T but also in the genomes of A. schaalii CCUG 27420T and A. schaalii FB123-CAN-2.
L-arabinose metabolism in A. schaalii is achieved by three L-arabinose-catabolizing genes, araA, araB and araD, which comprise the araBDA operon. These genes encode three intracellular enzymes for arabinose catabolism: arabinose isomerase, ribulokinase and ribulose-5-phosphate epimerase, respectively. Upstream of the araBDA operon are the araR and araE genes, which are oriented in the opposite direction (S6 Fig). The araE gene encodes a proton symporter involved in the transport of arabinose into the cell.
The genes responsible for transport and utilization of D-ribose cluster together to form the rbsBACR operon. The rbsBAC genes encode the RbsB, RbsA and RbsC subunits of the ribose ABC transporter (S4 Fig). Xylose utilization in A. schaalii is mediated by the xylose catabolic enzymes xylose isomerase and xylulose kinase, encoded by the xylA and xylB genes, respectively, of the xyl operon (S5 Fig). A. schaalii possesses a PTS-independent pathway for glucose utilization. Glucose is imported via a non-PTS permease and phosphorylated by glucokinase (Glk). Uptake and catabolism of the disaccharide maltose is mediated by the maltose/maltodextrin ABC transporter MalEFG (S1 Fig). Genotype to phenotype correlations indicate that sucrose is metabolized via a non-PTS system, which consists of a sucrose hydrolase or invertase enzyme SacA (EC 3.2.1.26), a fructokinase ScrK (EC 2.7.1.4) and an as yet unidentified permease.
In addition, other genes associated with carbohydrate metabolism were found in the genome: (i) a rhaADB operon and rhaM gene whose annotation suggests that they encode all enzymes involved in rhamnose metabolism (S7 Fig); (ii) a galRKTEM operon encoding enzymes responsible for galactose catabolism (S3 Fig) and (iii) the lldEFG genes involved in D- and L-lactate metabolism. These findings indicate that A. schaalii strains have the potential to metabolize rhamnose, galactose and lactate. However, the gene-phenotype correlation should be experimentally validated.

Lipid metabolism
A. schaalii is able to synthesize fatty acids as well as other major lipid classes such as phospholipids, isoprenoids and glycolipids.
Fatty acids (FAs) biosynthesis. Analysis of the cellular FA profiles showed that strains of Actinotignum schaalii contain predominantly saturated and monounsaturated straight-chain FAs [1]. These include C18:1 ω9c (> 50.0% of total FA content), C16:0 (> 13.0%) and C18 ). However, an exhaustive bioinformatic search of the genome failed to identify genes encoding enoyl-CoA hydratase and 3-hydroxyacyl-CoA dehydrogenase, indicating an incomplete β-oxidation cycle. These genomic data suggest that A. schaalii cannot use exogenously supplied fatty acids as energy sources. This suggestion should be confirmed experimentally, e.g., by enzymatic assays.
Phospholipids biosynthesis. The most abundant phospholipids found in the cell membrane of A. schaalii comprise cardiolipin (CL), phosphatidylglycerol (PG), phosphatidylinositol (PI) and phosphatidylinositol monomannoside (PIM1); this composition is similar to that found in many other gram-positive bacteria. In addition, an unknown choline-containing phosphoglycolipid (AbGL) has been identified [1]. These data are in line with protein data predicted from genome sequence analysis.
All the genes (except the plsX and plsY genes) encoding enzymes necessary for de novo biosynthesis of phospholipids were identified in the A. schaalii genome. Phosphatidic acid (PA), the key phospholipid synthetic intermediate in prokaryotes, is generated from sn-glycerol-3-phosphate (G3P) by the consecutive acylation of the sn-1 carbon followed by the sn-2 carbon of G3P, reactions catalyzed by the PlsX/PlsY and PlsC acyltransferases, respectively. Bioinformatic analysis showed that A. schaalii has two plsC genes encoding 1-acyl-sn-glycerol-3-phosphate acyltransferase PlsC (G444DRAFT_00646 and G444DRAFT_01258; [EC 2.3.1.51]). Neither the plsX nor the plsY gene, encoding PlsX and PlsY homologues, respectively, was found in the genome. This raises the intriguing possibility that an as yet unidentified enzyme participating in PA biosynthesis may be present in A. schaalii.
Furthermore, genes encoding enzymes involved in the mannosylation of the inositol residue of PI for the synthesis of glycolipids (PIMs) were identified in the genome. These include genes predicted to encode enzymes responsible for the synthesis of GDP-mannose, the activated form of mannose required for mannosylation processes, and genes predicted to encode glycosyltransferases required to incorporate the activated mannose into PI. Two of the three key genes responsible for GDP-mannose synthesis, manA, manB and manC, are present in the genome: manA encoding mannose-6-phosphate isomerase PMI (G444DRAFT_00639; [EC 5.3.1.8]), which catalyzes the conversion of fructose-6-phosphate to mannose-6-phosphate, and manC encoding mannose-1-phosphate guanylyltransferase GMPP (G444DRAFT_00325; [EC 2.7.7.13]), which catalyzes the synthesis of GDP-mannose from mannose-1-phosphate and GTP. The manB gene encoding phosphomannomutase PMM (EC 5.4.2.8), which catalyzes the interconversion of mannose-6-phosphate and mannose-1-phosphate, is absent from the A. schaalii genome. It is likely that the PMM activity is contributed by an additional phosphohexomutase such as phosphoglucomutase PGM (G444DRAFT_01405; [EC 5.4.2.2]) and/or phosphoglucosamine mutase GlmM (G444DRAFT_00418; [EC 5.4.2.10]), which contain the four domains required to catalyze the transfer of the phosphate group between the C6 and C1 positions of glucose-6-phosphate and glucosamine-6-phosphate, respectively. This explanation needs to be verified by experimental studies.
The genome also harbors the pimA gene encoding phosphatidylinositol α-mannosyltransferase PimA (G444DRAFT_01212; [EC 2.4.1.57]), which catalyzes the transfer of a mannose residue from GDP-mannose to the 2-position of the myo-inositol ring of PI, leading to the synthesis of phosphatidylinositol monomannoside (PIM1). A bioinformatic search of the annotated genome failed to reveal the existence of a pimB gene encoding phosphatidylinositol α-1,6-mannosyltransferase, which is responsible for the transfer of a mannose residue from GDP-mannose to the 6-position of the myo-inositol ring of PIM1. This finding is consistent with the absence of PIM2 in the cell wall of A. schaalii.

Protection against oxidative stress
A. schaalii is a Gram-positive anaerobic bacterium that normally grows without oxygen (or in the presence of minimal concentrations of oxygen). Exposure of the organism to air can give rise to the metabolic conversion of atmospheric oxygen to reactive oxygen species (ROS), which pose a significant threat to cellular integrity by damaging proteins, lipids, RNA and DNA. Like other anaerobic bacteria, A. schaalii has developed several mechanisms to protect itself from the damaging effects of ROS. Inspection of the A. schaalii genome revealed 13 antioxidant-related genes encoding enzymes that confer resistance to ROS toxicity, including cytochrome bd oxidase, superoxide dismutase (SOD), peroxiredoxins (PRX) and thioredoxins (TRX).
Cytochrome bd (cyd). As previously mentioned, A. schaalii harbors the cydAB operon encoding a cytochrome bd oxidase. Cytochrome bd has a high affinity for oxygen, making it an effective oxygen scavenger that protects the bacterial cell against oxidative stress [55,61]. It prevents the formation of H2O2 and other ROS by reducing molecular oxygen to H2O [62]. The presence of cytochrome bd in strict anaerobes may desensitize such bacteria to certain levels of oxygen, permitting growth at nanomolar concentrations of oxygen [52].
Superoxide dismutase (SOD). A. schaalii harbors a sodA gene encoding an Fe/Mn-containing superoxide dismutase (G444DRAFT_01169), which contributes to aerotolerance in this bacterium. SOD catalyzes the dismutation of the superoxide radical (O2•−) to oxygen and H2O2, which can be further converted to water and oxygen by catalase or reduced to water by peroxiredoxins. Since A. schaalii does not possess catalase, detoxification of H2O2 is accomplished by peroxiredoxins.
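The two detoxification steps described above can be written as balanced reactions. The symbols Prx_red and Prx_ox, standing for the reduced (dithiol) and oxidized (disulfide) forms of a peroxiredoxin, are notation introduced here for illustration and are not taken from the source.

```latex
\begin{align*}
2\,\mathrm{O_2^{\bullet -}} + 2\,\mathrm{H^+}
  &\xrightarrow{\ \text{SOD}\ } \mathrm{O_2} + \mathrm{H_2O_2} \\
\mathrm{H_2O_2} + \mathrm{Prx_{red}}
  &\xrightarrow{\ \text{peroxiredoxin}\ } 2\,\mathrm{H_2O} + \mathrm{Prx_{ox}}
\end{align*}
```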
In addition, the A. schaalii genome harbors an osmC gene encoding a protein of 136 amino acids (G444DRAFT_01716) annotated as an uncharacterized OsmC-related protein (COG1765). OsmC (osmotically inducible protein C) possesses thiol-dependent peroxidase activity and is known to be involved in cellular defense against oxidative stress caused by exposure to organic hydroperoxides or elevated osmolarity [65,66].

Defense of genome integrity and cell fitness
Genome sequence analysis revealed that A. schaalii has developed various mechanisms that may allow it to withstand invasion by viruses and foreign nucleic acids. These include postsegregation killing systems (also called addiction modules because their loss leads to the death of the host bacterium) and clustered regularly interspaced short palindromic repeats (CRISPR) with CRISPR-associated (Cas) proteins (CRISPR-Cas systems). Two kinds of postsegregation killing systems are present in the A. schaalii genome: toxin-antitoxin (TA) systems and restriction-modification (RM) systems.
Toxin-antitoxin (TA) systems. Toxin-antitoxin (TA) systems are small genetic modules found on bacterial mobile genetic elements such as plasmids as well as on bacterial chromosomes [68]. They are generally composed of bicistronic operons that encode a stable toxin preceded by its cognate unstable antitoxin. The reverse organization, in which the toxin gene precedes that of the antitoxin within the operon, as well as three-component systems have also been reported [69]. Toxins are always small proteins (<130 amino acids), whereas antitoxins can be proteins or RNAs. Chromosomal TA systems have been shown to be involved in numerous cellular processes, including programmed cell death [70], maintenance of mobile genetic elements [71], phage abortive infection [72], stress response [73], biofilm formation [74], virulence [75], persistence [76] and other cellular processes [77].
A thorough bioinformatic search for type II TA systems in the genomes of A. schaalii strains identified approximately 19 TA loci. Three gene pairs encode TA systems belonging to previously known families (RelBE, HicAB and z). Four genes encode antitoxins (RelB, HigA, HicB and Phd) that are not associated with cognate toxins, suggesting that they might be solitary antitoxins. In addition, based on the "mix and match" principle, which suggests that type II toxins can also interact with antitoxins from different classes [78], we identified twelve gene pairs encoding putative novel TA systems (Table 2). These candidate systems should be experimentally validated.
Restriction-modification (RM) systems. Restriction-modification (RM) systems protect bacteria from infection by mobile genetic elements such as phages. RM systems consist of a methyltransferase (MTase) that modifies specific DNA sequences in the genome by methylation and a restriction endonuclease (REase) that cleaves unmethylated DNA [79]. A search for genes encoding MTases and REases revealed that A. schaalii is predicted to harbor the genetic determinants for Type I, II and III RM systems.
The Type I RM system is encoded by two adjacent, similarly oriented genes. One gene (G444DRAFT_00292) encodes a protein (356 amino acids) that contains two target recognition domains (TRDs) and is responsible for recognition of the target DNA sequence; the other (G444DRAFT_00293) encodes a methyltransferase (666 amino acids) that contains AdoMet-binding motifs. BLASTP analysis revealed that the deduced amino acid sequences of both proteins share significant similarity (45-77% identity) with previously characterized BcgI RM systems from various bacteria.
The genome harbors a Type IIS RM system consisting of two adjacent, similarly oriented genes. The first gene (G444DRAFT_01011) encodes a polypeptide (592 amino acids) containing a putative conserved domain (pfam09491) related to the RE_AlwI superfamily of restriction endonucleases. The second gene, dam, is located upstream of the AlwI-like gene and encodes a DNA adenine methyltransferase (Dam) (G444DRAFT_01012), which shows 32% identity to its E. coli homolog (b3387; M.EcoKDam MTase).
The putative Type III RM system is encoded by two distantly located genes. The first gene (G444DRAFT_00278) encodes the restriction endonuclease subunit (Res), which contains the DEAD-box motifs present in superfamily II DNA or RNA helicases. The second gene (G444DRAFT_00547) encodes a putative N6 adenine-specific DNA methyltransferase (EC 2.1.1.72) with the conserved domain pfam01555 (N6_N4_Mtase).
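The distinction drawn above between adjacent and distantly located gene pairs can be checked directly from the locus tags. The short Python sketch below does this under one loudly stated assumption: that consecutive G444DRAFT numbers correspond to neighboring genes on the same contig. That is a common annotation convention but is not confirmed by the text for this draft assembly. The function names are hypothetical.

```python
import re


def locus_number(tag):
    """Extract the numeric part of a G444DRAFT locus tag."""
    match = re.fullmatch(r"G444DRAFT_(\d+)", tag)
    if match is None:
        raise ValueError(f"unexpected locus tag: {tag!r}")
    return int(match.group(1))


def tags_adjacent(tag_a, tag_b):
    """True if the two tags differ by exactly one.

    ASSUMPTION: consecutive tag numbers mean neighboring genes; this ignores
    contig boundaries, which a real check would have to take into account.
    """
    return abs(locus_number(tag_a) - locus_number(tag_b)) == 1


# Gene pairs for the three RM systems, as given in the text.
RM_SYSTEMS = {
    "Type I": ("G444DRAFT_00292", "G444DRAFT_00293"),
    "Type IIS": ("G444DRAFT_01011", "G444DRAFT_01012"),
    "Type III": ("G444DRAFT_00278", "G444DRAFT_00547"),
}

for name, (gene_a, gene_b) in RM_SYSTEMS.items():
    layout = "adjacent" if tags_adjacent(gene_a, gene_b) else "distant"
    print(f"{name}: {gene_a} / {gene_b} -> {layout}")
```

The same helper could be applied to other clusters named by locus tag, with the same caveat about contig boundaries.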

Invasion-associated proteins (Virulence factors)
Genomic analysis revealed that A. schaalii encodes several virulence-associated factors that may allow the organism to colonize and invade host tissues, including adhesins and fimbrial proteins, degradative enzymes such as sialidase and NlpC-P60 protein, heat-shock proteins and secreted proteins with a WXG100 domain.
The A. schaalii genome contains two clusters of tight adherence (tad) genes, located in different regions of the chromosome: the tadA gene encoding the CpaF-family ATPase TadA (G444DRAFT_00310 and G444DRAFT_01487); the tadB gene encoding the inner membrane protein TadB (G444DRAFT_00309 and G444DRAFT_01486); and the tadC gene encoding the inner membrane protein TadC (G444DRAFT_00308 and G444DRAFT_01485). The tad genes are organized linearly in the same direction, suggesting that they constitute an operon. The tad genes encode the machinery required for the assembly of pili. The tad locus has been implicated in the pathogenesis of several bacterial diseases. Pili, which are encoded within pathogenicity islands, play major roles in adhesion and host colonization [81]. Currently there is no information about the functional significance of pili in A. schaalii, and the functions of the tad loci should be investigated experimentally.
The A. schaalii genome carries three genes predicted to encode sortase A (SrtA) enzymes. However, PSI-BLAST database searches indicated that two of the gene products (G444DRAFT_00284 and G444DRAFT_00353) are homologs of sortase C (SrtC) and the third (G444DRAFT_01330) is a homolog of sortase E (SrtE). The srtC genes are clustered in two separate loci (G444DRAFT_00284, G444DRAFT_00285, G444DRAFT_00286 and G444DRAFT_00353, G444DRAFT_00354, G444DRAFT_00355), each together with genes encoding putative surface proteins. The gene products G444DRAFT_00285 and G444DRAFT_00354 were identified by PSI-BLAST analysis as fimbrial assembly proteins, whereas the gene products G444DRAFT_00286 and G444DRAFT_00355 were identified as a Cna protein B-type domain protein and a cell wall anchor protein with an LPXTG motif, respectively. This suggests that the A. schaalii genome encodes two types of adhesive fimbriae. However, further studies are needed to explore the role of the C sortases in fimbrial biogenesis and to examine the distribution of fimbriae on the cell surface of this organism.
Another virulence-associated gene found in the A. schaalii genome is the nanI gene encoding an exo-alpha-sialidase (G444DRAFT_01683; [EC 3.2.1.18]). Sialidase removes terminal sialic acid from host tissues and cells and is considered a significant virulence factor in terms of adhesion and immune modulation.
In addition to the above-mentioned virulence genes, the A. schaalii genome harbors genes that encode an Esat-6 secretion system (ESX or Ess), also known as a type VII secretion system (T7SS). The Esx-1 gene cluster in A. schaalii is composed of the two-gene operon esxA/esxB and two genes, eccC and eccB, located in close proximity to the esxA/esxB operon. The esxA and esxB genes encode the small secreted proteins EsxA (G444DRAFT_00919; 95 amino acids) and EsxB (G444DRAFT_00918; 111 amino acids), respectively, both containing the WXG100 motif. The eccC gene encodes a transmembrane protein of the FtsK-SpoIIIE ATPase family (G444DRAFT_00923) and the eccB gene encodes EccB, a protein with a transmembrane domain (G444DRAFT_00921). A second eccC gene predicted to encode an FtsK-SpoIIIE ATPase (G444DRAFT_00811) is located elsewhere in the genome. Proteins of the Esx-1 system have been identified in Actinobacteria such as Mycobacterium tuberculosis and Corynebacterium diphtheriae and in members of the Firmicutes (low G+C Gram-positive bacteria) such as Staphylococcus aureus, Streptococcus agalactiae and Listeria monocytogenes. The Esx-1 system plays an important role in the virulence of the human pathogens M. tuberculosis and S. aureus [84,85]. We therefore speculate that the Esx-1 system may contribute to the virulence of A. schaalii. However, this speculation needs to be supported by experimental studies.

Resistance to antibiotics
As previously mentioned, in vitro susceptibility testing showed that A. schaalii is resistant to ciprofloxacin and metronidazole.
Fluoroquinolones such as ciprofloxacin are known to exert their bactericidal activity by acting on the bacterial target enzymes DNA gyrase (GyrA) and topoisomerase IV (ParC). Resistance occurs mainly as a result of single point mutations within the quinolone resistance-determining region (QRDR) of the gyrA and parC genes [86,87]. Examination of the QRDRs in the genome sequences of A. schaalii DSM 15541T, A. schaalii CCUG 27420T and A. schaalii FB 123-CAN-2 revealed a single mutation in both gyrA (position 83 according to Escherichia coli numbering) and parC (position 80), resulting in the changes Ser83→Ala and Ser80→Thr, respectively (Fig 5). This observation corresponds well with those of earlier studies [88].
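The kind of comparison described above, reading off substitutions at reference-numbered positions from aligned QRDR fragments, can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in the study: the fragments below are toy strings invented for the example (they are not real GyrA or ParC sequences), and the helper names are hypothetical.

```python
def substitutions(reference, query, first_position):
    """List differences between two pre-aligned, equal-length fragments.

    `first_position` is the reference numbering (here, E. coli numbering)
    of the first residue in the fragment. Returns (ref, position, alt) tuples.
    """
    if len(reference) != len(query):
        raise ValueError("fragments must be aligned to equal length")
    return [
        (ref, first_position + offset, alt)
        for offset, (ref, alt) in enumerate(zip(reference, query))
        if ref != alt
    ]


# One-letter to three-letter codes for the residues used in this example only.
THREE_LETTER = {"S": "Ser", "A": "Ala", "T": "Thr"}


def describe(substitution):
    """Format a substitution tuple in the style used in the text."""
    ref, position, alt = substitution
    return f"{THREE_LETTER.get(ref, ref)}{position}->{THREE_LETTER.get(alt, alt)}"


# TOY DATA, invented for illustration; not taken from any genome.
gyra_reference, gyra_query = "GDSAVYD", "GDAAVYD"   # fragment starts at position 81
parc_reference, parc_query = "AASAL", "AATAL"       # fragment starts at position 78

gyra_changes = [describe(s) for s in substitutions(gyra_reference, gyra_query, 81)]
parc_changes = [describe(s) for s in substitutions(parc_reference, parc_query, 78)]
print(gyra_changes, parc_changes)
```

In practice the fragments would come from a proper alignment against the E. coli reference, since insertions or deletions would shift the numbering.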
The molecular mechanism that governs resistance to metronidazole depends on the activity of nitroimidazole reductase. This enzyme is essential to convert metronidazole from a harmless prodrug to a bactericidal agent [89]. Specific resistance genes (nim) have been identified in several genera of Gram-positive and Gram-negative anaerobic bacteria such as Peptostreptococcus and Bacteroides species [90,91]. In contrast to these bacteria, the genome sequences of A. schaalii DSM 15541T, A. schaalii CCUG 27420T and A. schaalii FB 123-CAN-2 lack the nim genes. Instead, the genome sequences of the three strains contain frxA genes (G444DRAFT_01234; FB03:01660; HMPREF9237_01071), which encode NADPH-flavin oxidoreductase. FrxA belongs to the nitroreductase protein family (pfam00881; KO: K00540) and catalyzes the reduction of nitro compounds using NADPH as the electron donor. Earlier studies showed that FrxA enhances metronidazole resistance in Helicobacter pylori [92]. We therefore assume that the frxA gene may be involved in metronidazole resistance among A. schaalii strains. However, further study will be required to determine what role the frxA gene plays in metronidazole resistance in A. schaalii.

Conclusions
The draft genome sequence of A. schaalii provided numerous insights into the metabolic, physiological and virulence potential of this organism. In general, the predicted physiological capabilities are in good agreement with reported experimental observations. A complete glycolytic pathway is present, leading to the production of pyruvate, which can subsequently be converted to lactate, acetate and ethanol. A. schaalii contains an intact nonoxidative branch of the pentose phosphate pathway but lacks the oxidative branch owing to the absence of the gene encoding 6-phosphogluconolactonase. A. schaalii is predicted to possess a standard as well as a semiphosphorylative variant of the Entner-Doudoroff pathway. The TCA cycle and the glyoxylate shunt are completely absent. The organism lacks functional gluconeogenesis owing to the absence of genes encoding fructose-1,6-bisphosphatase and PEP carboxykinase. Energy generation is primarily dependent on substrate-level phosphorylation in glycolysis and fermentation. De novo fatty acid biosynthesis is carried out by a multifunctional type I fatty acid synthase. A full set of genes necessary for the biosynthesis of cardiolipin, phosphatidylglycerol, phosphatidylinositol and phosphatidylinositol monomannoside is present in the genome. For detoxification of reactive oxygen species (ROS), genes encoding several proteins such as superoxide dismutase, thioredoxin, thioredoxin reductase and Bcp were identified. For resistance against invading phage, the genome harbors genes encoding type II toxin-antitoxin systems, different RM systems, two CRISPR loci and a gene encoding a phage abortive infection system. The A. schaalii genome harbors several virulence-related genes, including genes encoding sortase-associated pili, a gene encoding sialidase (NanI), genes encoding secreted proteins with a WXG100 domain (T7SS) and genes encoding Hsp proteins.
The observed resistance to ciprofloxacin correlates with the presence of point mutations in the protein products of the gyrA and parC genes. The detailed knowledge presented here is based on an in silico approach and provides a template for future experimental validation.

The organization of the locus within the (G444DRAFT_01575-G444DRAFT_01576-G444DRAFT_01577) gene cluster is similar to that observed in the genomes of other Actinobacteria, e.g. Streptomyces coelicolor and Streptomyces erythrea. Abbreviations: malE encodes a maltose-binding protein; malF and malG encode permeases of the ABC transporter; aglA encodes α-glucosidase; malR encodes a transcriptional regulator of the LacI family. Two copies of the malK gene encoding the ATPase are located elsewhere in the genome. Orthologs are shown by matching colors.
The rbsBACR gene cluster is responsible for the metabolism of ribose. The rbsBAC genes encode an ABC transporter belonging to the CUT2 family, where rbsB encodes a ribose-binding protein, rbsA encodes an ATP-binding protein and rbsC encodes a permease. In addition to the rbsA gene, the genome harbors three rbsK genes encoding three ribokinases (EC 2.7.1.15), which specifically direct their phosphorylating activity towards D-ribose, converting this pentose sugar to ribose-5-phosphate. The transcription of the rbs gene cluster is regulated by a LacI-type regulator encoded by rbsR, located immediately upstream of rbsB.
The organism contains two kinetically distinguishable systems for L-arabinose import: the AraE L-arabinose:H+ symporter and the ATP-driven system. The two sets of transport proteins are located near one another, separated by the genes of the ara operon.
The genes of the ara operon encode three enzymes required for arabinose catabolism: araA (encoding L-arabinose isomerase), araB (encoding L-ribulokinase) and araD (encoding L-ribulose-5-phosphate 4-epimerase). Upstream of the araBDA genes are two genes: the araE gene, which encodes a proton symporter of the MFS superfamily for the transport of arabinose into the cell, is organized as a divergent transcriptional unit with the araR gene, which encodes a LacI-type transcriptional regulator. Downstream of the ara operon, separated by the divergently oriented agaR gene, are the components of the ABC transporter: G444DRAFT_00668 encodes the substrate-binding protein, G444DRAFT_00669 encodes the ATP-binding protein, and G444DRAFT_00670 and G444DRAFT_00671 encode two permeases. Functional analysis is required to confirm the role of the two systems in arabinose transport.