The inside scoop: Comparative genomics of two intranuclear bacteria, “Candidatus Berkiella cookevillensis” and “Candidatus Berkiella aquae”

  • Destaalem T. Kidane,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Molecular Biosciences Program, Middle Tennessee State University, Murfreesboro, TN, United States of America, Department of Biology, Middle Tennessee State University, Murfreesboro, TN, United States of America

  • Yohannes T. Mehari,

    Roles Investigation, Writing – review & editing

    Affiliation Department of Biological Sciences, Auburn University, Auburn, AL, United States of America

  • Forest C. Rice,

    Roles Investigation, Visualization, Writing – review & editing

    Affiliation Department of Biology, Middle Tennessee State University, Murfreesboro, TN, United States of America

  • Brock A. Arivett,

    Roles Formal analysis, Supervision, Writing – review & editing

    Affiliation Division of Infectious Disease, Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, United States of America

  • John H. Gunderson,

    Roles Conceptualization, Writing – review & editing

    Affiliation Department of Biology, Tennessee Technological University, Cookeville, TN, United States of America

  • Anthony L. Farone,

    Roles Conceptualization, Writing – review & editing

    Affiliations Molecular Biosciences Program, Middle Tennessee State University, Murfreesboro, TN, United States of America, Department of Biology, Middle Tennessee State University, Murfreesboro, TN, United States of America

  • Mary B. Farone

    Roles Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations Molecular Biosciences Program, Middle Tennessee State University, Murfreesboro, TN, United States of America, Department of Biology, Middle Tennessee State University, Murfreesboro, TN, United States of America


“Candidatus Berkiella cookevillensis” (strain CC99) and “Candidatus Berkiella aquae” (strain HT99), belonging to the Coxiellaceae family, are gram-negative bacteria isolated from amoebae in biofilms present in human-constructed water systems. Both bacteria are obligately intracellular, requiring host cells for growth and replication. The intracellular bacteria-containing vacuoles of both bacteria closely associate with or enter the nuclei of their host cells. In this study, we analyzed the genome sequences of CC99 and HT99 to better understand their biology and intracellular lifestyles. The CC99 genome has a size of 2.9 Mb (37.9% GC) and contains 2,651 protein-encoding genes (PEGs) while the HT99 genome has a size of 3.6 Mb (39.4% GC) and contains 3,238 PEGs. Both bacteria encode high proportions of hypothetical proteins (CC99: 46.5%; HT99: 51.3%). The central metabolic pathways of both bacteria appear largely intact. Genes for enzymes involved in the glycolytic pathway, the non-oxidative branch of the pentose phosphate pathway, the tricarboxylic acid pathway, and the respiratory chain were present. Both bacteria, however, are missing genes for the synthesis of several amino acids, suggesting reliance on their host for amino acids and intermediates. Genes for type I and type IV (dot/icm) secretion systems as well as type IV pili were identified in both bacteria. Moreover, both bacteria contain genes encoding large numbers of putative effector proteins, including several with eukaryotic-like domains such as ankyrin repeats, tetratricopeptide repeats, and leucine-rich repeats, characteristic of other intracellular bacteria.


Free-living amoebae, found in natural and human-made aquatic environments, are predators of many bacteria, and thus play an important role in controlling microbial populations in the environment [1]. Some bacteria, however, have evolved to become resistant to predation by these protozoa. These amoeba-resistant bacteria (ARB) have mechanisms that not only allow them to survive internalization and digestion by amoebae, but also allow them to replicate within the intra-amoebal environment [2, 3]. Both facultative and obligate intracellular ARB belonging to several evolutionary lineages, including alphaproteobacteria [4–6], betaproteobacteria [7], gammaproteobacteria [8, 9], Bacteroidetes [10] and Chlamydiae [11, 12] have been recovered from free-living amoebae. Among those identified, several ARB such as Legionella pneumophila and Mycobacterium avium are established human pathogens [13, 14] while others, including Parachlamydia acanthamoebae and the Legionella-like amoebal pathogens (LLAPs), have been designated as potential emerging human pathogens [15, 16].

“Candidatus Berkiella cookevillensis” (type strain CC99) and “Candidatus Berkiella aquae” (type strain HT99) are two ARB that were each isolated from an amoeba present in biofilm recovered from a hospital cooling tower and an outdoor hot tub spa, respectively [17]. Both bacteria are non-spore-forming, motile, obligate intracellular gram-negative bacteria. CC99 bacteria are coccoid shaped with diameters ranging from 0.30 to 0.60 μm whereas HT99 are coccobacilli with a width ranging from 0.30 to 0.55 μm and a length ranging from 0.45 to 0.65 μm (Fig 1) [17]. Based on 16S rRNA gene phylogenetic analyses and cellular fatty acid composition analyses, CC99 and HT99 have been classified as separate novel species forming distinct taxonomic lineages within the Coxiellaceae family of the order Legionellales and class Gammaproteobacteria [17]. However, a recent phylogenetic analysis that included more than 100 Gammaproteobacteria genomes using a concatenated amino acid alignment of 109 proteins has classified CC99 and HT99 as a separate distinct family outside Coxiellaceae [18]. Both bacteria show a close 16S rRNA similarity to each other (~94%) and to the intracellular pathogens Coxiella burnetii (~90–91%) and L. pneumophila (~88%) [17], the causative agents of the zoonotic disease Q fever [19] and Legionnaires’ disease [20], respectively.

Fig 1. Protozoa infected with “Ca. B. cookevillensis” (CC99) or “Ca. B. aquae” (HT99).

Electron micrograph (13,000x magnification) of intracellular CC99 (arrow) exhibiting coccoid morphology following exposure of the BCV by the tape ripping technique (A). Micrograph of adherent HT99 (arrow) on the surface of A. polyphaga exhibiting coccobacillus morphology (20,000x magnification) (B). Giemsa staining of D. discoideum (strain AX2) infected with CC99 (C) and HT99 (D) showing bacteria (arrow; dark purple) associated with nuclei (pink).

Both CC99 and HT99 infect and replicate within protozoa, including Acanthamoeba polyphaga and Dictyostelium discoideum (Fig 1) [17]. CC99 can also infect and replicate in mammalian cells (S1 Fig), including phagocytic and non-phagocytic cell lines [21], also associating with or entering the nucleus. During infection of host cells, both bacteria exhibit replication within a bacteria-containing vacuole (BCV) that interacts with or enters the host cell nucleus. Within one hour after internalization, both bacteria are visible in the host cell cytosol enclosed in a vacuole. As infection progresses, the BCV traffics through the cytoplasm and eventually invades or closely associates with the nucleus of the host [21]. Within BCVs, the bacteria replicate to large numbers and eventually escape after lysis of the host cell. The mechanism of infection and intracellular replication as well as the biochemical composition of the replication vacuole of each bacterium is not yet understood, and both bacteria remain unculturable outside of host cells.

Here, we present the sequences and analyses of the CC99 and HT99 genomes to gain an understanding of their genome content and insights into their biology and metabolic capacity. Information gained from the genome analyses may facilitate understanding of the mechanisms by which they invade and replicate within their host, as well as provide insight for the development of axenic media for culture outside of host cells.

Materials and methods

Bacterial culture

A. polyphaga (ATCC strain 30461), grown in 25 cm² flat-bottomed cell culture flasks in tryptic soy broth (TSB; Becton Dickinson, Franklin Lakes, NJ, USA) at 25°C, was used to maintain and propagate both CC99 and HT99. Prior to transferring bacteria, TSB medium was removed from confluent A. polyphaga monolayers and cells were washed three times with sterile spring water (Carolina Biological Supply, Burlington, NC, USA) without disturbing the amoeba monolayer. Infections in amoebae were performed in sterile spring water. Co-cultures were incubated for 4–5 days at 25°C.

Genomic DNA extraction and purification

For whole genome sequencing, DNA was purified from host cell-free isolates following their growth in A. polyphaga co-culture. After complete lysis of the amoebae, bacterial cells were separated from host cells using Renografin density-gradient centrifugation as described in [22]. Briefly, amoebal debris from the bacterial lysate was removed by centrifugation at 500 x g for 5 minutes. The recovered supernatant was filtered through 0.8 μm PVDF filters and centrifuged at 31,000 x g for 30 minutes. The bacterial pellets were re-suspended in sterile phosphate buffered saline sucrose (PBSS; pH 7.4) and layered onto 30% (v/v) RenoCal-76 cushions (76% Renografin in PBSS). Tubes were filled to the top with PBSS and centrifuged at 58,000 x g for 30 minutes at 4°C. The supernatant was then removed and re-suspended in cold PBSS and centrifuged at 31,000 x g for 10 minutes at 4°C. The pellets were washed in phosphate buffered saline (PBS) to remove any residual Renografin and purified bacteria were re-suspended in PBS.

Total DNA from host cell-free bacteria was extracted using the MasterPure™ Complete DNA Purification Kit (Lucigen, Middleton, WI, USA) following the manufacturer’s instructions. Genomic DNA was quantified using a Nanodrop Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and DNA degradation and contamination were examined by 1% Tris-Acetate-EDTA agarose gel electrophoresis.

Genomic sequencing and analysis

Whole genome sequencing of purified DNA was performed using the PacBio Sequel platform at Novogene Inc. (Durham, NC, USA). Raw reads generated from the PacBio sequencer were assembled using the Canu v. 1.9 [23] and Falcon v. 1.8.1 [24] genome assembly programs. The NCBI Prokaryotic Genome Annotation Pipeline (PGAP) [25] and the RAST server [26] were used to identify protein-coding genes, rRNAs, tRNAs, and ncRNAs. Completeness of the genome was assessed using the Benchmarking Universal Single-Copy Orthologs (BUSCO) pipeline [27]. The KEGG automatic annotation server (KAAS) [28] was used to perform metabolic pathway prediction and reconstruction based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The BioCyc [29] database was also used for metabolic pathway analyses.

The web-based programs SMART (Simple Modular Architecture Research Tool) [30], the Profile HMM database (Pfam) [31], and the National Center for Biotechnology Information’s Conserved Domain Database (CDD) [32] were used to identify eukaryotic or eukaryotic-like domains/motifs in protein-encoding genes (cut-off e-value of 0.001). Homology searches were performed against the SecReT4 [33] and EffectiveDB [34] databases. Protein-encoding genes were searched against the virulence factor database (VFDB), a specialized repository of known bacterial virulence factors [35]. The SignalP 4.1 and TMHMM servers were used to identify N-terminal signal peptide sequences and transmembrane domains, respectively.
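The e-value cut-off used for the domain searches can be applied programmatically once hits are exported in tabular form. The following is a minimal sketch only: the record layout, the locus tags and the helper name filter_hits are hypothetical and do not reflect the actual SMART, Pfam or CDD output formats, and treating the cut-off as inclusive is an assumption.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    protein_id: str  # hypothetical locus tag, for illustration only
    domain: str      # domain family name reported by the search tool
    evalue: float    # e-value of the hit


E_VALUE_CUTOFF = 1e-3  # cut-off stated in the Methods (0.001)


def filter_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep hits whose e-value is at or below the cut-off (inclusive by assumption)."""
    return [h for h in hits if h.evalue <= cutoff]


# Made-up example records; these are not results from the study.
example_hits = [
    DomainHit("PEG_0001", "Ank", 2e-12),
    DomainHit("PEG_0002", "TPR", 5e-2),
    DomainHit("PEG_0003", "LRR", 1e-3),
]
kept = filter_hits(example_hits)
```

In this toy input, the hit with an e-value of 0.05 is discarded and the other two are retained.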

Results and discussion

General genome features of “Ca. B. cookevillensis” and “Ca. B. aquae”

The genome of CC99 consists of 2,984,836 base pairs (bp) with an average GC content of 37.9% while the genome of HT99 consists of 3,588,707 bp with an average GC content of 39.4%. Plasmids were not experimentally identified for either bacterium nor indicated by sequence data. The protein coding densities of CC99 (89.06%) and HT99 (90.63%) were higher than the average coding density (87%) of all sequenced bacterial genomes [36], although the related pathogens C. burnetii and Legionella spp. also have high coding densities of approximately 90% [37, 38]. Genome annotation identified 2,651 protein-encoding genes (PEGs) in CC99, with approximately 52% assigned putative biological function. In contrast, annotation in HT99 identified 3,238 PEGs, with about 48% assigned putative biological function. A high proportion of PEGs were annotated as hypothetical proteins (CC99: 46.5%; HT99: 51.3%). A majority of the PEGs utilize the AUG start codon (CC99: 88.1%; HT99: 90.2%) and the remainder utilize the alternate start codons, GUG (CC99: 7.8%; HT99: 6.4%) and UUG (CC99: 4.1%; HT99: 3.4%). PEGs were distributed evenly between the forward and reverse strand (CC99: 47.94% forward, 52.06% reverse; HT99: 49.75% forward, 50.25% reverse). A total of 40 and 41 tRNA genes, representing at least one for each of the 20 amino acids, were identified in CC99 and HT99, respectively. Genes for the 3 rRNAs [5S; 23S (large subunit); 16S (small subunit)] were also identified. CC99 and HT99 genome features are summarized in Table 1 and Fig 2. Complete genome annotation is available in S1 Table.
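The summary figures above can be cross-checked with simple arithmetic. The sketch below shows one way to compute GC content from a sequence string and to derive gene density and mean coding length per PEG from the reported genome sizes, PEG counts and coding densities. It assumes that the reported coding density is attributable entirely to PEGs (RNA genes may contribute in the actual annotation), so the mean length is a rough estimate; the function and variable names are illustrative.

```python
# Values reported in the text for each genome.
GENOMES = {
    "CC99": {"size_bp": 2_984_836, "pegs": 2_651, "coding_density": 0.8906},
    "HT99": {"size_bp": 3_588_707, "pegs": 3_238, "coding_density": 0.9063},
}


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


def derived_stats(genome: dict) -> dict:
    """Gene density and approximate mean coding length per PEG."""
    # Assumption: all coding sequence is assigned to PEGs.
    coding_bp = genome["size_bp"] * genome["coding_density"]
    return {
        "pegs_per_mb": genome["pegs"] / (genome["size_bp"] / 1e6),
        "mean_coding_bp_per_peg": coding_bp / genome["pegs"],
    }


stats = {name: derived_stats(g) for name, g in GENOMES.items()}
```

Under these assumptions both genomes come out at roughly 900 PEGs per Mb with a mean coding length of about 1 kb per PEG, which is consistent with the similar coding densities reported for the two bacteria.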

Fig 2. Structural features of “Ca. B. cookevillensis” (CC99) and “Ca. B. aquae” (HT99) genomes.

Genome tracks show (from inner to outer) the positive (green) and negative (purple) GC skew [(C-G)/(C+G)], G+C content (black) and the coding DNA sequences (blue) located on the reverse and forward strands. The putative origin of replication is located at the top. CGView was used to construct the genome map.

Table 1. General genome features of the “Ca. B. cookevillensis” and “Ca. B. aquae”.

Classification of PEGs into Clusters of Orthologous Groups (COGs) [39] assigned a total of 1,910 (72.04%) and 2,177 (67.23%) PEGs to COGs in CC99 and HT99, respectively (Table 2; S1 Table). The “translation, ribosomal structure and biogenesis,” “amino acid transport and metabolism,” and “cell wall/membrane/envelope biogenesis” classes (COG classes J, E, M) were the most prominently represented categories in both bacteria (Table 2).

Table 2. “Ca. B. cookevillensis” (CC99) and “Ca. B. aquae” (HT99) PEGs grouped into COG functional categories.

As expected, genes for the necessary components of the genetic information processing machinery were present in both CC99 and HT99. The macromolecular synthesis (MMS) operon, containing the genes rpsU, dnaG, rpoD, which encode essential products for the initiation of protein, DNA and RNA synthesis [40], was identified in both bacteria. Genes encoding for proteins involved in DNA replication and transcription, including ATP-independent DNA topoisomerase I, ATP-dependent topoisomerase IV (subunits A and B), DNA gyrase (subunits A and B), RNA polymerase core subunits (rpoA, rpoB, rpoC, rpoZ), RNA polymerase sigma factors (rpoD, rpoS, rpoE, rpoH, rpoN), transcription termination factor (rho), RNA polymerase associated proteins (nusA, nusB), transcriptional anti-terminator (nusG), stringent starvation proteins (sspA, sspB), transcription elongation factors (greA, dksA) and transcription-repair coupling factor were identified in both bacteria. Genes for the large and small ribosomal subunit proteins, essential components of ribosomal structures, were also identified in both bacteria.

Genes dedicated to DNA repair and recombination processes, including genes for DNA repair (adaA, adaB, radA, dam, mutT, pcrA, recD, recF, recO, recR, recN, recG, recA, recX, rmuC, ssb, xseA, xseB), excision repair (uvrABCD), DNA mismatch repair (mutS, mutL), base excision repair (recJ, mutM, mutY, tag, ung, alkA), and recombinational repair (ruvA, ruvB, ruvC, recJ) were identified in both bacteria. The presence of a large number of DNA recombination and repair genes suggests that these bacteria have high recombination capabilities. Genes encoding proteins involved in protein folding, including chaperones (groEL, groES, hrcA, grpE, dnaK, dnaJ, hscA, hscB, htpG, clpA, clpB) and disulfide interchange proteins (dsbA, dsbB, dsbC, dsbD, dsbE) were also identified in both bacteria.


Due to their obligate nature, very little is known about the metabolic capabilities of CC99 and HT99. Attempts to grow them in host cell-free media have, thus far, been unsuccessful. Obligate reliance on their host for growth has limited phenotypic and genetic experimental studies of these bacteria. Analyses of bacterial genomes have made it possible to gain insights into their metabolic capacity.

Carbohydrate metabolism.

Except for glucokinase (glk), we identified all the genes encoding enzymes required to generate pyruvate through the Embden-Meyerhof-Parnas (EMP) pathway in both CC99 and HT99 (Fig 3). C. burnetii is likewise missing the gene for glucokinase and does not require glucose for replication; however, glycolytic activity has been reported for the bacterium such that glucose use increases biomass in axenic culture with amino acid supplementation [41, 42]. L. pneumophila has a complete glycolytic pathway (Fig 3). Phosphoenolpyruvate (PEP)-dependent phosphotransferase systems (PTS), generally used for carbohydrate uptake in bacteria, were not identified in either bacterium. A glycerol uptake system or a glycerol facilitator gene (glpF) was also absent. In CC99, genes encoding glycerol kinase (glpK) and an aerobic glycerol-3-phosphate dehydrogenase (glpD) were identified, which suggests that CC99 may be capable of phosphorylating glycerol (obtained by passive diffusion or through degradation of glycerol-containing lipids from the host by lipases) to glycerol-3-phosphate, which could then be converted to dihydroxyacetone phosphate, an intermediate of glycolysis that could be shuttled through the glycolytic pathway to generate pyruvate [43].

Fig 3. Genome inferred central carbohydrate metabolic pathway of “Ca. B. cookevillensis” (CC99) and “Ca. B. aquae” (HT99) with comparisons to C. burnetii and L. pneumophila.

Genes encoding enzymes involved in the EMP of glycolysis (except for glucokinase) and non-oxidative branch of the PPP were identified in both CC99 and HT99 (indicated by blue and green arrows, respectively). Genes for enzymes involved in the oxidative branch of PPP pathway and the ED pathway were absent. Genes encoding enzymes involved in the TCA cycle were present in both bacteria. Genes encoding enzymes for these pathways in C. burnetii (yellow arrows) and L. pneumophila (pink arrows) are included for comparison.

The Entner-Doudoroff (ED) pathway appears to be completely absent in both CC99 and HT99. The ED pathway is also absent in C. burnetii, but it is the predominant glycolytic pathway for L. pneumophila [44, 45]. Genes for enzymes involved in the non-oxidative branch of the pentose phosphate pathway (PPP) were identified (Fig 3), indicating a capability for producing pentose phosphates (required for nucleic acid synthesis) and erythrose phosphates (precursors for aromatic amino acids) in both bacteria. However, key genes of the oxidative branch of the PPP, including glucose 6-phosphate dehydrogenase (zwf), gluconolactonase (pgl), and 6-phosphogluconate dehydrogenase (gnd), were absent in both bacteria (Fig 3), suggesting that they may not be able to produce NADPH utilizing the PPP. Other intracellular bacteria, including C. burnetii, L. pneumophila, and Francisella tularensis are also missing genes for a complete oxidative branch of the PPP [38, 46–48]. Both CC99 and HT99 are missing genes encoding fructose 1,6-bisphosphatase (fbp) and PEP carboxykinase (pckA), key enzymes of gluconeogenesis that generate intermediate precursors for other biosynthetic pathways (Fig 3). Genes for the pyruvate dehydrogenase (pdh) complex, as well as all the genes of the tricarboxylic acid (TCA) cycle, were identified in both bacteria, suggesting capability for ATP production via substrate-level phosphorylation (Fig 3). We identified genes for PEP synthase (pps) and NAD- and NADP-dependent malic enzymes (mea) in both bacteria. Via these enzymes, they may be able to convert malate to pyruvate and generate PEP [49]. Neither bacterium may be capable of generating carbohydrates from acetyl-CoA subunits as genes for enzymes of the glyoxylate shunt, isocitrate lyase (aceA) and malate synthase (glcB), were not identified. Although the glyoxylate pathway is common in aerobic bacteria and has been associated with virulence, it is incomplete in many intracellular bacteria including C. burnetii, L. pneumophila, F. tularensis, Listeria monocytogenes, and Rickettsia spp. [38, 50, 51].
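The presence/absence reasoning used in this section can be summarized as a simple set comparison. The sketch below is illustrative only: it encodes the genes named in the text as absent from both genomes and checks a few marker-gene sets against them. The pathway groupings, the status labels and the helper name pathway_status are assumptions made for this example, not part of the KAAS or BioCyc output, and a label of "no reported gap" means only that none of the listed marker genes was reported missing.

```python
# Genes reported in the text as not identified in either CC99 or HT99.
ABSENT_IN_BOTH = {"glk", "zwf", "pgl", "gnd", "fbp", "pckA", "aceA", "glcB"}

# Marker genes per pathway, limited to gene names given in the text.
PATHWAY_MARKERS = {
    "oxidative PPP": {"zwf", "pgl", "gnd"},
    "gluconeogenesis (key steps)": {"fbp", "pckA"},
    "glyoxylate shunt": {"aceA", "glcB"},
    "PEP synthesis from pyruvate": {"pps"},
}


def pathway_status(required: set, absent: set):
    """Classify a pathway by how many of its marker genes are missing."""
    missing = required & absent
    if not missing:
        label = "no reported gap"
    elif missing == required:
        label = "absent"
    else:
        label = "partial"
    return label, sorted(missing)


report = {
    name: pathway_status(genes, ABSENT_IN_BOTH)
    for name, genes in PATHWAY_MARKERS.items()
}
```

With these inputs, the oxidative PPP, the key gluconeogenic steps and the glyoxylate shunt are all classified as absent, matching the interpretation given in the text.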

Genes encoding enzymes involved in aerobic respiration were present in both bacteria, including 14 subunits of respiratory NADH:ubiquinone oxidoreductase (NADH dehydrogenase complex; complex I), 4 subunits of respiratory succinate:quinone oxidoreductase (succinate dehydrogenase complex; complex II), and 3 subunits of respiratory ubiquinone-cytochrome c (cytochrome c reductase; complex III). Genes for a terminal cytochrome oxidase complex, including cytochrome bd (cydAB) and cytochrome c (coxCBA, cyoE) were also identified. An additional Cbb3-type cytochrome oxidase was also identified in CC99. The cytochrome bd oxidase, shown to have an increased affinity for oxygen [52], may allow CC99 to survive under limiting oxygen levels. Genes encoding the F-type ATPase (F1F0 ATP synthase) units that catalyze ATP hydrolysis were also present in both bacteria. Taken together, the presence of these genes suggests that both bacteria likely rely on proton motive force-driven aerobic respiration for ATP production.

Lipid metabolism.

In both CC99 and HT99, genes for enzymes involved in the initiation step (accADCB, acpP, acpS, fabD, fabH) and chain elongation cycle (fabG, fabZ, fabI, fabF) of the type II fatty acid synthesis pathway were identified, suggesting the capability of producing fatty acids. A gene homologue of fabV encoding an enoyl-acyl carrier protein reductase, implicated in resistance to the antibacterial triclosan in Vibrio cholerae and Pseudomonas aeruginosa [53, 54], was identified in HT99. Genes for 3-hydroxydecanoyl dehydratase (fabA) and 3-oxoacyl synthase 1 (fabB), acyl-carrier protein homologs which together control the level of unsaturated fatty acid synthesis in Escherichia coli [55], were not identified. Absence of these genes suggests that both CC99 and HT99 may not be capable of synthesizing unsaturated fatty acids. Indeed, lipid profile analyses have previously shown that both CC99 and HT99 contain straight chain fatty acids [17].

Genes encoding an outer membrane protein (fadL) and a long-chain acyl-CoA synthetase (fadD) were identified in both bacteria. In E. coli, these proteins are responsible for transporting fatty acids via a transport/acyl‐activation mechanism [55] and may have similar functions in CC99 and HT99. Genes encoding the necessary enzymes to generate ATP from β‐oxidation of fatty acids (fadE, fadB, fadA) were also identified in both bacteria (Fig 3). Via the β‐oxidation pathway, both CC99 and HT99 may be able to degrade long chain fatty acids into acetyl-CoA, which can then be further oxidized via the TCA cycle for ATP production.

Both bacteria lack genes involved in the mevalonate pathway for isoprenoid biosynthesis but have genes for enzymes of the non-mevalonate pathway for isoprenoid biosynthesis. The non-mevalonate pathway, common in most gram-negative bacteria, generates the five-carbon isoprenoid precursors isopentenyl diphosphate and dimethylallyl diphosphate, which can be modified to make diverse organic molecules involved in important biological functions in the cell, including electron transport and peptidoglycan biosynthesis [56]. Conversely, C. burnetii and L. pneumophila have genes for the mevalonate but not the non-mevalonate pathway [38, 57]. The mevalonate pathway has been associated with predatory bacterial species as well as the intracellular bacteria, Teredinibacter turnerae and ‘Ca. Liberibacter asiaticus’ [57]. Both bacteria encode enzymes for synthesizing phosphatidylethanolamine (PE) and phosphatidylserine, important phospholipid membrane components. However, they are missing the genes encoding phosphatidylglycerophosphatase (pgpA, pgpB, pgpC) and cardiolipin synthetase (cls), key enzymes involved in phosphatidylglycerol and cardiolipin biosynthesis, although these genes are found in C. burnetii and Legionella spp. [58, 59].

Nucleotides, amino acids and cofactor metabolism.

Both CC99 and HT99 have genes encoding proteins involved in the synthesis of purines and pyrimidines de novo. Genes for key enzymes involved in the initial synthesis of uridine monophosphate (UMP) from phosphoribosyl pyrophosphate (PRPP) and L-glutamine (carA, carB, pyrB, pyrC, pyrD, pyrE, pyrF) and for converting UMP to uridine diphosphate/triphosphate (UDP/UTP) and cytidine diphosphate/triphosphate (CDP/CTP) (cmk, ndk, pyrG, pyrH) were identified. Genes for enzymes involved in generating inosinic acid (IMP) (a purine precursor) from L-glutamine and ribose-5-phosphate (prs, purF, purD, purN, purL, purM, purK, purE, purC, purH) and for the subsequent conversion of IMP to adenosine monophosphate (AMP) (purA, purB) and guanosine monophosphate (guaAB) were also identified.

Amino acid biosynthesis pathways appear to be reduced in both CC99 and HT99 (Fig 4, S1 Text, S2 Table). For CC99, 11 (Ala, Asn, Asp, Glu, Gln, Gly, His, Lys, Ser, Thr and Trp) of the 20 amino acid pathways could be asserted while only 8 (Asp, Glu, Gln, Gly, His, Lys, Ser, and Trp) could be asserted for HT99 (Fig 4). A high degree of amino acid auxotrophy suggests that both bacteria likely rely on their hosts for obtaining the necessary amino acids and intermediates. Both C. burnetii and L. pneumophila are also auxotrophic for several amino acids yet have potential mechanisms to scavenge amino acids from their host cells. C. burnetii has been reported to up-regulate autophagy resulting in the release of free amino acids, and L. pneumophila induces proteasome degradation of host cell proteins to increase intracellular amino acids [60, 61]. C. burnetii and L. pneumophila have also both been grown axenically in media containing only amino acids [41, 62], which suggests that an amino acid-based medium may be developed for the axenic growth of CC99 and HT99.
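The amino acids for which no biosynthetic pathway could be asserted follow directly from the two lists above by set difference. The sketch below is a small illustration using the three-letter codes from the text; the variable and function names are hypothetical. Note that a pathway that could not be asserted from the genome is a predicted, not an experimentally confirmed, auxotrophy.

```python
# The 20 standard amino acids (three-letter codes).
ALL_AMINO_ACIDS = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}

# Pathways that could be asserted from each genome, as listed in the text.
ASSERTED = {
    "CC99": {"Ala", "Asn", "Asp", "Glu", "Gln", "Gly", "His", "Lys",
             "Ser", "Thr", "Trp"},
    "HT99": {"Asp", "Glu", "Gln", "Gly", "His", "Lys", "Ser", "Trp"},
}


def predicted_auxotrophies(asserted: set) -> list:
    """Amino acids with no asserted biosynthesis pathway, sorted by name."""
    return sorted(ALL_AMINO_ACIDS - asserted)


auxotrophies = {name: predicted_auxotrophies(aa) for name, aa in ASSERTED.items()}

# Pathways asserted for CC99 but not for HT99.
cc99_only = sorted(ASSERTED["CC99"] - ASSERTED["HT99"])
```

This gives 9 predicted auxotrophies for CC99 and 12 for HT99; the three pathways asserted only for CC99 are those for Ala, Asn and Thr, and every pathway asserted for HT99 is also asserted for CC99.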

Fig 4. Amino acid biosynthesis pathways of “Ca. B. cookevillensis” (CC99) and “Ca. B. aquae” (HT99).

Genome inferred amino acid pathways appear to be reduced in both bacteria. Only 11 of the 20 amino acid pathways could be asserted for CC99 while only 8 could be asserted for HT99.

Several genes involved in cofactor biosynthesis pathways were identified. Genes for enzymes required to catalyze the biosynthesis of riboflavin [precursor for flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD)] from guanosine triphosphate and ribulose‐5‐phosphate were identified in both CC99 and HT99. A gene encoding a bifunctional riboflavin kinase/FMN adenylyltransferase (ribF), involved in phosphorylation of riboflavin to FMN and subsequent adenylation of FMN to FAD, was also present in both bacteria. FMN and FAD are important cofactors for many metabolic enzymes that catalyze oxidation-reduction reactions. Genes for de novo synthesis of NAD from L-aspartate or L-tryptophan were not identified. However, genes encoding NAD salvage-specific enzymes that synthesize NAD from nicotinamide or nicotinic acid, including nicotinate phosphoribosyltransferase (pncB), nicotinamidase (pncA), NAD synthetase (nadE) and nicotinate-nucleotide adenylyltransferase (nadD), were identified in both bacteria. Both bacteria also encode NAD kinase (nadK), a key enzyme which catalyzes the phosphorylation of NAD to form NADP [63].

Biotin, derived from pimelic acid, is an important cofactor for many carboxylation, decarboxylation and transcarboxylation reactions [64]. Genes (bioF, bioA, bioD, bioB) encoding enzymes involved in a four-step path conversion of pimeloyl-CoA to biotin (second stage of biotin biosynthetic pathway) [65] were identified in both bacteria. However, genes involved in the synthesis of a pimelate moiety (first stage of biotin biosynthetic pathway), including genes encoding pimeloyl-CoA synthetase (bioW) and an enzyme of the cytochrome P450 family (bioI) involved in direct conversion of pimelic acid to pimeloyl-CoA [66] are missing in both bacteria. Both bacteria are also missing the gene encoding pimeloyl methyl ester esterase (bioH). In E. coli, this enzyme acts synergistically with malonyl-O-methyltransferase (encoded by bioC) to produce a pimelate moiety that serves as a carbon backbone in the early steps of biotin biosynthesis [67]. Absence of these genes suggests that both bacteria may synthesize the pimelate moiety precursor via an alternate pathway, or they may not be able to use pimelic acid as a source for biotin synthesis. Genes encoding transmembrane component (bioN) and a substrate-specific component (bioY) for a biotin transporter protein are absent in both bacteria and so they may not be capable of transporting biotin.

Genes encoding the necessary enzymes required to convert pantothenate (vitamin B5) to coenzyme A, an essential cofactor in many important metabolic processes, were identified in both bacteria. However, they are missing essential genes for enzymes involved in the de novo synthesis of pantothenate from aspartate and α-ketoisovalerate (panD, panE), suggesting that they likely import pantothenate precursors from host cells. Genes for the biosynthesis of thiamine, including genes encoding enzymes involved in the biosynthesis of thiazole and pyrimidine moieties (thiamine precursors) were identified in both bacteria. CC99 encodes a complete gene set for the folate biosynthesis pathway. HT99, however, is missing important genes in this pathway, including aminodeoxychorismate lyase (pabAc).

Motility, secretion and adhesion

The bacterial flagellum, assembled from as many as 40 different protein components, is a complex motility organelle composed of a basal body, a curved flexible hook, and a long helical propagating filament [68]. In both CC99 and HT99, genes encoding the flagella hook/filament (flgE, flgL, flgK, fliC, fliD), motor/switch (motA, motB, fliM, fliN, fliG), basal body (flgG, flgH, flgJ, flgI, flgF, flgC, flgB, fliE, fliF), and flagella export apparatus (flhA, flhB, fliP, fliQ, fliR, fliI) components were identified (Fig 5). Most of these genes are located within a single locus. Both CC99 and HT99 have been shown to be motile by a single, polar flagellum [17].

Fig 5. Genes involved in the synthesis of flagella in bacteria.

Gene homologs encoding flagellar proteins identified in “Ca. B. cookevillensis” (CC99) and “Ca. B. aquae” (HT99). Genes encoding motor/switch, basal body, hook, filament, filament cap proteins were identified.

Homologs of flhD and flhC genes, which regulate the transcriptional activation of class II flagellar genes in E. coli [69], were not identified. However, gene homologs encoding sigma factor (σ) 54 (rpoN), transcriptional regulator FleQ (fleQ) and transcription initiation factor FliA (fliA) were identified in both bacteria. During the transmissive life-cycle of L. pneumophila, the regulatory protein FleQ, together with σ54, enhances the expression of flagellar class II genes, which encode protein components involved in the assembly of the flagellar basal body, flagellar hook and regulatory proteins, while FliA induces the expression of flagellar class III genes (encoding motor/switch proteins) and class IV genes (encoding filament and filament cap proteins) [70, 71]. The presence of these transcription regulator homolog genes suggests that flagellar gene expression in CC99 and HT99 is similar to that of L. pneumophila.

Genes encoding proteins that recognize and bind signal peptides of twin-arginine translocase (TAT)-dependent folded substrates (tatB, tatC) were missing, while genes for the general secretion (Sec)-dependent translocon, which transports unfolded proteins across the cytoplasmic membrane, were present in both bacteria. A set of genes encoding type I secretion system (T1SS) components was identified in both bacteria, including genes encoding an ABC transporter, the outer membrane-spanning porin TolC, and a periplasmic membrane fusion protein of the HlyD family. T1SSs, which belong to the family of ATP-binding cassette (ABC) transporters, secrete unfolded substrates in a one-step process directly from the cytoplasm to the extracellular milieu without a periplasmic intermediate [72]. T1SSs secrete a diverse range of substrates involved in bacterial pathogenesis, including proteases, bacteriocins and adhesins [73]. Moreover, genes encoding T1SS-secreted agglutinin repeats-in-toxin (RTX) proteins, which are diverse multifunctional proteins with important roles in pathogenesis in many bacterial species, including V. cholerae [74] and L. pneumophila [75], were also identified in both bacteria. The presence of T1SS genes and genes encoding T1SS-secreted toxins suggests that CC99 and HT99 each have a functional T1SS.

Genes for components of functional Type II, III, V or VI secretion systems were not identified in either CC99 or HT99. However, genes for the Dot/Icm (Defect in organelle trafficking/Intracellular multiplication) type IV secretion system (T4SS) were identified in both bacteria. The dot/icm T4SS is a class B type IV secretion apparatus, composed of at least 27 protein components, that spans the inner and outer bacterial cell membranes [76]. Found in pathogens such as L. pneumophila and C. burnetii, this specialized transport apparatus allows the delivery of a large repertoire of virulence factors (dot/icm substrates) into host cells [38, 76, 77]. Within host cells, dot/icm substrates have been shown to manipulate various host cell processes, allowing the establishment of a replication vacuole and intracellular growth in L. pneumophila and C. burnetii [78–80]. We identified 17 and 18 of the 27 L. pneumophila dot/icm genes in CC99 and HT99, respectively (Fig 6). Dot/icm genes in CC99 and HT99 share significant homology to those of L. pneumophila (S3 Table). The presence of dot/icm genes suggests that both bacteria likely utilize the dot/icm T4SS to deliver effectors to host cells to mediate successful infection and intracellular replication.
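As a quick worked check of the gene counts reported above, the share of the 27 L. pneumophila dot/icm genes with homologs in each genome can be computed directly. This is a minimal sketch using only the numbers stated in the text; the variable and function names are introduced here for illustration.

```python
# Number of L. pneumophila dot/icm genes used as the reference set (from the text).
TOTAL_LP_DOT_ICM = 27

# dot/icm homologs identified per genome (from the text).
identified = {"CC99": 17, "HT99": 18}


def coverage(found, total=TOTAL_LP_DOT_ICM):
    """Fraction of the reference dot/icm gene set with an identified homolog."""
    return found / total


for strain, n in identified.items():
    print(f"{strain}: {n}/{TOTAL_LP_DOT_ICM} = {coverage(n):.1%}")
```

Roughly two thirds of the reference gene set is therefore represented in each genome. The calculation says nothing about which specific components are missing.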

Fig 6. Dot/icm T4SS in “Ca. B. cookevillensis” (CC99) and “Ca. B. aquae” (HT99) with comparisons to C. burnetii and L. pneumophila.

(A) Presumed location and topological relationships of dot/icm proteins. Figure adapted from [77]. (B) Genomic organization of the genes encoding dot/icm proteins in CC99 and HT99. Genes dotC, dotD, dotF/icmG, dotG/icmE and dotH/icmK encode the core proteins of the complex. Genes icmS and icmW encode components involved in substrate recognition and secretion. Both bacteria are missing the gene encoding LvgA, a protein identified as a potential fifth chaperone to IcmS. Genes dotA, dotE, dotI, dotP and icmV, which encode inner membrane proteins, are also present in both bacteria. Genes dotI and dotJ (absent in both bacteria) encode an integral inner membrane protein. Genes dotL, dotM and dotN encode membrane protein components involved in the recruitment of effector proteins. Genes dotB and dotO encode cytoplasmic ATPase complex proteins. Genes dotV and dotJ (encoding inner membrane proteins), icmQ (encoding a pore-forming cytoplasmic protein) and icmR (encoding a chaperone for IcmQ) are absent in both bacteria.

Several type IV pili (Tfp) genes were identified in both bacteria. Evolutionarily related to components of type II secretion systems, Tfp are multifunctional appendages that are associated with virulence functions, such as promoting host cell adherence, twitching motility, and biofilm formation, in many bacterial species, including Legionella species [81] and F. tularensis [82]. Genes for major prepilin (pilA), assembly proteins (pilB, pilC, pilD, pilN, pilQ), minor pilins (pilE, pilV, pilW, pilX and fimT), ATPase component (pilU), twitching motility protein (pilT) and a prepilin peptidase/N-methyltransferase (pilD) were identified in both CC99 and HT99. C. burnetii also contains a number of genes for Tfp assembly but is missing the pilT and pilU genes that are critical for pilus function [83]. A functional Tfp expressed on the surface of CC99 and HT99 may play a role in promoting adherence of these bacteria to host cells.

Effector proteins

Many intracellular bacteria rely on effectors to promote their uptake, subvert essential cellular pathways, acquire essential nutrients, and suppress host immune responses [84]. When translocated to host cells, effectors can exert their function either by activating or inhibiting the function of host proteins or by mimicking the function of endogenous host proteins [85]. Using bioinformatics-guided approaches, we identified 218 and 286 genes that encode putative effectors in CC99 and HT99, respectively (S4 Table). Among those identified were genes encoding proteins with domains and/or motifs frequently or primarily found in eukaryotic proteins. Proteins containing eukaryotic or eukaryotic-like domains and/or motifs, thought likely to have been acquired via interdomain horizontal gene transfer, have been described in many intracellular bacteria, such as L. pneumophila and C. burnetii [86, 87].

In CC99 and HT99, we identified 26 and 63 genes encoding proteins containing eukaryotic-like ankyrin repeats (ANK), respectively (S4 Table). ANKs are protein-protein interaction motifs of tandemly repeated modules of 30–34 amino acids, which are present in eukaryotic proteins involved in various cellular functions, including transcription regulation, signal transduction, vesicular trafficking, and cytoskeleton integrity [88]. Proteins containing ANKs have been identified in intracellular pathogens such as L. pneumophila [89], C. burnetii [90], Orientia tsutsugamushi [91] and Wolbachia spp. [92]. Many of these proteins have effector functions that promote host-cell invasion and replication by interfering with host cell functions such as vesicular transport and by preventing pathogen-induced apoptosis [93].

We also identified genes encoding proteins with tetratricopeptide repeats (TPR) (CC99: 13; HT99: 7) and Sel1 repeats (CC99: 6; HT99: 8) in both bacteria (S4 Table). TPRs, typically consisting of 34 amino acid residues arranged in tandem repeats, also facilitate protein-protein interactions and assembly of multi-protein complexes [94]. Proteins containing TPR motifs are also present in both C. burnetii [86] and L. pneumophila [95]. TPR motifs have also been implicated in virulence-associated functions of other bacterial pathogens such as O. tsutsugamushi [96] and P. aeruginosa [97]. The Sel1 repeat (SLR) motif, a subclass of the TPR motif with a similar consensus sequence but a variable length, has also been implicated in virulence of bacterial pathogens [98, 99].

Genes encoding proteins with eukaryotic-like serine/threonine protein kinases (STPKs; CC99: 18; HT99: 23), leucine rich repeats (LRR; CC99: 6; HT99: 6), F-Box (CC99: 1; HT99: 6), and U-Box (CC99: 0; HT99: 1) domains were also identified (S4 Table). In eukaryotic organisms, STPKs control phosphorylation states of substrates involved in intracellular signaling pathways. By adding or removing phosphate groups of protein substrates, STPKs function as on/off switches to activate or deactivate specific intracellular signaling pathways [100]. Bacterial effectors with structural and functional similarities to eukaryotic STPKs and phosphatases have been identified in several intracellular pathogens, including L. pneumophila [101], C. burnetii [86], and M. tuberculosis [102]. LRR-containing proteins have been implicated in the virulence of intracellular pathogens such as Listeria monocytogenes [103] and Salmonella enterica [104]. F-box and U-box domains have also been identified in genomes of numerous human and plant bacterial pathogens [105].
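The per-domain gene counts reported in this section are scattered across several paragraphs, so it can help to collect them in one place. The sketch below tabulates only the numbers stated in the text. It assumes nothing about S4 Table beyond those counts, and the column sums may double-count any gene that carries more than one domain type, so they should not be read as distinct gene totals.

```python
# Genes encoding proteins with eukaryotic-like domains, as (CC99, HT99) counts
# taken from the text. Categories may overlap (an assumption to keep in mind).
DOMAIN_COUNTS = {
    "ANK": (26, 63),
    "TPR": (13, 7),
    "Sel1": (6, 8),
    "STPK": (18, 23),
    "LRR": (6, 6),
    "F-box": (1, 6),
    "U-box": (0, 1),
}

# Total putative effector genes per genome, from the text.
PUTATIVE_EFFECTORS = {"CC99": 218, "HT99": 286}


def column_sums(counts):
    """Sum the per-domain counts for each genome (not de-duplicated)."""
    cc99 = sum(cc for cc, _ in counts.values())
    ht99 = sum(ht for _, ht in counts.values())
    return cc99, ht99


print(f"{'Domain':<8}{'CC99':>6}{'HT99':>6}")
for domain, (cc, ht) in DOMAIN_COUNTS.items():
    print(f"{domain:<8}{cc:>6}{ht:>6}")

cc_sum, ht_sum = column_sums(DOMAIN_COUNTS)
print(f"{'Sum':<8}{cc_sum:>6}{ht_sum:>6}")
```

Laid out this way, the largest difference between the two genomes is in the ANK-containing genes (26 versus 63), while the LRR counts are identical.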

Stress genes

During colonization, infection, and transmission from the host, intracellular bacteria can encounter many stresses, such as limited nutrients, changes in pH, temperature and osmolarity, and exposure to oxidative stress and toxic molecules, such as reactive oxygen species (ROS), peroxides, metals, and antibiotics. Therefore, bacteria must be able to quickly respond and adapt to these stresses. In both CC99 and HT99, we identified several genes involved in oxidative stress responses, including catalase (katE), superoxide dismutase [Fe] (sodB), superoxide dismutase [Cu-Zn] (sodC), cytochrome c551 peroxidase (ccpA), glutathione reductase (gor), thioredoxin peroxidase (btuE), alkyl hydroperoxide reductase C (ahpC), and rubredoxin (rubA). We also identified a gene encoding the catalase peroxidase enzyme (katG) in CC99. This enzyme has been shown to play a role in replication of Mycobacterium tuberculosis and F. tularensis in host cells by contributing to resistance to killing mediated by ROS and reactive nitrogen species (RNS) [106, 107]. Other stress response genes identified include stringent starvation protein genes (sspA, sspB), hfl operon genes (hflq, hflK, hflC), and periplasmic stress sensor genes (degS, rseA).

Both bacteria contain the gene (relA/spoT) encoding a putative RelA/bifunctional synthetase/hydrolase SpoT homolog protein. Also present in L. pneumophila and C. burnetii [108], the RelA/SpoT protein controls the synthesis and degradation of the small alarmone nucleotides guanosine tetraphosphate (ppGpp) and guanosine pentaphosphate (pppGpp). These nucleotides are collectively referred to as (p)ppGpp and are synthesized in response to high stress and low nutrient conditions, allowing gram-negative bacteria to initiate an adaptive global physiological response known as the stringent response [109]. Accumulation of (p)ppGpp in bacterial cells (in cooperation with the small RNA polymerase binding protein DksA) triggers a response resulting in modification of RNA polymerase and control of transcriptional activity such that bacteria rapidly express factors crucial for nutrient acquisition and stress survival [109]. Besides promoting physiological adaptation during nutritional starvation, the stringent response mediated by (p)ppGpp is involved in the pathogenesis of several pathogens, including promoting adherence of enterohemorrhagic E. coli (EHEC) [110], promoting invasion of host cells by S. enterica serovar Typhimurium [111], and transmission of L. pneumophila into host cells [112].

We also identified genes encoding putative two-component systems (TCS) in both bacteria. Widely found in bacteria, TCS are signal transduction pathways composed of a sensor histidine kinase (HK) and a response regulator (RR) protein, typically encoded by a pair of adjacent genes, that allow bacteria to detect and respond to changes in the environment [113]. Those identified include fleS-fleR, which regulate expression of flagellar genes in P. aeruginosa [114]; cheA-cheY, which regulate bacterial chemotaxis through counterclockwise/clockwise rotation in Bacillus subtilis [115]; barA-uvrY, which regulate efficient switching between different carbon sources in E. coli [116]; and pilS-pilR, which regulate fimbriae expression in P. aeruginosa [117].
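Because the sensor kinase and response regulator of a TCS are typically encoded by adjacent genes, one simple way to look for candidate systems in an ordered annotation is to scan neighbouring gene pairs. The following is a hypothetical sketch, not the procedure used in this study: the function name `adjacent_tcs_pairs` and the example gene order are invented for illustration, and only the four gene pairs come from the text.

```python
# Two-component system gene pairs named in the text (sensor kinase, regulator).
KNOWN_PAIRS = {
    ("fleS", "fleR"),
    ("cheA", "cheY"),
    ("barA", "uvrY"),
    ("pilS", "pilR"),
}


def adjacent_tcs_pairs(gene_order, known_pairs=KNOWN_PAIRS):
    """Return known TCS pairs that sit next to each other, in either order."""
    hits = []
    for left, right in zip(gene_order, gene_order[1:]):
        if (left, right) in known_pairs or (right, left) in known_pairs:
            hits.append((left, right))
    return hits


# Hypothetical gene order along a contig; NOT taken from CC99 or HT99.
example_order = [
    "geneA", "fleS", "fleR", "geneB", "pilR", "pilS", "cheA", "geneC", "cheY",
]

print(adjacent_tcs_pairs(example_order))
```

In this made-up example the fleS-fleR and pilR-pilS pairs are reported, while cheA and cheY are not because another gene separates them. A real analysis would also need to handle pairs that are not adjacent, which is why the text says "typically".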


Here, we present genomic descriptions and analyses of two intracellular bacteria isolated from amoebae found in human-constructed water systems. Both ‘Ca. B. cookevillensis’ (strain CC99) and ‘Ca. B. aquae’ (strain HT99) have been described as bacterial obligate intracellular parasites of amoebae (recently coined as BOIP by Sanchez and Omsland, 2021 [118]). CC99 is also an obligate intracellular parasite of mammalian cell lines. Both bacteria replicate within BCVs that are closely associated with or within the nucleus, resulting in lysis of the host cell [21]. Although this genomic study provides insight into gene products that contribute to the metabolism and virulence of these bacteria, it remains unclear why CC99 but not HT99 is able to infect mammalian cells. Notable differences between these two bacteria include their amino acid auxotrophies (9 for CC99 and 12 for HT99) and the lack of a complete folate biosynthesis pathway in HT99. Ongoing efforts in our laboratories investigating intracellular trafficking and the compartments with which these bacteria associate, as well as RNA sequencing analyses during different stages of infection, will help to elucidate such differences. The obligate lifestyles of these bacteria have also limited genetic studies and manipulations; however, this study has identified significant nutritional deficiencies for amino acids that provide insight into the development of media for axenic growth.

Supporting information

S1 Fig. Electron micrograph of cells infected with “Ca. B. cookevillensis” (CC99).

Transmission electron micrograph of a THP-1 cell infected with CC99. CC99-CV appears within the nucleus at 24 hours post infection (12,000x magnification).


S1 Table. Annotated genomes of “Ca. B. cookevillensis” (CC99) and “Ca. B. aquae” (HT99).


S2 Table. Genes involved in amino acid biosynthesis reactions.


S3 Table. Comparison of “Ca. B. cookevillensis” (CC99) and “Ca. B. aquae” (HT99) dot/icm genes with L. pneumophila dot/icm genes.


S4 Table. Genes encoding putative effector proteins in “Ca. B. cookevillensis” (CC99) and “Ca. B. aquae” (HT99).


S1 Text. Amino acid biosynthesis pathways of “Ca. B. cookevillensis” (CC99) and “Ca. B. aquae” (HT99).



The authors of this manuscript would like to acknowledge Joyce Miller of the MIMIC Center of MTSU for technical assistance with electron micrographs.


  1. 1. Samba-Louaka A, Delafont V, Rodier MH, Cateau E, Héchard Y. Free-living amoebae and squatters in the wild: ecological and molecular features. FEMS Microbiol. Rev. 2019 Jul 1;43(4):415–34. pmid:31049565
  2. 2. Sandstrom G, Saeed A, Abd H. Acanthamoeba-Bacteria: A model to study host interaction with human pathogens. Curr. Drug Targets. 2011;12(7):936–41. pmid:21366523
  3. 3. Greub G, Raoult D. Microorganisms resistant to free-living amoebae. Clin. Microbiol. Rev. 2004 Apr;17(2):413–33. pmid:15084508
  4. 4. Schulz F, Lagkouvardos I, Wascher F, Aistleitner K, Kostanjšek R, Horn M. Life in an unusual intracellular niche: a bacterial symbiont infecting the nucleus of amoebae. ISMEJ. 2014 Aug 1;8(8):1634–44.
  5. 5. Fritsche TR, Horn M, Seyedirashti S, Gautom RK, Schleifer KH, Wagner M. In situ detection of novel bacterial endosymbionts of Acanthamoeba spp. phylogenetically related to members of the Order Rickettsiales. Appl. Environ. Microbiol. 1999 Jan 1;65(1):206.
  6. 6. Horn M, Fritsche TR, Gautom RK, Schleifer KH, Wagner M. Novel bacterial endosymbionts of Acanthamoeba spp. related to the Paramecium caudatum symbiont Caedibacter caryophilus. Environ. Microbiol. 1999 Aug 1;1(4):357–67.
  7. 7. Horn M, Fritsche TR, Linner T, Gautom RK, Harzenetter MD, Wagner M. Obligate bacterial endosymbionts of Acanthamoeba spp. related to the beta-Proteobacteria: proposal of “Candidatus Procabacter acanthamoebae”; gen. nov., sp. nov. Int. J. Syst. Evol. Microbiol. 2002;52(2):599–605.
  8. 8. Denet E, Coupat-Goutaland B, Nazaret S, Pélandakis M, Favre-Bonté S. Diversity of free-living amoebae in soils and their associated human opportunistic bacteria. Parasitol. Res. 2017 Nov 1;116(11):3151–62. pmid:28988383
  9. 9. Birtles RJ, Rowbotham TJ, Raoult D, Harrison TG. Phylogenetic diversity of intra-amoebal legionellae as revealed by 16S rRNA gene sequence comparison. Microbiology. 1996;142(12):3525–30. pmid:9004515
  10. 10. Horn M, Harzenetter MD, Linner T, Schmid EN, Müller KD, Michel R, et al. Members of the Cytophaga–Flavobacterium–Bacteroides phylum as intracellular bacteria of acanthamoebae: proposal of ‘Candidatus Amoebophilus asiaticus.’ Environ. Microbiol. 2001 Jul 1;3(7):440–9.
  11. 11. Fritsche TR, Horn M, Wagner M, Herwig RP, Schleifer KH, Gautom RK. Phylogenetic diversity among geographically dispersed Chlamydiales endosymbionts recovered from clinical and environmental isolates of Acanthamoeba spp. Appl. Environ. Microbiol. 2000 Jun 1;66(6):2613.
  12. 12. Amann R, Springer N, Schönhuber W, Ludwig W, Schmid EN, Müller KD, et al. Obligate intracellular bacterial parasites of Acanthamoebae related to Chlamydia spp. Appl. Environ. Microbiol. 1997 Jan 1;63(1):115.
  13. 13. Cirillo JD, Falkow S, Tompkins LS, Bermudez LE. Interaction of Mycobacterium avium with environmental amoebae enhances virulence. Infect. Immun. 1997 Sep 1;65(9):3759–67.
  14. 14. Rowbotham TJ. Preliminary report on the pathogenicity of Legionella pneumophila for freshwater and soil amoebae. J. Clin. Pathol. 1980 Dec 1;33(12):1179.
  15. 15. Greub G, Raoult D. Parachlamydiaceae: potential emerging pathogens. Emerg. Infect. Dis. 2002 Jun;8(6):625–30.
  16. 16. Adeleke A, Pruckler J, Benson R, Rowbotham T, Halablab M, Fields B. Legionella-like amebal pathogens—phylogenetic status and possible role in respiratory disease. Emerg. Infect. Dis. 1996;2(3):225–30. pmid:8903235
  17. 17. Mehari YT, Jason Hayes B, Redding KS, Mariappan PVG, Gunderson JH, Farone AL, et al. Description of ‘Candidatus Berkiella aquae’ and ‘Candidatus Berkiella cookevillensis’, two intranuclear bacteria of freshwater amoebae. Int. J. Syst. Evol. Microbiol. 2016;66(2):536–41.
  18. 18. Hugoson E, Guliaev A, Ammunét T, Guy L. Host adaptation in Legionellales is 1.9 Ga, coincident with eukaryogenesis. Mol. Biol. Evol. 2022 Mar;39(3). pmid:35167692
  19. 19. Baca OG, Paretsky D. Q fever and Coxiella burnetii: a model for host-parasite interactions. Microbiol. Rev. 1983 Jun;47(2):127–49.
  20. 20. McDade JE, Shepard CC, Fraser DW, Tsai TR, Redus MA, Dowdle WR. Legionnaires’ Disease—isolation of a bacterium and demonstration of its role in other respiratory disease. N. Engl. J. Med. 1977 Dec 1;297(22):1197–203. pmid:335245
  21. 21. Chamberlain NB, Mehari YT, Hayes BJ, Roden CM, Kidane DT, Swehla AJ, et al. Infection and nuclear interaction in mammalian cells by ‘Candidatus Berkiella cookevillensis’, a novel bacterium isolated from amoebae. BMC Microbiol. 2019 May 9;19(1):91. pmid:31072343
  22. 22. Shannon JG, Heinzen RA. Infection of human monocyte-derived macrophages with Coxiella burnetii. Methods Mol. Biol. 2008; 431:189–200.
  23. 23. Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017 May 1;27(5):722–36. pmid:28298431
  24. 24. Kronenberg ZN, Hall RJ, Hiendleder S, Smith TPL, Sullivan ST, Williams JL, et al. FALCON-Phase: integrating PacBio and Hi-C data for phased diploid genomes. bioRxiv. 2018 Jan 1;327064.
  25. 25. Tatusova T, DiCuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, et al. NCBI prokaryotic genome annotation pipeline. Nucleic Acids Res. 2016 Aug 19;44(14):6614–24. pmid:27342282
  26. 26. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: rapid annotations using subsystems technology. BMC Genom. 2008 Feb 8;9(1):75. pmid:18261238
  27. 27. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015 Oct 1;31(19):3210–2. pmid:26059717
  28. 28. Moriya Y, Itoh M, Okuda S, Yoshizawa AC, Kanehisa M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 2007 Jul 1;35(suppl_2):W182–5. pmid:17526522
  29. 29. Karp PD, Ouzounis CA, Moore-Kochlacs C, Goldovsky L, Kaipa P, Ahrén D, et al. Expansion of the BioCyc collection of pathway/genome databases to 160 genomes. Nucleic Acids Res. 2005 Oct 1;33(19):6083–9. pmid:16246909
  30. 30. Schultz J, Milpetz F, Bork P, Ponting CP. SMART, a simple modular architecture research tool: identification of signaling domains. Proc. Natl. Acad. Sci. U.S.A. 1998 May 26;95(11):5857. pmid:9600884
  31. 31. Bateman A, Coin L, Durbin R, Finn RD, Hollich V, Griffiths‐Jones S, et al. The Pfam protein families database. Nucleic Acids Res. 2004 Jan 1;32(suppl_1):D138–41. pmid:14681378
  32. 32. Marchler-Bauer A, Derbyshire MK, Gonzales NR, Lu S, Chitsaz F, Geer LY, et al. CDD: NCBI’s conserved domain database. Nucleic Acids Res. 2015 Jan 28;43(D1):D222–6. pmid:25414356
  33. 33. Bi D, Liu L, Tai C, Deng Z, Rajakumar K, Ou HY. SecReT4: a web-based bacterial type IV secretion system resource. Nucleic Acids Res. 2013 Jan 1;41(D1):D660–5. pmid:23193298
  34. 34. Eichinger V, Nussbaumer T, Platzer A, Jehl MA, Arnold R, Rattei T. EffectiveDB—updates and novel features for a better annotation of bacterial secreted proteins and Type III, IV, VI secretion systems. Nucleic Acids Res. 2016 Jan 4;44(D1):D669–74. pmid:26590402
  35. 35. Chen L, Yang J, Yu J, Yao Z, Sun L, Shen Y, et al. VFDB: a reference database for bacterial virulence factors. Nucleic Acids Res. 2005 Jan 1;33(suppl_1):D325–8. pmid:15608208
  36. 36. Land M, Hauser L, Jun SR, Nookaew I, Leuze MR, Ahn TH, et al. Insights from 20 years of bacterial genome sequencing. Funct. Integr. Genomics. 2015 Mar 1;15(2):141–61. pmid:25722247
  37. 37. Gomez-Valero L, Rusniok C, Rolando M, Neou M, Dervins-Ravault D, Demirtas J, et al. Comparative analyses of Legionella species identifies genetic features of strains causing Legionnaires’ disease. Genome Biol. 2014 Nov 3;15(11):505. pmid:25370836
  38. 38. Seshadri R, Paulsen IT, Eisen JA, Read TD, Nelson KE, Nelson WC, et al. Complete genome sequence of the Q-fever pathogen Coxiella burnetii. Proc. Natl. Acad. Sci. U.S.A. 2003 Apr 29;100(9):5455.
  39. 39. Galperin MY, Kristensen DM, Makarova KS, Wolf YI, Koonin EV. Microbial genome analysis: the COG approach. Brief. Bioinform. 2019 Jul 19;20(4):1063–70. pmid:28968633
  40. 40. Versalovic J, Koeuth T, Britton R, Geszvain K, Lupski JR. Conservation and evolution of the rpsU-dnaG-rpoD macromolecular synthesis operon in bacteria. Mol. Microbiol. 1993 Apr 1;8(2):343–55. pmid:8316085
  41. 41. Vallejo Esquerra E, Yang H, Sanchez SE, Omsland A. Physicochemical and nutritional requirements for axenic replication suggest physiological basis for Coxiella burnetii niche restriction. Front. Cell. Infect. Microbiol. 2017;7:190.
  42. 42. Häuslein I, Cantet F, Reschke S, Chen F, Bonazzi M, Eisenreich W. Multiple substrate usage of Coxiella burnetii to feed a bipartite-type metabolic network. Front. Cell. Infect. Microbiol. 2017;7:285.
  43. 43. Blötz C, Stülke J. Glycerol metabolism and its implication in virulence in Mycoplasma. FEMS Microbiol. Rev. 2017 Sep 1;41(5):640–52. pmid:28961963
  44. 44. Schunder E, Gillmaier N, Kutzner E, Herrmann V, Lautner M, Heuner K, et al. Amino Acid Uptake and Metabolism of Legionella pneumophila Hosted by Acanthamoeba castellanii. J. Biol. Chem. 2014 Jul 25;289(30):21040–54.
  45. 45. Eylert E, Herrmann V, Jules M, Gillmaier N, Lautner M, Buchrieser C, et al. Isotopologue profiling of Legionella pneumophila: role of serine and glucose as carbon substrates. J. Biol. Chem. 2010 Jul 16;285(29):22232–43.
  46. 46. Rytter H, Jamet A, Ziveri J, Ramond E, Coureuil M, Lagouge-Roussey P, et al. The pentose phosphate pathway constitutes a major metabolic hub in pathogenic Francisella. PLOS Pathog. 2021 Aug 2;17(8):e1009326.
  47. 47. Häuslein I, Manske C, Goebel W, Eisenreich W, Hilbi H. Pathway analysis using 13C-glycerol and other carbon tracers reveals a bipartite metabolism of Legionella pneumophila. Mol. Microbiol. 2016 Apr 1;100(2):229–46.
  48. 48. Chien M, Morozova I, Shi S, Sheng H, Chen J, Gomez SM, et al. The genomic sequence of the accidental pathogen Legionella pneumophila. Science. 2004 Sep 24;305(5692):1966–8.
  49. 49. Sauer U, Eikmanns BJ. The PEP—pyruvate—oxaloacetate node as the switch point for carbon flux distribution in bacteria. FEMS Microbiol. Rev. 2005 Sep 1;29(4):765–94. pmid:16102602
  50. 50. Larsson P, Oyston PCF, Chain P, Chu MC, Duffield M, Fuxelius HH, et al. The complete genome sequence of Francisella tularensis, the causative agent of tularemia. Nat. Genet. 2005 Feb 1;37(2):153–9.
  51. 51. Muñoz-Elías EJ, McKinney JD. Carbon metabolism of intracellular bacteria. Cell. Microbiol. 2006 Jan 1;8(1):10–22. pmid:16367862
  52. 52. D’Mello R, Hill S, Poole RK. The cytochrome bd quinol oxidase in Escherichia coli has an extremely high oxygen affinity and two oxygen-binding haems: implications for regulation of activity in vivo by oxygen inhibition. Microbiology. 1996;142(4):755–63.
  53. 53. Zhu L, Lin J, Ma J, Cronan JE, Wang H. Triclosan resistance of Pseudomonas aeruginosa PAO1 is due to FabV, a triclosan-resistant enoyl-acyl carrier protein reductase. Antimicrob. Agents Chemother. 2010 Feb 1;54(2):689.
  54. 54. Massengo-Tiassé RP, Cronan JE. Vibrio cholerae FabV defines a new class of enoyl-acyl carrier protein reductase. J. Biol. Chem. 2008 Jan 18;283(3):1308–16.
  55. 55. Fujita Y, Matsuoka H, Hirooka K. Regulation of fatty acid metabolism in bacteria. Mol. Microbiol. 2007 Nov 1;66(4):829–39. pmid:17919287
  56. 56. Odom AR. Five questions about non-mevalonate isoprenoid biosynthesis. PLOS Pathog. 2011 Dec 22;7(12):e1002323. pmid:22216001
  57. 57. Pasternak Z, Pietrokovski S, Rotem O, Gophna U, Lurie-Weinberger MN, Jurkevitch E. By their genes ye shall know them: genomic signatures of predatory bacteria. ISME J. 2013 Apr 1;7(4):756–69. pmid:23190728
  58. 58. Kowalczyk B, Chmiel E, Palusinska-Szysz M. The role of lipids in Legionella-host interaction. Int. J. Mol. Sci. 2021;22(3).
  59. 59. Stead CM, Cockrell DC, Beare PA, Miller HE, Heinzen RA. A Coxiella burnetii phospholipase A homolog pldA is required for optimal growth in macrophages and developmental form lipid remodeling. BMC Microbiol. 2018 Apr 16;18(1):33.
  60. 60. Bruckert WM, Price CT, Abu Kwaik Y, Bäumler AJ. Rapid nutritional remodeling of the host cell upon attachment of Legionella pneumophila. Infect. Immun. 2014 Jan 1;82(1):72–82.
  61. 61. Price CTD, Al-Quadan T, Santic M, Rosenshine I, Abu Kwaik Y. Host proteasomal degradation generates amino acids essential for intracellular bacterial growth. Science. 2011 Dec 16;334(6062):1553–7. pmid:22096100
  62. 62. Tesh MJ, Miller RD. Amino acid requirements for Legionella pneumophila growth. J. Clin. Microbiol. 1981 May 1;13(5):865–9.
  63. 63. Kawai S, Mori S, Mukai T, Hashimoto W, Murata K. Molecular characterization of Escherichia coli NAD kinase. Eur. J. Biochem. 2001 Aug 1;268(15):4359–65.
  64. 64. Streit WR, Entcheva P. Biotin in microbes, the genes involved in its biosynthesis, its biochemical role and perspectives for biotechnological production. Appl. Microbiol. Biotechnol. 2003 Mar 1;61(1):21–31. pmid:12658511
  65. 65. Rodionov DA, Mironov AA, Gelfand MS. Conservation of the biotin regulon and the BirA regulatory signal in Eubacteria and Archaea. Genome Res. 2002 Oct;12(10):1507–16. pmid:12368242
  66. 66. Bower S, Perkins JB, Yocum RR, Howitt CL, Rahaim P, Pero J. Cloning, sequencing, and characterization of the Bacillus subtilis biotin biosynthetic operon. J. Bacteriol. 1996 Jul 1;178(14):4122.
  67. 67. Lin S, Hanson RE, Cronan JE. Biotin synthesis begins by hijacking the fatty acid synthetic pathway. Nat. Chem. Biol. 2010 Sep 1;6(9):682–8. pmid:20693992
  68. 68. Chevance FFV, Hughes KT. Coordinating assembly of a bacterial macromolecular machine. Nat. Rev. Microbiol. 2008 Jun;6(6):455–65. pmid:18483484
  69. 69. Liu X, Matsumura P. The FlhD/FlhC complex, a transcriptional activator of the Escherichia coli flagellar class II operons. J. Bacteriol. 1994 Dec 1;176(23):7345. pmid:7961507
  70. 70. Albert-Weissenberger C, Sahr T, Sismeiro O, Hacker J, Heuner K, Buchrieser C. Control of flagellar gene regulation in Legionella pneumophila and its relation to growth phase. J. Bacteriol. 2010 Jan 15;192(2):446.
  71. 71. Jacobi S, Schade R, Heuner K. Characterization of the alternative sigma factor σ54 and the transcriptional regulator FleQ of Legionella pneumophila, which are both involved in the regulation cascade of flagellar gene expression. J. Bacteriol. 2004 May 1;186(9):2540.
  72. 72. Holland IB, Schmitt L, Young J. Type 1 protein secretion in bacteria, the ABC-transporter dependent pathway. Mol. Membr. Biol. 2005 Jan 1;22(1–2):29–39.
  73. 73. Masi M, Wandersman C. Multiple signals direct the assembly and function of a Type 1 Secretion System. J. Bacteriol. 2010 Aug 1;192(15):3861–9. pmid:20418390
  74. 74. Sheahan KL, Cordero CL, Satchell KJF. Autoprocessing of the Vibrio cholerae RTX toxin by the cysteine protease domain. EMBO J. 2007 May 16;26(10):2552–61.
  75. 75. D’Auria G, Jiménez N, Peris-Bondia F, Pelaz C, Latorre A, Moya A. Virulence factor rtx in Legionella pneumophila, evidence suggesting it is a modular multifunctional protein. BMC Genom. 2008 Jan 14;9(1):14.
  76. 76. Segal G, Feldman M, Zusman T. The Icm/Dot type-IV secretion systems of Legionella pneumophila and Coxiella burnetii. FEMS Microbiol. Rev. 2005 Jan 1;29(1):65–81.
  77. 77. Isberg RR O’Connor TJ, Heidtman M. The Legionella pneumophila replication vacuole: making a cosy niche inside host cells. Nat. Rev. Microbiol. 2009 Jan 1;7(1):13–24.
  78. 78. Beare PA, Gilk SD, Larson CL, Hill J, Stead CM, Omsland A, et al. Dot/Icm Type IVB secretion system requirements for Coxiella burnetii growth in human macrophages. mBio. 2011 Sep 1;2(4):e00175–11.
  79. 79. Coers J, Kagan JC, Matthews M, Nagai H, Zuckman DM, Roy CR. Identification of Icm protein complexes that play distinct roles in the biogenesis of an organelle permissive for Legionella pneumophila intracellular growth. Mol. Microbiol. 2000 Nov 1;38(4):719–36.
  80. 80. Vogel JosephP Andrews HL, Wong SK Isberg RR. Conjugative transfer by the virulence system of Legionella pneumophila. Science. 1998 Feb 6;279(5352):873.
  81. 81. Stone BJ, Kwaik YA. Expression of Multiple Pili by Legionella pneumophila: Identification and characterization of a Type IV pilin gene and its role in adherence to mammalian and protozoan cells. Infect. Immun. 1998 Apr 1;66(4):1768. pmid:9529112
  82. 82. Näslund Salomonsson E, Forslund AL, Forsberg Å. Type IV Pili in Francisella–A virulence trait in an intracellular pathogen. Front. Microbiol. 2011;2:29. pmid:21687421
  83. 83. Adams DW, Pereira JM, Stoudmann C, Stutzmann S, Blokesch M. The type IV pilus protein PilU functions as a PilT-dependent retraction ATPase. PLOS Genet. 2019 Sep 16; 15(9):e1008393. pmid:31525185
  84. 84. Galán JE. Common themes in the design and function of bacterial effectors. Cell Host Microbe. 2009 Jun 18;5(6):571–9. pmid:19527884
  85. 85. Stebbins CE, Galán JE. Structural mimicry in bacterial virulence. Nature. 2001 Aug 1;412(6848):701–5. pmid:11507631
  86. 86. Beare PA, Unsworth N, Andoh M, Voth DE, Omsland A, Gilk SD, et al. Comparative genomics reveal extensive transposon-mediated genomic plasticity and diversity among potential effector proteins within the genus Coxiella. Infect. Immun. 2008/12/01 ed. 2009 Feb;77(2):642–56.
  87. 87. de Felipe KS, Pampou S, Jovanovic OS, Pericone CD, Ye SF, Kalachikov S, et al. Evidence for acquisition of Legionella type IV secretion substrates via interdomain horizontal gene transfer. J. Bacteriol. 2005 Nov;187(22):7716–26.
  88. 88. Mosavi LK, Cammett TJ, Desrosiers DC, Peng ZY. The ankyrin repeat as molecular architecture for protein recognition. Protein Sci. 2004 Jun 1;13(6):1435–48. pmid:15152081
  89. 89. Cazalet C, Rusniok C, Brüggemann H, Zidane N, Magnier A, Ma L, et al. Evidence in the Legionella pneumophila genome for exploitation of host cell functions and high genome plasticity. Nat. Genet. 2004 Nov 1;36(11):1165–73.
  90. 90. Voth DE, Howe D, Beare PA, Vogel JP, Unsworth N, Samuel JE, et al. The Coxiella burnetii ankyrin repeat domain-containing protein family is heterogeneous, with C-terminal truncations that influence Dot/Icm-mediated secretion. J. Bacteriol. 2009/05/01 ed. 2009 Jul;191(13):4232–42.
  91. Cho NH, Kim HR, Lee JH, Kim SY, Kim J, Cha S, et al. The Orientia tsutsugamushi genome reveals massive proliferation of conjugative type IV secretion system and host–cell interaction genes. Proc. Natl. Acad. Sci. U.S.A. 2007 May 8;104(19):7981.
  92. Walker T, Klasson L, Sebaihia M, Sanders MJ, Thomson NR, Parkhill J, et al. Ankyrin repeat domain-encoding genes in the wPip strain of Wolbachia from the Culex pipiens group. BMC Biol. 2007 Sep 20;5(1):39.
  93. Pan X, Lührmann A, Satoh A, Laskowski-Arce MA, Roy CR. Ankyrin repeat proteins comprise a diverse family of bacterial type IV effectors. Science. 2008 Jun 20;320(5883):1651. pmid:18566289
  94. D’Andrea LD, Regan L. TPR proteins: the versatile helix. Trends Biochem. Sci. 2003 Dec 1;28(12):655–62. pmid:14659697
  95. Bandyopadhyay P, Sumer EU, Jayakumar D, Liu S, Xiao H, Steinman HM. Implication of proteins containing tetratricopeptide repeats in conditional virulence phenotypes of Legionella pneumophila. J. Bacteriol. 2012 Jul;194(14):3579–88.
  96. Bang S, Min CK, Ha NY, Choi MS, Kim IS, Kim YS, et al. Inhibition of eukaryotic translation by tetratricopeptide-repeat proteins of Orientia tsutsugamushi. J. Microbiol. 2016 Feb 1;54(2):136–44.
  97. Bröms JE, Edqvist PJ, Forsberg Å, Francis MS. Tetratricopeptide repeats are essential for PcrH chaperone function in Pseudomonas aeruginosa type III secretion. FEMS Microbiol. Lett. 2006 Mar 1;256(1):57–66.
  98. Li MS, Langford PR, Kroll JS. Inactivation of NMB0419, encoding a Sel1-like repeat (SLR) protein, in Neisseria meningitidis is associated with differential expression of genes belonging to the Fur regulon and reduced intraepithelial replication. Infect. Immun. 2017 May 1;85(5):e00574–16.
  99. Newton HJ, Sansom FM, Dao J, McAlister AD, Sloan J, Cianciotto NP, et al. Sel1 repeat protein LpnE is a Legionella pneumophila virulence determinant that influences vacuolar trafficking. Infect. Immun. 2007 Dec 1;75(12):5575.
  100. Huse M, Kuriyan J. The conformational plasticity of protein kinases. Cell. 2002 May 3;109(3):275–82. pmid:12015977
  101. Hervet E, Charpentier X, Vianney A, Lazzaroni JC, Gilbert C, Atlan D, et al. Protein kinase LegK2 is a type IV secretion system effector involved in endoplasmic reticulum recruitment and intracellular replication of Legionella pneumophila. Infect. Immun. 2011 May;79(5):1936–50.
  102. Av-Gay Y, Everett M. The eukaryotic-like Ser/Thr protein kinases of Mycobacterium tuberculosis. Trends Microbiol. 2000 May 1;8(5):238–44.
  103. Bierne H, Sabet C, Personnic N, Cossart P. Internalins: a complex family of leucine-rich repeat-containing proteins in Listeria monocytogenes. Microbes Infect. 2007 Aug 1;9(10):1156–66.
  104. Haraga A, Miller SI. A Salmonella type III secretion effector interacts with the mammalian serine/threonine protein kinase PKN1. Cell. Microbiol. 2006 May 1;8(5):837–46.
  105. Angot A, Vergunst A, Genin S, Peeters N. Exploitation of eukaryotic ubiquitin signaling pathways by effectors translocated by bacterial type III and type IV secretion systems. PLOS Pathog. 2007 Jan 26;3(1):e3. pmid:17257058
  106. Lindgren H, Shen H, Zingmark C, Golovliov I, Conlan W, Sjöstedt A. Resistance of Francisella tularensis strains against reactive nitrogen and oxygen species with special reference to the role of KatG. Infect. Immun. 2007 Mar 1;75(3):1303.
  107. Ng VH, Cox JS, Sousa AO, MacMicking JD, McKinney JD. Role of KatG catalase-peroxidase in mycobacterial pathogenesis: countering the phagocyte oxidative burst. Mol. Microbiol. 2004 Jun 1;52(5):1291–302. pmid:15165233
  108. Moormeier DE, Sandoz KM, Beare PA, Sturdevant DE, Nair V, Cockrell DC, et al. Coxiella burnetii RpoS regulates genes involved in morphological differentiation and intracellular growth. J. Bacteriol. 2019 Mar 26;201(8):e00009–19.
  109. Dalebroux ZD, Swanson MS. ppGpp: magic beyond RNA polymerase. Nat. Rev. Microbiol. 2012 Mar 1;10(3):203–12. pmid:22337166
  110. Nakanishi N, Abe H, Ogura Y, Hayashi T, Tashiro K, Kuhara S, et al. ppGpp with DksA controls gene expression in the locus of enterocyte effacement (LEE) pathogenicity island of enterohaemorrhagic Escherichia coli through activation of two virulence regulatory genes. Mol. Microbiol. 2006 Jul 1;61(1):194–205.
  111. Thompson A, Rolfe MD, Lucchini S, Schwerk P, Hinton JCD, Tedin K. The bacterial signal molecule, ppGpp, mediates the environmental regulation of both the invasion and intracellular virulence gene programs of Salmonella. J. Biol. Chem. 2006 Oct 6;281(40):30112–21.
  112. Hammer BK, Swanson MS. Co-ordination of Legionella pneumophila virulence with entry into stationary phase by ppGpp. Mol. Microbiol. 1999 Aug 1;33(4):721–31.
  113. Alm E, Huang K, Arkin A. The evolution of two-component systems in bacteria reveals different strategies for niche adaptation. PLOS Comput. Biol. 2006 Nov 3;2(11):e143. pmid:17083272
  114. Ritchings BW, Almira EC, Lory S, Ramphal R. Cloning and phenotypic characterization of fleS and fleR, new response regulators of Pseudomonas aeruginosa which regulate motility and adhesion to mucin. Infect. Immun. 1995 Dec 1;63(12):4868.
  115. Garrity LF, Ordal GW. Activation of the CheA kinase by asparagine in Bacillus subtilis chemotaxis. Microbiology. 1997;143:2945–51.
  116. Pernestig AK, Georgellis D, Romeo T, Suzuki K, Tomenius H, Normark S, et al. The Escherichia coli BarA-UvrY two-component system is needed for efficient switching between glycolytic and gluconeogenic carbon sources. J. Bacteriol. 2003 Feb 1;185(3):843.
  117. Hobbs M, Collie ESR, Free PD, Livingston SP, Mattick JS. PilS and PilR, a two-component transcriptional regulatory system controlling expression of type 4 fimbriae in Pseudomonas aeruginosa. Mol. Microbiol. 1993 Mar 1;7(5):669–82.
  118. Sanchez SE, Omsland A. Conditional impairment of Coxiella burnetii by glucose-6P dehydrogenase activity. Pathog. Dis. 2021 Aug 1;79(6):ftab034.