Figure 1.
Steps involved in the ‘context-specific’ bioinformatics study.
The chart is organized into seven consecutive major steps, labelled 1 to 7, and contains four columns. The first and last columns show the number of protein sequences before and after the execution of each step (No seq INPUT and No seq OUTPUT, respectively). The second column describes each step, and the third column gives the references for that step. For details see ‘Materials and Methods.’ To carry out these steps, we wrote a few Python routines for steps 1 through 3 and employed several open-access programs (steps in light grey).
Figure 2.
Estimated phylogenetic tree of the WXG100 protein family, consisting of three WXG100 subfamilies.
The tree of WXG100 proteins is shown in a midpoint-rooted presentation with three main clades: CFP-10-like proteins (blue circular arc), ESAT-6-like proteins (cyan circular arc) and sagEsxA-like proteins (orange circular arc). The WXG100 gene pairs of M. tuberculosis occurring within the RD1-like gene clusters, denoted as the regions (Esx) 1 to 5, are coloured accordingly along with the Rv-annotations (see subtitles). The annotations of the genes in close proximity to each of the WXG100 genes were manually analyzed, and this information was also included in the tree. Two WXG100 genes with an intergenic distance of less than 80 nucleotides (according to the definition of Roback et al. [47]) are considered to be encoded within a bi-cistronic operon (filled black squares on circle layer 3), whilst mono-cistronic WXG100 genes are indicated by unfilled squares. Those WXG100 proteins whose oligomeric properties have been experimentally determined are marked with a triangle for homodimers and with pairs of blue dots for heterodimers. The second inner arcs show the phyla of the bacteria.
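The operon criterion used in this legend (two WXG100 genes separated by fewer than 80 nucleotides) can be illustrated with a minimal Python sketch. This is not the routine used in the study: the `Gene` record, the function names, the 1-based inclusive coordinates and the requirement that both genes lie on the same strand are assumptions made here for illustration, and the coordinates in the usage note are invented.

```python
from dataclasses import dataclass

# Threshold taken from the legend: intergenic distance of less than 80 nucleotides.
MAX_INTERGENIC_NT = 80


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are assumed 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def intergenic_distance(a: Gene, b: Gene) -> int:
    """Number of nucleotides between two genes (0 if they abut or overlap)."""
    first, second = sorted((a, b), key=lambda g: g.start)
    return max(0, second.start - first.end - 1)


def is_bicistronic(a: Gene, b: Gene) -> bool:
    """Apply the < 80 nt rule; the same-strand check is an added assumption."""
    return a.strand == b.strand and intergenic_distance(a, b) < MAX_INTERGENIC_NT
```

With invented coordinates, `is_bicistronic(Gene("geneA", 1000, 1300, "+"), Gene("geneB", 1334, 1621, "+"))` evaluates to `True` (33 nucleotides apart), whereas moving the second gene to start at position 1500 gives `False`.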
Figure 3.
Alignments of the WXG100 subfamilies reveal conserved subfamily-specific residues and a generally conserved C-terminal residue pattern.
(A) The positions of the helices, according to the structures of ESAT-6, CFP-10, and sagEsxA, are shown below the alignments of each subfamily. The four-helix bundle requires mostly hydrophobic residues at positions ‘a’ and ‘d’ of the heptad helix repeat (a-b-c-d-e-f-g), shown as grey shading on the aligned residues. Key features of the ‘ESAT-6-like’ subfamily (top panel): besides the almost invariant WXG motif (marked with red triangles), three highly conserved residues are boxed: K/P38 (yellow), Y51 (green) and Q55 (red). Residue numbering follows that of ESAT-6 (Rv3875). In the ‘CFP-10-like’ subfamily (middle panel), there are almost no conserved features except for the C-terminal sequence conservation (marked with asterisks, filled black for hydrophobic residues and unfilled for acidic residues), which is shared by all WXG100 superfamily members. In the ‘sagEsx-like’ subfamily (bottom panel), all residues involved in the inter-dimer interactions are hydrophobic except two residues, boxed in cyan and black. The gene IDs of the WXG targets are shown on each line. The numbers correspond to the locus of each gene depicted here. The bacterial species from the phylum “Actinobacteria” are abbreviated as: Mmcs0071: Mycobacterium sp. MCS, Mvan: M. vanbaalenii, ML: M. leprae, jk: Corynebacterium (C.) jeikeium, cur: C. urealyticum, Ncgl: C. glutamicum, DIP: C. diphtheriae, Mflv: M. gilvum, Sare: Salinispora (Sa.) arenicola, Strop: Sa. tropica, SACE: Saccharopolyspora erythraea, and those from the phylum “Firmicutes” as: CAC: Clostridium acetobutylicum, BPUM: Bacillus pumilus, GTNG: Geobacillus thermodenitrificans, ABC: alkaliphilic Bacillus clausii, BC: Bacillus cereus, Sez: Streptococcus equi, Lmo: Listeria monocytogenes serovar, SAV: Staphylococcus aureus, SAG: Streptococcus agalactiae, BH: Bacillus halodurans, Cthe: Clostridium thermocellum.
(B) The C-terminal consensus sequence HxxxD/ExxhxxxH is shown as a sequence logo diagram. The residue at the eighth position is marked with ‘h’, indicating lower conservation of the hydrophobic residue at this position (see panel A). (C) Structural superposition of CFP-10 (blue), ESAT-6 (cyan), sagEsxA (orange) and sauEsxA (violet): only the C-terminal helices, along with the adjacent WXG loops facing towards the helices, are shown. For better visibility, only the side chains of sagEsxA are shown. (D) The side chains of the conserved C-terminal residues (marked with asterisks in panel A) decorate the same side of the C-terminal helix, as observed in the structures of the WXG100 proteins; the helix of sagEsxA is shown (see text). To emphasize this structural feature, the C-terminal helix is shown in a surface representation, where the consensus hydrophobic residues are in grey and the acidic residue is in red. The remaining residues (x) are shown in light grey.
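The consensus HxxxD/ExxhxxxH in panel B spans twelve positions and can be written as a regular expression. The sketch below is only an illustration, not part of the study's workflow: the set of amino acids counted as hydrophobic (`HYDROPHOBIC`), the function name and the `strict_h` switch, which relaxes the less conserved eighth position, are assumptions.

```python
import re

# Assumption: residues treated as hydrophobic for positions 'H' and 'h'.
HYDROPHOBIC = "AILMFVWY"


def find_cterm_consensus(sequence: str, strict_h: bool = True) -> list[int]:
    """Return 0-based start indices of HxxxD/ExxhxxxH matches in a protein sequence.

    H = hydrophobic, D/E = acidic, x = any residue, h = hydrophobic with lower
    conservation. With strict_h=False, position 8 ('h') accepts any residue.
    """
    h_position = f"[{HYDROPHOBIC}]" if strict_h else "."
    motif = (
        f"[{HYDROPHOBIC}]"  # position 1: H
        ".{3}"              # positions 2-4: x
        "[DE]"              # position 5: D/E
        ".{2}"              # positions 6-7: x
        f"{h_position}"     # position 8: h
        ".{3}"              # positions 9-11: x
        f"[{HYDROPHOBIC}]"  # position 12: H
    )
    # A lookahead makes overlapping occurrences visible to finditer.
    pattern = re.compile(f"(?=({motif}))")
    return [m.start() for m in pattern.finditer(sequence.upper())]
```

Calling `find_cterm_consensus` on a toy sequence such as `"GGLAAAEAALAAALGG"` reports the single match that begins at the leucine; real WXG100 sequences would need the hydrophobic set tuned against the alignment in panel A.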
Figure 4.
Structures of sagEsxA and CFP-10/ESAT-6 complexes, and comparisons of the loop conformation, as observed in the three WXG100 proteins.
(A) The four-helix bundle structures of the homodimeric sagEsxA and heterodimeric CFP-10/ESAT-6 complexes are shown. (B) The WXG motif-containing loops of ESAT-6 show an extended hydrogen-bonding network, indicated by dashed lines and labelled with the hydrogen bond donor-acceptor distances. (C) Comparison of the loops of CFP-10 and ESAT-6. The asymmetric unit (AU) of the CFP-10/ESAT-6 crystal contains two copies of the heterodimer. The view is down the central long axis of the dimer. The top and bottom panel views are related by a 180° rotation around the central short axis of the dimer, showing the WXG-containing loop of ESAT-6 (top) and that of CFP-10 (bottom). Superimposition of the structures in the AU shows that the WXG-containing loops of ESAT-6 exhibit lower B-values and overlap better than those of CFP-10. (D) A hydrogen bond interaction formed by Y18 and Q38 at the inter-dimer interface of sagEsxA is shown.
Table 1.
Crystallographic statistics.
Figure 5.
WXG100 proteins form dimeric complexes, as studied by FRET.
(A) Schematic diagram of the FRET experiments. The fluorescence donor, Alexa 488 (green), and the fluorescence acceptor, Alexa 647 (red), are represented as stars. The Alexa dye-conjugated proteins are denoted by their names together with the type of Alexa dye, e.g. D-ESAT-6 instead of Alexa 488-ESAT-6. (B) Fluorescence spectra of the labelled proteins in the combinations indicated in schematic diagram A. The control contains only donor-labelled protein (black). The donor/acceptor-labelled ESAT-6 shows no FRET signal, even after heat denaturation and renaturation, indicating no homodimer formation (dark green). The donor-labelled ESAT-6 and acceptor-labelled CFP-10 give a FRET signal, showing that CFP-10 and ESAT-6 spontaneously form a heterodimer (green). sagEsxA exhibits no FRET after initial mixing, but FRET pairs are reconstituted upon heat denaturation and renaturation (blue). For the FRET measurements, the respective samples were mixed in equimolar amounts prior to the measurements.