Knowledge-guided data mining on the standardized architecture of NRPS: Subtypes, novel motifs, and sequence entanglements
Fig 5
Statistical coupling analysis reveals overlapping sectors across the C+A+T module.
A. The upper panel shows residue conservation in a multiple sequence alignment of 1,161 NRPS modules (each containing the C, A, and T domains), quantified by the relative entropy used in the SCA method. The blue dashed line marks the mean conservation level (0.32). The lower panel shows three groups of positions, termed “sectors”: II(+) in green, II(-) in magenta, and IV(-) in red. Their conservation values are marked in the same colors in the upper panel. Blue bars mark the C domain motifs C1 to C7, orange bars mark the A domain motifs A1 to A10, and the yellow bar marks the T domain motif T1. Vertical black dashed lines mark the domain boundaries annotated by Pfam. The black triangle marks the re-engineering point in the C domain reported by Bozhüyük et al. [27], the black circle marks the re-engineering point in the C-A inter-domain region reported by Calcott et al. [28], and the black diamond marks the re-engineering point in the C-A inter-domain region reported by Bozhüyük et al. [26]. B. Mapping of the three groups of positions with correlated conservation (the sectors) onto the three-dimensional structure of an NRPS module (PDB 4ZXI, containing the C, A, T, and TE domains; the TE domain is hidden for clarity). The three sectors are colored as in (A). The C domain, A core domain, A sub-domain, and T domain are circled by blue, orange, yellow-green, and yellow dotted lines, respectively. Gly and AMP, the substrates of this A domain, and Mg²⁺ (required for catalysis) are colored cyan. C. Heatmap of the SCA matrix after reduction of statistical noise and of global coherent correlations (see Methods for details). Each sector is marked by a bracket of the corresponding color under the heatmap, with the number of residues it contains. The II(-), II(+), and IV(-) sectors contain 68, 54, and 50 positions, respectively. Within each sector, residues are ordered by descending contribution, showing that sector positions form a hierarchy of correlation strengths.
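As a rough illustration of the conservation measure in panel A, the sketch below computes a per-position relative entropy (Kullback-Leibler divergence) for a toy alignment. It is not the paper's SCA pipeline: the uniform background distribution, the absence of sequence weighting, the skipping of gap characters, and the names `relative_entropy_per_position` and `toy_alignment` are all assumptions made for this example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def relative_entropy_per_position(alignment, background=None):
    """Return the KL divergence D(f_i || q), in nats, for each alignment column.

    f_i is the observed amino-acid frequency distribution in column i and q is
    the background distribution. Assumption: q is uniform over the 20 amino
    acids unless one is supplied. Gaps and unknown symbols are ignored.
    """
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

    n_cols = len(alignment[0])
    scores = []
    for col in range(n_cols):
        # Count only symbols that have a background frequency (skips '-' etc.).
        counts = Counter(seq[col] for seq in alignment if seq[col] in background)
        total = sum(counts.values())
        if total == 0:
            # Column made entirely of gaps: report zero conservation.
            scores.append(0.0)
            continue
        divergence = 0.0
        for aa, count in counts.items():
            freq = count / total
            divergence += freq * math.log(freq / background[aa])
        scores.append(divergence)
    return scores


# Hypothetical four-sequence alignment, used only to exercise the function.
toy_alignment = ["GACD", "GAKD", "GSRD", "GTCE"]
conservation = relative_entropy_per_position(toy_alignment)
mean_conservation = sum(conservation) / len(conservation)
```

In this toy case the invariant first column gets the highest score and the more variable middle columns get lower ones; the mean of the scores plays the role of the dashed line in the upper panel.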