
Figure 1.

An Overview of the Pipeline for Generating Structurally Similar Groups, Multi-Domain Architecture Groups and their Respective Phylogenetic Trees.


Figure 2.

Summary of Phylogenetic, Functional, Metabolite and Domain Architectures for the Phosphatidylinositol-phosphodiesterase Superfamily.

A diagrammatic representation of the FunTree phylogenetic tree with associated functional data and multi-domain architectures from ArchSchema. Each domain is given a unique colour, with the domain of focus coloured green. Three major clades (C1–C3) are highlighted. Within the first group a number of functional sub-groups can be observed, with differences in function defined by changes in substrate or product formed. The presence of additional domains does not change function.


Figure 3.

Multi-Domain Architectures Defined by Phosphatidylinositol-Phosphodiesterase Domain.

An ArchSchema graph showing the multiple domain architectures found in the phosphatidylinositol-phosphodiesterase superfamily. Each node represents a unique multiple domain architecture, with a red line under the node indicating that a structure exists. Also shown are the E.C. numbers (and the number of sequences which have them in brackets) found for each MDA. Each E.C. class is coloured separately, with the intensity of the colour proportional to the number of sequences that have that E.C. assigned. Only domain architectures for sequences that are annotated in the reviewed section of UniProtKB are shown.


Figure 4.

Summary of Phylogenetic, Functional, Metabolite and Domain Architectures for Ntn-type Amide Hydrolase Superfamily.

The superfamily is divided into three structurally similar groups, with a diagrammatic version of the FunTree phylogenetic tree shown for each, as well as functional, substrate and multi-domain architecture data. Each domain is given a unique colour, with the domain of focus coloured green.


Figure 5.

Multi-Domain Architectures as Defined by Ntn Type Amide Hydrolasing Domain.

The ArchSchema graph showing MDAs as defined by the Ntn type amide hydrolasing domain. Note that the MDA representing the single domain includes two sequences annotated with the amidophosphoribosyltransferase function. These are truncated sequences (the truncations resulting from a frame-shift); the full sequence comprises two domains, as highlighted by a caution remark in the UniProtKB record. The two truncated sequences inherit the function from the full sequence. Likewise, a glutamate synthase function is ascribed to a sequence in this MDA, but it comes from a sequence fragment that is likely to be part of a longer sequence with more domains. Though rare, such cases highlight the care needed when analysing sequence annotations.


Figure 6.

Structural and functional diversity of the 276 superfamilies.

A & B. Distribution of the number of structurally similar groups and unique multi-domain architectures in these superfamilies. C & D. Distribution of sequence conservation, as measured by ScoreCons, in the alignments for structurally similar groups (SSG; panel C) and multi-domain architectures (MDA; panel D). Although some MDAs are quite diverse, others appear quite conserved, which may be because some MDAs have relatively few sequences associated with them. E. The distribution of the number of fully described (to the fourth level) E.C. numbers across all 276 superfamilies. F. The largest percentage of sequences sharing the same E.C. number, plotted against the size of the superfamily, measured as the number of sequences with a fully classified E.C. number in the superfamily. A dashed line shows that 50% of superfamilies have more than 65% of their sequences with the same E.C. number.
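The quantity plotted in panel F can be illustrated with a minimal sketch. It assumes each sequence in a superfamily carries a single E.C. annotation given as a string, and that "fully classified" means four numeric levels; the function names `is_fully_classified` and `dominant_ec_fraction` are hypothetical and are not part of FunTree.

```python
from collections import Counter


def is_fully_classified(ec):
    """Assumption: a fully classified E.C. number has four numeric levels."""
    parts = ec.split(".")
    return len(parts) == 4 and all(p.isdigit() for p in parts)


def dominant_ec_fraction(ec_numbers):
    """Return (most common E.C. number, fraction of sequences carrying it).

    Only fully classified E.C. numbers are counted, so the denominator is
    the number of sequences with a complete four-level annotation.
    """
    full = [ec for ec in ec_numbers if is_fully_classified(ec)]
    if not full:
        return None, 0.0
    ec, count = Counter(full).most_common(1)[0]
    return ec, count / len(full)


# Illustrative annotations only (not data from the paper).
example = ["3.1.4.11", "3.1.4.11", "3.1.4.46", "3.1.4.-"]
top_ec, fraction = dominant_ec_fraction(example)
```

Here the partial annotation `3.1.4.-` is ignored, so the fraction is taken over three sequences rather than four.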


Figure 7.

Changes in Function within 276 Superfamilies.

A. A heatmap showing the cumulative changes across all superfamilies in which a change is observed, based on differences in E.C. annotations at the class level within a superfamily and using the phylogenetic tree to infer the order in which changes have occurred. These counts do not take into account changes that occur between SSGs and so need to be viewed in conjunction with the counts in B (right of the diagonal). The colour intensity indicates the number of times a change in E.C. class occurs. The matrix shows the percentage of changes (with total counts in brackets) in E.C. class observed across all 276 superfamilies. The matrix diagonal gives the number of changes occurring within an E.C. class, down to the 4th level of the E.C. number. B. A heatmap similar to that described for A, but using all the possible combinations of E.C. found in a superfamily. The top right of the matrix shows the observed percentage of changes, with actual count totals in brackets, while the lower left half shows the percentage of changes expected based on a random simulation of E.C. changes. To the right of the matrix, the observed (OBS) and expected (EXP) percentages of changes for each E.C. class are shown. C. The same exchanges as described for B, but concentrating on the interchanges between classes. D. A box plot showing the proportion of E.C. changes in a superfamily by E.C. level (i.e. derived from the data in the top right of the matrix in A). For example, if a superfamily has 2 changes at E.C. level 1 and 3 changes at level 4, then the first E.C. level has contributed 2/5 and the fourth level has contributed 3/5. These fractions are collected across all superfamilies in the plot. The inset shows the total number of observed exchanges at each class level. All interchanges shown in A to D exclude those contributed by ‘confusion domains’, detailed in Figure S12.
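The per-level proportions in panel D can be reproduced with a short sketch. It assumes that a change is assigned to the first E.C. level at which two annotations differ, and that per-superfamily counts are held in a plain dictionary; `first_differing_level` and `level_proportions` are hypothetical helper names, not functions from the FunTree pipeline.

```python
def first_differing_level(ec_a, ec_b):
    """Return the first E.C. level (1 to 4) at which two numbers differ.

    Returns None when the two E.C. numbers are identical.
    """
    for level, (x, y) in enumerate(zip(ec_a.split("."), ec_b.split(".")), start=1):
        if x != y:
            return level
    return None


def level_proportions(changes_per_level):
    """Convert counts of changes per E.C. level into fractions of the total."""
    total = sum(changes_per_level.values())
    if total == 0:
        return {}
    return {level: n / total for level, n in changes_per_level.items()}


# The worked example from the caption: 2 changes at level 1, 3 at level 4.
proportions = level_proportions({1: 2, 4: 3})
```

With the caption's example the first level contributes 2/5 and the fourth level 3/5; repeating this for every superfamily gives the values summarised in the box plot.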
