Real time structural search of the Protein Data Bank

BioZernike descriptors workflow.

Every atomic structure in the PDB (a) is converted to a volume by selecting representative atoms for each residue (b) and placing a Gaussian density at each of their positions (c). The geometric features (GEO) can be calculated directly from the coordinates of the representative atoms, whereas the Zernike moments and their Canterakis norms (CNs) of various orders are calculated from the volume (d; note that the different normalizations are offset along the y axis for clarity). The vector of concatenated geometric features and CNs of selected orders constitutes the composite BioZernike shape descriptor. The distance between descriptors (comprising both GEO and CNs) is calculated with optimal weights learned on a training set (e). The alignment descriptor is obtained directly from the CNs.
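Steps (c) and (e) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the grid size, the Gaussian width `sigma`, the padding, and the weighted L1 form of the distance are all assumptions made for illustration, and the weights are passed in by hand rather than learned. The Zernike moment and CN calculation of step (d) is not shown.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Volume = List[List[List[float]]]


def gaussian_volume(
    points: Sequence[Point],
    grid_size: int = 16,    # assumed resolution, not taken from the paper
    sigma: float = 1.5,     # assumed Gaussian width
    padding: float = 3.0,   # assumed margin around the structure
) -> Tuple[Volume, List[float]]:
    """Sum one isotropic Gaussian per representative atom onto a cubic grid."""
    lo = min(min(p) for p in points) - padding
    hi = max(max(p) for p in points) + padding
    step = (hi - lo) / (grid_size - 1)
    axis = [lo + i * step for i in range(grid_size)]

    volume = [
        [[0.0] * grid_size for _ in range(grid_size)] for _ in range(grid_size)
    ]
    two_sigma_sq = 2.0 * sigma * sigma
    for px, py, pz in points:
        # A 3D isotropic Gaussian factorizes into three 1D profiles.
        gx = [math.exp(-((x - px) ** 2) / two_sigma_sq) for x in axis]
        gy = [math.exp(-((y - py) ** 2) / two_sigma_sq) for y in axis]
        gz = [math.exp(-((z - pz) ** 2) / two_sigma_sq) for z in axis]
        for i in range(grid_size):
            for j in range(grid_size):
                gxy = gx[i] * gy[j]
                for k in range(grid_size):
                    volume[i][j][k] += gxy * gz[k]
    return volume, axis


def weighted_distance(
    a: Sequence[float], b: Sequence[float], weights: Sequence[float]
) -> float:
    """Weighted L1 distance between two descriptor vectors (assumed form)."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("descriptors and weights must have equal length")
    return sum(w * abs(x - y) for x, y, w in zip(a, b, weights))


if __name__ == "__main__":
    # Toy "representative atoms"; real input would be one atom per residue.
    atoms = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    vol, axis = gaussian_volume(atoms, grid_size=12)
    total = sum(v for plane in vol for row in plane for v in row)
    print(f"grid points per axis: {len(axis)}, summed density: {total:.3f}")

    # Toy descriptors standing in for concatenated GEO features and CNs.
    print(weighted_distance([1.0, 2.0], [2.0, 4.0], [0.5, 1.0]))
```

In this sketch a descriptor is just a flat list of numbers, so the same distance function applies whichever GEO features and CN orders are concatenated, provided both structures use the same layout.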

