Structure-Guided Comparative Analysis of Proteins: Principles, Tools, and Applications for Predicting Function

Figure 5

PIRSF (A,B), COG (C,D), and Pfam (E,F) input and results.

(A) The FASTA sequence of the query protein (UniProt accession O67940, from Aquifex aeolicus) is scanned against PIR's curated family database. The query is searched against the full-length and domain hidden Markov models (HMMs) for manually curated PIRSFs; if a match is found, the matched regions and statistics are displayed. (B) The query hits the PIRSF family PIRSF006779. The output provides family details, statistical data for full-length proteins and composite domains, and a pairwise alignment of the query with the consensus sequence of the PIRSF. (C) The same query sequence is scanned against the database of clusters of orthologous groups (COGs). COG compares protein sequences encoded in complete genomes that represent major phylogenetic lineages. Each COG consists of orthologous or co-orthologous proteins from at least three lineages. (D) The query hits COG1912. The output provides the family details: statistical score, reciprocal best hits, and members of the family. (E) The same query sequence is scanned against the Pfam domain database. Pfam is a large collection of domain families, each represented by multiple sequence alignments and HMMs. (F) The query hits Pfam family PF01887.
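The "reciprocal best hits" reported in panel D can be illustrated with a minimal Python sketch. It is not the COG pipeline itself: the protein names, genome labels, and similarity scores below are hypothetical, and the functions `best_hits` and `reciprocal_best_hits` are introduced here only for illustration. The sketch assumes that pairwise similarity scores (for example, from an all-against-all sequence search) are already available.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Hypothetical input: (query_protein, subject_protein) -> similarity score.
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores, genome_of: Dict[str, str]) -> Dict[Tuple[str, str], str]:
    """For each (query protein, other genome) pair, keep the top-scoring subject."""
    best: Dict[Tuple[str, str], Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if genome_of[query] == genome_of[subject]:
            continue  # only hits between different genomes are considered
        key = (query, genome_of[subject])
        if key not in best or score > best[key][1]:
            best[key] = (subject, score)
    return {key: subject for key, (subject, _) in best.items()}


def reciprocal_best_hits(scores: Scores, genome_of: Dict[str, str]) -> Set[FrozenSet[str]]:
    """Return unordered protein pairs that are each other's best hit across genomes."""
    best = best_hits(scores, genome_of)
    pairs: Set[FrozenSet[str]] = set()
    for (query, _genome), subject in best.items():
        # The hit is reciprocal if the subject's best hit in the query's genome is the query.
        if best.get((subject, genome_of[query])) == query:
            pairs.add(frozenset((query, subject)))
    return pairs


# Hypothetical toy data: three genomes (A, B, C), four proteins.
genome_of = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
scores: Scores = {
    ("a1", "b1"): 200.0, ("b1", "a1"): 195.0,
    ("a2", "b1"): 80.0,  ("b1", "a2"): 75.0,
    ("a1", "c1"): 150.0, ("c1", "a1"): 140.0,
    ("b1", "c1"): 120.0, ("c1", "b1"): 118.0,
}

rbh = reciprocal_best_hits(scores, genome_of)
for pair in sorted(sorted(p) for p in rbh):
    print(pair)
```

In this toy data, a1, b1, and c1 are mutual best hits across three genomes, which mirrors the caption's requirement that a COG span at least three lineages, while a2 is excluded because b1's best hit in genome A is a1.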

doi: https://doi.org/10.1371/journal.pcbi.1000151.g005