BrEPS 2.0: Optimization of sequence pattern prediction for enzyme annotation

doi:10.1371/journal.pone.0182216

Fig 1.

The BrEPS 2.0 workflow.

The protocol consists of six steps to generate the BrEPS database. A: Selection and preparation of sequences. B: All-vs-all BLAST of sequences. C: Complete linkage clustering based on the E-value from BLAST. D: Multiple sequence alignment and pattern creation on selected nodes. E: Pattern verification. F: Preparation of the final database.
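As an illustration of step C, the following is a minimal sketch of complete linkage clustering on pairwise BLAST E-values. It is not the BrEPS implementation: the function name, the input layout (a dictionary keyed by sequence pairs), the example identifiers and the fixed E-value cutoff are all assumptions made for this example. The caption states that patterns are created on selected nodes, so the actual pipeline presumably keeps the full cluster tree rather than cutting it at a single threshold.

```python
import math


def complete_linkage(ids, evalues, threshold):
    """Agglomerative complete-linkage clustering on BLAST E-values.

    ids       -- list of sequence identifiers (hypothetical names)
    evalues   -- dict mapping frozenset({a, b}) to the E-value of that pair;
                 a missing pair is treated as "no hit" (infinite distance)
    threshold -- assumed cutoff: two clusters are merged only if the WORST
                 (largest) E-value between their members is <= threshold
    """
    clusters = [frozenset([i]) for i in ids]

    def distance(c1, c2):
        # Complete linkage: the distance between two clusters is the
        # maximum pairwise distance between their members.
        return max(
            evalues.get(frozenset((a, b)), math.inf) for a in c1 for b in c2
        )

    while len(clusters) > 1:
        # Find the closest pair of clusters under complete linkage.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # no remaining pair is similar enough to merge
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Made-up E-values for four hypothetical sequences.
example_evalues = {
    frozenset(("seq1", "seq2")): 1e-50,
    frozenset(("seq1", "seq3")): 1e-40,
    frozenset(("seq2", "seq3")): 1e-30,
    frozenset(("seq3", "seq4")): 1e-5,
}
result = complete_linkage(
    ["seq1", "seq2", "seq3", "seq4"], example_evalues, threshold=1e-10
)
```

In this example seq4 stays in its own cluster: although it has a weak hit to seq3, it has no hit to seq1 or seq2, and complete linkage requires every member pair to satisfy the cutoff.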

Table 1.

Filters applied to UniProt protein entries to parse enzyme data from UniProt flatfiles.

Fig 2.

Detailed overview of the new data selection.

Only Swiss-Prot sequences with evidence at protein level (A) are used as seed sequences to retrieve additional, non-redundant sequences from TrEMBL and Swiss-Prot using UniRef references with >= 50% sequence identity (B and C). These additional sequences receive the annotation of the corresponding Swiss-Prot seed and are merged with the seed sequences into one database (D).
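The selection logic in this caption can be sketched roughly as follows. This is an illustrative reading only, not the authors' code: the function name, the dictionary layout, the accessions and the cluster identifiers are invented for the example, and the rule for clusters that contain more than one seed (take the first seed's annotation) is an assumption that the caption does not specify.

```python
def build_database(entries, uniref50_clusters):
    """Sketch of steps A-D: pick seeds, expand via UniRef50, transfer annotation.

    entries           -- dict: accession -> {"reviewed": bool,
                                             "protein_level": bool,
                                             "ec": str or None}
                         (hypothetical layout, not a UniProt format)
    uniref50_clusters -- dict: cluster id -> list of member accessions
    Returns a dict: accession -> EC annotation.
    """
    # A: seeds are Swiss-Prot (reviewed) entries with protein-level evidence.
    seeds = {
        acc: entry["ec"]
        for acc, entry in entries.items()
        if entry["reviewed"] and entry["protein_level"]
    }

    database = dict(seeds)
    for members in uniref50_clusters.values():
        # B/C: only clusters that contain at least one seed are used.
        seed_members = [m for m in members if m in seeds]
        if not seed_members:
            continue
        # Assumption: with several seeds in a cluster, use the first one.
        annotation = seeds[seed_members[0]]
        for member in members:
            # D: merge; setdefault keeps seed annotations and avoids duplicates.
            database.setdefault(member, annotation)
    return database


# Made-up accessions and clusters for illustration.
example_entries = {
    "SEED1": {"reviewed": True, "protein_level": True, "ec": "1.1.1.1"},
    "TREMBL1": {"reviewed": False, "protein_level": False, "ec": None},
    "SPROT2": {"reviewed": True, "protein_level": False, "ec": None},
    "TREMBL2": {"reviewed": False, "protein_level": False, "ec": None},
}
example_clusters = {
    "cluster_a": ["SEED1", "TREMBL1", "SPROT2"],
    "cluster_b": ["TREMBL2"],
}
database = build_database(example_entries, example_clusters)
```

Here TREMBL2 is left out because its cluster contains no seed, while TREMBL1 and SPROT2 take over the annotation of SEED1.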
