MetaNovo: An open-source pipeline for probabilistic peptide discovery in complex metaproteomic datasets

Fig 2

A graphical representation of the MetaNovo algorithm applied for sequence database filtration.

Normalized spectral abundance factor (NSAF) calculations include non-unique spectra. The magnitude of each probability is represented by +'s. Proteins are ranked by the joint probability of organism and protein probabilities, represented by the arrow, in order of increasing probability. The number of unique spectra for each protein is determined by its position in the ranked list, and includes only spectra that do not appear in the set of proteins in the list above (but may include spectra that appear below). For example, the spectra for Peptide B are counted towards the first protein in the list, but not the second. Tie breaks for adjacent and nearly identical isoforms that share the same set of spectra will be based on the shortest (most probable) sequence having a higher NSAF (and thus a higher protein probability) or a higher organism probability. Proteins in green will be selected for inclusion in the filtered sequence database, and proteins in red will be excluded (having no unique spectra). The colors shared by proteins, peptides and spectra above illustrate the assignment of unique spectra and peptides to the most probable protein in the ranked list.
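
The unique-spectra assignment described in this caption can be illustrated with a minimal Python sketch. It is not taken from the MetaNovo source code: the function name `filter_ranked_proteins`, the protein and spectrum identifiers, and the toy data are all hypothetical. The sketch assumes the proteins have already been ranked by joint probability with the most probable protein first, so the probability and NSAF calculations and the tie-breaking rule are left out.

```python
from typing import Dict, List, Set, Tuple


def filter_ranked_proteins(
    ranked: List[str],
    spectra_by_protein: Dict[str, Set[str]],
) -> Tuple[List[str], Dict[str, int]]:
    """Walk a ranked protein list and count each protein's unique spectra.

    Assumption: `ranked` is already ordered with the most probable protein
    first. A spectrum counts as unique for a protein only if no protein
    higher in the list has already claimed it.
    """
    claimed: Set[str] = set()          # spectra assigned to proteins above
    kept: List[str] = []               # proteins with at least one unique spectrum
    unique_counts: Dict[str, int] = {}

    for protein in ranked:
        unique = spectra_by_protein.get(protein, set()) - claimed
        unique_counts[protein] = len(unique)
        if unique:
            kept.append(protein)       # "green": included in the filtered database
        claimed |= unique              # these spectra are now taken

    return kept, unique_counts


# Hypothetical toy data: protein_1 carries peptides A (s1) and B (s2, s3);
# protein_2 carries only peptide B, so all of its spectra are already claimed.
ranked = ["protein_1", "protein_2", "protein_3"]
spectra_by_protein = {
    "protein_1": {"s1", "s2", "s3"},
    "protein_2": {"s2", "s3"},
    "protein_3": {"s3", "s4"},
}

kept, unique_counts = filter_ranked_proteins(ranked, spectra_by_protein)
print(kept)           # ['protein_1', 'protein_3']
print(unique_counts)  # {'protein_1': 3, 'protein_2': 0, 'protein_3': 1}
```

In this toy run, `protein_2` has no unique spectra and would be excluded, which mirrors the Peptide B example in the caption. `protein_3` is kept because spectrum `s4` is not claimed by any protein above it.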

doi: https://doi.org/10.1371/journal.pcbi.1011163.g002