PepNovo

Contacts

Ari Frank [afrank at ucsd.edu]

Summary

De novo sequencing of low precision MS/MS data

The standard approach to high-throughput sequencing of MS/MS data is the database search (tools like Sequest, Mascot, and InsPecT). However, there are cases where a traditional database search cannot be used. For instance, a database search can only propose relevant candidate peptides if the target organism’s genome has been sequenced. Though the number of sequenced genomes is constantly growing, many organisms are still not sequenced and do not even have a sequenced close homologue. In addition, even if a genome is sequenced, all alternative splice variants of its genes must also be known in order to identify all peptides in a sample.

The de novo sequencing method is useful in the situations described above because it does not require any knowledge of the sequenced genome; it performs all its sequencing using only the information present in the mass spectrum itself. In addition, de novo sequencing can serve as an independent verification stage for database search results.
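To illustrate what "using only the information present in the mass spectrum" can mean in practice, here is a minimal sketch of the basic idea behind de novo sequencing in general: the mass gap between consecutive peaks of one fragment ion series is matched against amino acid residue masses. This is not PepNovo's algorithm. The function names, the 0.5 Da tolerance, and the example peak list are assumptions made for illustration; the residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASSES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_for_gap(mass_gap, tolerance=0.5):
    """Return the residues whose mass lies within `tolerance` Da of the gap.

    The default tolerance is an assumed value for low precision data.
    """
    return sorted(
        aa for aa, mass in RESIDUE_MASSES.items()
        if abs(mass - mass_gap) <= tolerance
    )


def read_ladder(peaks, tolerance=0.5):
    """Interpret each gap between consecutive peaks (sorted m/z values of a
    single singly charged ion series) as a set of candidate residues."""
    return [
        residues_for_gap(later - earlier, tolerance)
        for earlier, later in zip(peaks, peaks[1:])
    ]


# Hypothetical b-ion ladder; the gaps are about 71.037 and 87.032 Da.
example_peaks = [58.029, 129.066, 216.098]
candidates = read_ladder(example_peaks)
```

At low precision some gaps stay ambiguous: a gap near 128.09 Da matches both K and Q at this tolerance, and I and L always share the same mass. Real spectra also contain noise peaks and missing fragments, which is why a scoring model is needed on top of this lookup.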

We developed PepNovo to serve as a high-throughput de novo peptide sequencing tool for tandem mass spectrometry data. PepNovo typically runs in less than 0.2 seconds per spectrum. PepNovo uses a probabilistic network to model the peptide fragmentation events in a mass spectrometer. In addition, it uses a likelihood ratio hypothesis test to determine whether the peaks observed in the mass spectrum are more likely to have been produced under the fragmentation model than under a probabilistic model that treats the appearance of peaks as random events. In benchmark experiments, PepNovo was found to outperform several leading de novo algorithms, such as SHERENGA, Peaks, and Lutefisk.
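The likelihood ratio test described above can be sketched in a few lines. The version below is a deliberately simplified illustration, not PepNovo's scoring function: it assumes each expected fragment peak is observed independently, whereas PepNovo's probabilistic network models dependencies between fragmentation events. The function name and all probabilities are hypothetical.

```python
import math


def log_likelihood_ratio(observed, p_fragment, p_random):
    """Log of P(observations | fragmentation model) / P(observations | random model).

    observed   -- booleans: was a peak seen at each expected fragment mass?
    p_fragment -- assumed probability of seeing each peak under the
                  fragmentation model (one value per expected fragment)
    p_random   -- assumed probability that a peak appears at any given
                  position by chance
    Simplifying assumption: peaks are treated as independent events.
    """
    score = 0.0
    for seen, p_f in zip(observed, p_fragment):
        if seen:
            score += math.log(p_f / p_random)
        else:
            score += math.log((1.0 - p_f) / (1.0 - p_random))
    return score


# Hypothetical numbers: three expected fragments, two of them observed.
score = log_likelihood_ratio([True, True, False], [0.8, 0.7, 0.6], p_random=0.1)
```

A positive score means the observed peaks are better explained by the fragmentation model than by chance; a negative score favors the random model.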

Downloads

New PepNovo+: de novo sequencing, quality filtering, and MS-Blast query generation

Publications

Predicting intensity ranks of peptide fragment ions. Frank AM. J. Proteome Res., 8:2226–40, 2009.
A ranking-based scoring function for peptide-spectrum matches. Frank AM. J. Proteome Res., 8:2241–52, 2009.