Bacterial Proteogenomic Annotation

Main Page, Download, Copyright Notice, Documentation
While bacterial genome annotations have significantly improved in recent years, techniques for bacterial proteome annotation (including post-translational chemical modifications, signal peptides, proteolytic events, etc.) are still in their infancy. The number of sequenced bacterial genomes is rising sharply, far outpacing our ability to validate the predicted genes or annotate bacterial proteomes. In this project, we use tandem mass spectrometry (MS/MS) to annotate the proteome of bacterial genomes and provide a comprehensive map of post-translational modifications. We also detect multiple genes that were missed and suggest corrections to improve the gene annotation.

Comparative Shotgun Protein Sequencing

Main Page, Download, Copyright Notice
De novo sequencing of monoclonal antibodies is an important step in the drug discovery process when the cDNA or original cell line is not available, or when the characterization of unexpected post-translational modifications is needed to verify the integrity of the antibody. Despite being time-consuming, the fifty-year-old technique of Edman degradation has remained the primary tool for de novo protein sequencing. Here we demonstrate that Shotgun Protein Sequencing (SPS), a recently developed approach employing tandem mass spectrometry, represents a fast and accurate protein analysis technique with the potential to dramatically reduce the reliance on Edman degradation in the studies of unknown proteins. We illustrate the application of SPS for sequencing monoclonal antibodies and introduce Comparative Shotgun Protein Sequencing (CSPS) to assemble multiple protein contigs into complete antibodies using related antibodies as templates. We estimate that CSPS leads to a reduction of one to two orders of magnitude in protein sequencing effort compared to conventional Edman degradation approaches. Furthermore, rather than being hindered by post-translational modifications, this approach allows one to automatically discover unexpected modifications.

Inspect

Main Page, Download, Copyright Notice, Documentation
Inspect is a general-purpose database search algorithm, with an emphasis on efficiently and confidently identifying modified peptides. It includes special scoring models for phosphorylation, which allow for increased accuracy. In addition, Inspect implements the MS-Alignment algorithm for discovery of unanticipated modifications in blind mode.

MS-Clustering

Main Page, Download, Copyright Notice
Tandem mass spectrometry (MS/MS) experiments often generate redundant datasets containing multiple spectra of the same peptides. Clustering of MS/MS spectra takes advantage of this redundancy by identifying multiple spectra of the same peptide and replacing them with a single representative spectrum. Analyzing only the representative spectra significantly speeds up MS/MS database searches. For more details, see the downloadable zip file.
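
The idea of replacing redundant spectra with one representative can be illustrated with a minimal sketch. This is not the MS-Clustering implementation: the binning, the cosine similarity measure, the greedy assignment, the 0.8 threshold, and all function names below are assumptions chosen only for illustration.

```python
from collections import defaultdict
from math import sqrt

def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin_width is an assumption)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return dict(binned)

def cosine(a, b):
    """Cosine similarity between two binned spectra stored as {bin: intensity}."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster_spectra(spectra, threshold=0.8):
    """Greedy clustering: a spectrum joins the first cluster whose
    representative is similar enough, otherwise it starts a new cluster.
    The representative is the running sum of the binned member spectra."""
    clusters = []
    for idx, peaks in enumerate(spectra):
        binned = bin_spectrum(peaks)
        for cluster in clusters:
            if cosine(cluster["rep"], binned) >= threshold:
                cluster["members"].append(idx)
                for k, v in binned.items():
                    cluster["rep"][k] = cluster["rep"].get(k, 0.0) + v
                break
        else:
            clusters.append({"rep": dict(binned), "members": [idx]})
    return clusters

# Toy input: the first two spectra are near-duplicates, the third is unrelated.
spectra = [
    [(100.1, 10.0), (200.2, 20.0)],
    [(100.3, 11.0), (200.4, 19.0)],
    [(150.0, 5.0), (300.0, 50.0)],
]
clusters = cluster_spectra(spectra)
```

A downstream database search would then be run once per cluster representative instead of once per input spectrum, which is where the speed-up comes from.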

MS-Dictionary

Main Page
MS-Dictionary is a software tool that generates all plausible de novo interpretations of a tandem mass spectrum (a spectral dictionary) and quickly matches them against a protein database. It enables proteogenomic searches in six-frame translations of genomic sequences, which may be prohibitively time-consuming for existing database search approaches.
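
For readers unfamiliar with the term, a six-frame translation reads a DNA sequence in three offsets on each strand. The sketch below shows only that step, using the standard genetic code; it is not part of MS-Dictionary, and the function names and the `+1` ... `-3` frame labels are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate complete codons only; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frame_translation(dna):
    """Translate three reading frames on each strand of the sequence."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

frames = six_frame_translation("ATGGCCTAA")
```

Because every genomic position is translated six ways, the resulting protein database is much larger than an annotated proteome, which is why search speed matters in this setting.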

MS-GeneratingFunction

Main Page, Download, Copyright Notice, Documentation
MS-GF is a software tool for computing the generating function of a tandem mass spectrum. The generating functions and their derivatives represent new features of tandem mass spectra that improve peptide identifications. Further, they enable one to rigorously compute error rates of peptide identifications and improve the sensitivity-specificity trade-off of existing MS/MS search tools.
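
One way to picture a generating function of a spectrum is as the distribution of scores over all peptides with a given parent mass, computed by dynamic programming rather than by enumeration. The sketch below is a toy version under strong assumptions that are not taken from MS-GF: integer residue masses, an additive score attached to prefix masses, equal residue probabilities, and hypothetical function names.

```python
from collections import defaultdict

def score_distribution(node_scores, parent_mass, residue_masses, residue_prob):
    """Return {score: total probability of all peptides with that score}.

    node_scores maps an integer prefix mass to the integer score gained when
    a peptide has a prefix of that mass (assumed additive scoring model).
    """
    # table[m] maps score -> probability of residue strings with total mass m
    table = [defaultdict(float) for _ in range(parent_mass + 1)]
    table[0][0] = 1.0
    for mass in range(1, parent_mass + 1):
        gain = node_scores.get(mass, 0)
        for residue in residue_masses:
            prev = mass - residue
            if prev < 0:
                continue
            for score, prob in table[prev].items():
                table[mass][score + gain] += prob * residue_prob
    return dict(table[parent_mass])

def tail_probability(distribution, threshold):
    """Probability that a random peptide scores at least `threshold`."""
    return sum(p for s, p in distribution.items() if s >= threshold)

# Toy alphabet with two residues of mass 1 and 2, each with probability 0.5.
dist = score_distribution({1: 1, 2: 1}, parent_mass=3,
                          residue_masses=[1, 2], residue_prob=0.5)
p_best = tail_probability(dist, 2)
```

In this toy case only the peptide with residues (1, 1, 1) reaches score 2, so the tail probability at that score is 0.125. A tail probability of this kind is what allows an error rate to be attached to an individual peptide identification.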

PepNovo

Main Page, Download, Copyright Notice
PepNovo is a software tool for de novo sequencing of peptides from mass spectra. PepNovo uses a probabilistic network to model the peptide fragmentation events in a mass spectrometer. In addition, it uses a likelihood ratio hypothesis test to determine whether the peaks observed in the mass spectrum are more likely to have been produced under the fragmentation model than under a probabilistic model that treats the appearance of peaks as random events.
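
The likelihood ratio idea can be shown with a drastically simplified sketch. It treats each expected fragment peak as an independent event with a fixed probability, which is an assumption made here for brevity; PepNovo's probabilistic network is considerably richer, and the probabilities 0.7 and 0.1 and the function name are hypothetical.

```python
from math import log

def log_likelihood_ratio(observed, p_fragment=0.7, p_random=0.1):
    """Sum of per-peak log-likelihood ratios.

    observed: list of booleans, one per expected fragment ion, True when a
    matching peak is present in the spectrum.
    p_fragment: assumed chance of seeing the peak under the fragmentation model.
    p_random: assumed chance of seeing a peak there purely at random.
    """
    score = 0.0
    for seen in observed:
        if seen:
            score += log(p_fragment / p_random)
        else:
            score += log((1.0 - p_fragment) / (1.0 - p_random))
    return score

well_supported = log_likelihood_ratio([True, True, True, False])
poorly_supported = log_likelihood_ratio([False, False, True, False])
```

A positive score means the observed peaks are better explained by the fragmentation model than by chance; a negative score favors the random model.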

Shotgun Protein Sequencing

Download, Copyright Notice, Documentation
Analysis of MS/MS spectra from multiple overlapping peptides opens the possibility of assembling MS/MS spectra into entire proteins, similar to the assembly of DNA into genomes. This software recovers all or parts of the protein sequence through clustering, pairwise alignment, assembly and de novo interpretation of the input MS/MS spectra.

Spectral Networks

Main Page, Download, Copyright Notice, Documentation
Spectral networks are based on the idea of performing an MS/MS database search without comparing a spectrum against a database. Spectral networks capitalize on spectral pairs, which allow for the identification of prefix and suffix ladders and greatly reduce noise.

MS Top Down

Download, Copyright Notice
Recent advances in mass spectrometry instrumentation, such as FT-ICR and Orbitrap, have made it possible to generate high-resolution spectra of entire proteins. While these methods offer new opportunities for performing "top-down" studies of proteins, the computational tools for analyzing top-down data are still scarce. MS-TopDown is a new algorithm for sequencing such data. It implements a version of the Spectral Alignment algorithm specially suited for the problem of identifying protein forms in top-down mass spectra (i.e., identifying the modifications, mutations, insertions and deletions). MS-TopDown can efficiently discover protein forms even in the presence of numerous modifications, and it can also recover positional isomers from spectra of mixtures of isobaric protein forms.

Latest Releases

Inspect, MS-Alignment

2009.11.18

MS-GeneratingFunction

2008.09.04

PepNovo

2009.10.29

MS-Clustering

2008.06.09

MS-Dictionary

2007.11.30

Spectral Networks

Sept 2007

Copyright Notice

 

Media Coverage


Nonribosomal Peptide Dereplication and Sequencing (Scientific American, Genetic Engineering News, Natural Products Industry Insider and Genome Web Daily News)

A powerful tool for PTM discovery (Jan 2008, Journal of Proteome Research, Vol. 7, Issue 1)

From spectral networks to shotgun sequencing (June 2007, Nature Methods, Vol. 4 No. 6)

Identifying peptides without a database (May 2007, Journal of Proteome Research)

UCSD Computer Scientist Wins Young Investigator Award, Research on Snake Venom Proteins Highlighted (Nov 2006, UCSD)