Computational Mass Spectrometry

The computational mass spectrometry group, headed by Professors Vineet Bafna and Pavel Pevzner, develops algorithms for processing mass spectrometry data. Our lab has developed a number of tools for computational proteomics, each with its own purpose and setting. These tools are free to download and are also integrated into a web server.

News

Nonribosomal Peptides Dereplication and Sequencing (August 7, 2009)

Nonribosomal peptides (NRPs) are of great pharmacological importance, but there is currently no technology for high-throughput NRP 'dereplication' and sequencing. We used multistage mass spectrometry followed by spectral alignment algorithms for sequencing of cyclic NRPs. We also developed an algorithm for comparative NRP dereplication that establishes similarities between newly isolated and previously identified similar but nonidentical NRPs, substantially reducing dereplication efforts. The homepage for this project can be found here.

Arabidopsis Proteogenomics (January 8, 2009)

Our study of the Arabidopsis proteome through tandem mass spectrometry revealed over 18,000 novel peptides not in the TAIR7 genome annotation release. Using Inspect, we identified over 144,000 peptides from three sequence databases: the six-frame translation of the Arabidopsis genome, an exon graph based on ab initio gene predictions, and the TAIR7 proteome. From the novel peptides we predicted over 700 new gene models and over 600 corrections to current gene models. The peptides and predicted models can be accessed here.
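For readers unfamiliar with the term, a six-frame translation reads a DNA sequence in three offsets on each strand and translates every reading into amino acids. The following is a minimal Python sketch of that idea using the standard genetic code. The function names are illustrative assumptions and are not taken from Inspect or from the study's pipeline.

```python
# Standard genetic code, built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return a dict mapping frame labels ('+1'..'+3', '-1'..'-3') to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translation("ATGGCC").items():
        print(label, protein)
```

Stop codons appear as `*` in the output. A search over such a translation would typically look for peptides within the stretches between stop codons, although how any particular tool handles this is not described here.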

Multistage mass spectrometry (May 29, 2008)

Multistage mass spectrometry (collecting multiple MS^3 spectra from each MS^2 spectrum) and accurate precursor masses (but inaccurate fragment masses) have been demonstrated to lead to significant gains in peptide identification via database search but have had a limited impact in de novo peptide sequencing. Our Multi-stage Spectral Networks package addresses both of these in a rigorous probabilistic framework for analyzing spectra of overlapping peptides, resulting in both accurate de novo peptide sequencing from multistage mass spectra (despite the inferior quality of MS^3 spectra) and improved interpretation of spectral networks. Additional details and the open-source package are available here.

Phosphate Localization Score (April 4, 2008)

Phosphate Localization Score is an algorithm that determines the confidence of the placement of a phosphate on a given residue. The method is similar to the AScore and is described in Albuquerque et al., Mol Cell Proteomics 2008. The program is integrated with the Inspect package, which is available for download here. A tutorial for using the program is in the Inspect documentation, here.

MS-Dictionary (November 30, 2007)

MS-Dictionary is software that generates all plausible de novo interpretations of a tandem mass spectrum (a spectral dictionary) and quickly matches them against a protein database. It enables proteogenomic searches in six-frame translations of genomic sequences that may be prohibitively time-consuming for existing database search approaches.

MS-GeneratingFunction (November 28, 2007)

MS-GF is software for computing the generating function of a tandem mass spectrum. The generating functions and their derivatives represent new features of tandem mass spectra that improve peptide identifications. Further, they enable one to rigorously compute error rates of peptide identifications and to obtain a better sensitivity-specificity trade-off for existing MS/MS search tools.
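One way to picture a generating function of a spectrum is as the distribution of scores over all peptides whose total mass equals the parent mass. The toy Python sketch below computes that distribution by dynamic programming. It assumes integer residue masses, a hypothetical per-mass peak score, and a peptide score equal to the sum of peak scores at its prefix masses. None of this is taken from the MS-GF code, and the real scoring model is more elaborate.

```python
from collections import Counter


def score_distribution(parent_mass, peak_scores, residue_masses):
    """Count peptides of total mass `parent_mass` by score.

    peak_scores: dict mapping an integer prefix mass to a score (assumed input).
    residue_masses: dict mapping a residue letter to its integer mass.
    Returns a Counter mapping score -> number of peptides with that score.
    """
    # dist[m][s] = number of residue sequences with total mass m and score s.
    dist = [Counter() for _ in range(parent_mass + 1)]
    dist[0][0] = 1  # the empty sequence
    for m in range(1, parent_mass + 1):
        gain = peak_scores.get(m, 0)  # score earned by having a prefix of mass m
        for mass in residue_masses.values():
            if m - mass >= 0:
                for score, count in dist[m - mass].items():
                    dist[m][score + gain] += count
    return dist[parent_mass]


def fraction_at_or_above(dist, threshold):
    """Fraction of counted peptides whose score is at least `threshold`."""
    total = sum(dist.values())
    if total == 0:
        return 0.0
    return sum(c for s, c in dist.items() if s >= threshold) / total


if __name__ == "__main__":
    toy_residues = {"G": 57, "A": 71}  # toy two-letter alphabet
    dist = score_distribution(128, {57: 1}, toy_residues)
    print(dict(dist), fraction_at_or_above(dist, 1))
```

In this toy setting, the fraction of peptides scoring at or above an observed score plays the role of an error-rate estimate for that identification. Treating all peptides as equally likely is a simplifying assumption of the sketch.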

MS-Clustering (November 26, 2007)

MS-Clustering is a new program aimed at improving the analysis of large MS/MS datasets by removing many of their redundant or low-quality spectra. MS-Clustering is capable of reducing the number of spectra submitted for analysis from a large dataset of 10+ million spectra by 90% while increasing the number of peptide/protein identifications by up to 10%.
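To illustrate how redundant spectra might be grouped, here is a small Python sketch that bins peaks by m/z, compares spectra by cosine similarity, and greedily assigns each spectrum to the first sufficiently similar cluster. This is a generic technique chosen for illustration. The bin width, the threshold, and all function names are assumptions, and the sketch is not the algorithm used by MS-Clustering.

```python
import math


def binned(spectrum, bin_width=1.0):
    """Sum peak intensities into m/z bins; spectrum is a list of (mz, intensity)."""
    vec = {}
    for mz, intensity in spectrum:
        b = int(mz / bin_width)
        vec[b] = vec.get(b, 0.0) + intensity
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(b, 0.0) for b, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def greedy_cluster(spectra, threshold=0.8):
    """Group spectra by similarity to each cluster's first member.

    Returns a list of clusters, each a list of indices into `spectra`.
    """
    clusters = []  # pairs of (representative vector, member indices)
    for index, spectrum in enumerate(spectra):
        vec = binned(spectrum)
        for representative, members in clusters:
            if cosine(representative, vec) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((vec, [index]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    spectra = [
        [(100.2, 10.0), (200.4, 5.0)],
        [(100.3, 9.0), (200.1, 6.0)],
        [(150.0, 7.0), (300.0, 3.0)],
    ]
    print(greedy_cluster(spectra))
```

Keeping one representative per cluster is what reduces the number of spectra submitted for a downstream search. A practical tool would also need to consider precursor mass and spectrum quality, which this sketch ignores.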

Spectral Networks (October 24, 2007)

Spectral networks are a novel approach to the identification of MS/MS spectra that detects and combines spectra from overlapping peptides or modified variants of the same peptide. This approach allows for the blind identification of unexpected post-translational modifications and highly modified peptides. The spectral networks software package is now available in open-source and Windows-binary versions.
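The intuition behind pairing a peptide's spectrum with that of a modified variant can be sketched in a few lines of Python: peaks before the modification site line up directly, while peaks after it are shifted by the difference in parent mass. The function below counts peaks that match under either condition. The name, the tolerance, and the simple counting rule are assumptions for illustration and do not reproduce the spectral networks package.

```python
def shifted_match_count(peaks_a, peaks_b, parent_a, parent_b, tol=0.5):
    """Count peaks in A that match a peak in B directly or after a mass shift.

    peaks_a, peaks_b: lists of peak masses.
    parent_a, parent_b: parent masses; their difference is the candidate
    modification mass applied as the shift.
    """
    delta = parent_b - parent_a
    matched = 0
    for mass in peaks_a:
        if any(
            abs(mass - other) <= tol or abs(mass + delta - other) <= tol
            for other in peaks_b
        ):
            matched += 1
    return matched


if __name__ == "__main__":
    unmodified = [100.0, 200.0, 300.0]
    modified = [100.0, 280.0, 380.0]  # last two peaks shifted by 80.0
    print(shifted_match_count(unmodified, modified, 400.0, 480.0))
```

A pair of spectra with many such matches and a nonzero parent mass difference is a candidate for an unmodified/modified pair, with the difference suggesting the modification mass. No modification list is needed in advance, which is the sense in which such a search is blind.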

PepNovo (October 4, 2007)

A new version of PepNovo has been released. It contains optional quality filtering and models for several MS instrument types.

Web server (July 25, 2007)

The web server hosting all of our software is up and running. Users may sign up for an account and search spectra. The server posts jobs to a large compute grid.

Phosphorylation search (July 10, 2007)

Inspect has been trained to score phosphorylated MS/MS spectra. The new scoring function was trained on spectra from LTQ instruments and works well.

Proteogenomics Consortium

HHMI supports the Bioinformatics [Under]graduate Research Consortium in Comparative Proteogenomics at UCSD. Proteogenomics is a new research area that uses whole-genome MS/MS datasets to better characterize genomic and proteomic annotations on a global scale. The consortium gives undergraduates and new graduate students an opportunity to gain hands-on research experience with real, unsolved bioinformatics problems in this emerging field. More information.

Latest Releases

Inspect, MS-Alignment: 2009.11.18

MS-GeneratingFunction: 2008.09.04

PepNovo: 2009.10.29

MS-Clustering: 2008.06.09

MS-Dictionary: 2007.11.30

Spectral Networks: Sept 2007

Media Coverage


Nonribosomal Peptide Dereplication and Sequencing (Scientific American, Genetic Engineering News, Natural Products Industry Insider and Genome Web Daily News)

A powerful tool for PTM discovery (Jan 2008, Journal of Proteome Research, Vol. 7, Issue 1)

From spectral networks to shotgun sequencing (June 2007, Nature Methods, Vol. 4 No. 6)

Identifying peptides without a database (May 2007, Journal of Proteome Research)

UCSD Computer Scientist Wins Young Investigator Award, Research on Snake Venom Proteins Highlighted (Nov 2006, UCSD)