MS-Cluster/Archives

Summary

Tandem mass spectrometry (MS/MS) experiments often generate redundant data sets that contain multiple spectra of the same peptide. Clustering MS/MS spectra exploits this redundancy by identifying multiple spectra of the same peptide and replacing them with a single representative spectrum. Analyzing only the representative spectra significantly speeds up MS/MS database searches. MS-Cluster is an efficient clustering approach for large MS/MS data sets that can reduce the number of spectra submitted for further analysis by an order of magnitude. Compared with regular non-clustered searches, a database search of clustered spectra yields fewer spurious database hits and increases the number of peptide identifications.
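
To make the idea of replacing redundant spectra with one representative concrete, here is a minimal sketch in Python. It is not the MS-Cluster algorithm itself: it assumes a simple greedy scheme in which each spectrum is binned by m/z, compared to existing clusters by cosine similarity, and merged into the best match above a threshold. The function names, the bin width, the similarity threshold, and the precursor tolerance are all illustrative assumptions.

```python
import math
from collections import defaultdict

# All three values below are illustrative assumptions, not MS-Cluster defaults.
BIN_WIDTH = 1.0       # m/z bin width in Da
SIM_THRESHOLD = 0.7   # minimum cosine similarity to join a cluster
PRECURSOR_TOL = 2.0   # maximum precursor m/z difference in Da


def unit(vec):
    """Scale a sparse vector (dict of bin -> intensity) to unit length."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm > 0 else {}


def bin_spectrum(peaks, bin_width=BIN_WIDTH):
    """Turn a list of (m/z, intensity) peaks into a unit-length sparse vector."""
    vec = defaultdict(float)
    for mz, intensity in peaks:
        vec[int(mz / bin_width)] += intensity
    return unit(vec)


def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def cluster_spectra(spectra, threshold=SIM_THRESHOLD, precursor_tol=PRECURSOR_TOL):
    """Greedily cluster spectra given as (precursor_mz, peaks) pairs.

    Each cluster keeps the indices of its members and the running sum of
    their binned vectors; the normalized sum acts as the representative.
    """
    clusters = []
    for idx, (precursor_mz, peaks) in enumerate(spectra):
        vec = bin_spectrum(peaks)
        best, best_sim = None, threshold
        for cluster in clusters:
            # Only compare spectra whose precursor m/z values are close.
            if abs(cluster["precursor_mz"] - precursor_mz) > precursor_tol:
                continue
            sim = cosine(vec, unit(cluster["sum"]))
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append(
                {"precursor_mz": precursor_mz, "members": [idx], "sum": dict(vec)}
            )
        else:
            best["members"].append(idx)
            for k, v in vec.items():
                best["sum"][k] = best["sum"].get(k, 0.0) + v
    return clusters


# Toy input: two near-identical spectra and one unrelated spectrum.
example = [
    (500.3, [(100.1, 10), (200.2, 50), (300.3, 20)]),
    (500.3, [(100.1, 12), (200.2, 48), (300.3, 22)]),
    (612.8, [(150.0, 30), (250.0, 5), (400.5, 40)]),
]
clusters = cluster_spectra(example)
representatives = [unit(c["sum"]) for c in clusters]
```

In this toy input the first two spectra fall into one cluster and the third forms its own, so only two representative spectra would be passed on to a database search instead of three. A greedy single pass is order-dependent and is used here only because it is short; it is an assumption of this sketch, not a description of how MS-Cluster works.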

Publications

Clustering millions of tandem mass spectra. Frank AM, Bandeira N, Shen Z, Tanner S, Briggs SP, Smith RD, Pevzner PA. J Proteome Res. 2008 Jan;7(1):113-22. Epub 2007 Dec 8.
Spectral archives: extending spectral libraries to analyze both identified and unidentified spectra. Frank AM, Monroe ME, Shah AR, Carver JJ, Bandeira N, Moore RJ, Anderson GA, Smith RD, Pevzner PA. Nat Methods. 2011 May 15;8(7):587-91. doi: 10.1038/nmeth.1609.