Contacts
Natalie Castellana [ncastell(at)cs.ucsd.edu]
Summary
InsPecT is an MS/MS database search tool designed to address two crucial needs of the proteomics community: post-translational modification identification and search speed. The program is available as a free download or online through our ProteoSAFe web server. The online interface is coordinated with other proteomics software developed in the lab, such as PepNovo.
Typical database searches do not deal well with the dynamic nature of the proteome. Post-translational modifications, alternative splicing, and laboratory chemistry all affect protein behavior and make spectrum interpretation more challenging. The primary challenge is that the “virtual database” of all modified peptides undergoes a combinatorial explosion when a broad range of modifications is allowed, which increases search running time. A secondary challenge is that this richer database contains many more close “relatives” for each peptide. This reduces scoring accuracy, since it becomes harder to distinguish correct identifications from incorrect ones.
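To give a feel for the size of this “virtual database”, here is a minimal Python sketch that counts the modified forms of a single peptide. It is an illustration only, not InsPecT code: the modification list MODS, the function count_variants, and the rule of at most one modification per residue are all assumptions made for the example.

```python
from itertools import combinations
from math import prod  # requires Python 3.8+

# Hypothetical modification list: name -> residues it may attach to.
MODS = {
    "phospho": "STY",
    "oxidation": "MW",
    "methyl": "KR",
    "acetyl": "K",
}


def count_variants(peptide, mods, max_mods):
    """Count the distinct forms of `peptide` carrying at most `max_mods`
    modifications, assuming at most one modification per residue."""
    # For each residue, count how many modifications could attach to it,
    # then keep only the residues that can be modified at all.
    options = [
        sum(1 for targets in mods.values() if residue in targets)
        for residue in peptide
    ]
    options = [n for n in options if n > 0]

    total = 0
    for k in range(max_mods + 1):
        # Choose k modifiable sites; each site contributes its own number
        # of possible modifications. The empty choice (k = 0) counts the
        # unmodified peptide once.
        for sites in combinations(options, k):
            total += prod(sites)
    return total
```

For the five-residue peptide "MKSTR", this gives 7 forms with at most one modification, 21 with at most two, and 48 when every residue may be modified. The count grows quickly with both peptide length and the number of modification types, and every protein in the database is multiplied in the same way.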
InsPecT addresses several algorithmic problems in order to identify modified proteins.
InsPecT uses peptide sequence tags (PSTs) to filter the database. InsPecT has an internal tag generator, but can also accept tags generated by other tools (e.g. PepNovo, GutenTag). Because de novo interpretation is imperfect, multiple tags are produced for each spectrum, to ensure that at least one tag is correct. These PSTs are extremely efficient filters, even in the context of up to a dozen possible modifications. Tag-based filtering can also be combined with the “two-pass” filtering pioneered by X!Tandem, where one search provides a list of proteins (a mini-database) for a more detailed second search.
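The idea of tag-based filtering can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not InsPecT's implementation: tags are treated as plain amino-acid substrings (the flanking-mass checks a real tag search would need are ignored), and the function name filter_by_tags and the toy database are hypothetical.

```python
def filter_by_tags(database, tags):
    """Return {protein_id: [(tag, offset), ...]} for every protein that
    contains at least one of the tags as a substring.

    database: dict mapping a protein id to its amino-acid sequence.
    tags: short sequence tags proposed for a single spectrum.
    """
    hits = {}
    for protein_id, sequence in database.items():
        for tag in tags:
            # Record every occurrence of the tag, including overlapping ones.
            start = sequence.find(tag)
            while start != -1:
                hits.setdefault(protein_id, []).append((tag, start))
                start = sequence.find(tag, start + 1)
    return hits


# Toy database and tag list, both made up for this example.
database = {
    "P1": "MKTAYIAKQR",
    "P2": "GGSLVEALYL",
    "P3": "AAAKTAGG",
}
tags = ["TAY", "LVE", "QQQ"]  # "QQQ" stands in for an incorrect tag

hits = filter_by_tags(database, tags)

# Mini-database in the spirit of two-pass filtering: only proteins that
# matched a tag are kept for a more detailed second search.
mini_database = {pid: database[pid] for pid in hits}
```

Only the locations in hits need to be scored against the spectrum, so an incorrect tag costs little as long as one of the other tags is right. Here P3 is dropped without any scoring, and mini_database holds just P1 and P2.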
Unanticipated modifications are common in proteomics. InsPecT implements the MS-Alignment algorithm for “blind” spectral search, with no bias toward anticipated modification types. This search has been applied to annotate heavily-modified proteins such as crystallins.
The InsPecT distribution includes a script (PTMAnalysis.py) implementing the PTMFinder procedure for analysis of unrestrictive modification results. This procedure allows for the accurate scoring of PTMs, and for the calculation of a false discovery rate.
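This page does not say how PTMAnalysis.py computes its false discovery rate. As a rough illustration of the general idea only, the Python sketch below uses the widely used target-decoy estimate, in which matches to a decoy (e.g. shuffled or reversed) database stand in for incorrect matches. The function name, the input layout, and the scores are assumptions made for this example and are not taken from the script.

```python
def fdr_at_threshold(matches, threshold):
    """Estimate the false discovery rate among matches scoring at or above
    `threshold`, as (decoy matches) / (target matches).

    matches: list of (score, is_decoy) pairs, one per identification.
    """
    targets = sum(
        1 for score, is_decoy in matches if score >= threshold and not is_decoy
    )
    decoys = sum(
        1 for score, is_decoy in matches if score >= threshold and is_decoy
    )
    if targets == 0:
        return 0.0  # nothing accepted, so there is nothing to be wrong about
    return decoys / targets


# Made-up scores for illustration only.
matches = [
    (9.1, False),
    (8.4, False),
    (7.9, True),
    (7.5, False),
    (6.0, True),
    (5.2, False),
]
```

With these scores, a threshold of 8.0 accepts two target matches and no decoys (estimated rate 0), while a threshold of 7.0 accepts three targets and one decoy (estimated rate 1/3). In practice the threshold is lowered until the estimate reaches the rate one is willing to tolerate.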
Documentation
Documentation is included in the software package, and is also available online. Also, a tutorial covering the MS2DB file format is available (as presented at Algorithmic Biology 2006).