Cancer Proteogenomics Tools

Contacts

Sunghee Woo [suwoo (at) ucsd.edu]

Summary

The advent of inexpensive RNA-seq and other deep sequencing technologies for RNA promises to radically improve genomic annotation by providing information on transcribed regions and splicing events under a variety of cellular conditions. Using MS-based proteogenomics, many of these events can be confirmed directly at the protein level. However, integrating large amounts of redundant RNA-seq data with mass spectrometry data is a challenging problem. Our tool addresses this by constructing a compact database that contains all useful information expressed in the RNA-seq reads.

  • Input: RNA-seq file (.sam / .spl / .vcf)
  • Output: FASTA database (.spl / .ms2db files are also provided for future reference)
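
To illustrate why redundant RNA-seq reads can be reduced to a compact representation, the following is a minimal Python sketch that collapses spliced alignments from SAM records into unique splice junctions with read-support counts. It is not the tool's actual algorithm or file format: the function names (junctions_from_sam_line, collapse_junctions) and the example reads are hypothetical, and the sketch relies only on the standard SAM column layout and CIGAR operations.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by a single operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def junctions_from_sam_line(line):
    """Return (chrom, intron_start, intron_end) tuples for one SAM alignment line.

    Coordinates are 1-based and inclusive, following the SAM POS convention.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
    if chrom == "*" or cigar == "*":
        return []  # unmapped read: no junction information
    junctions = []
    ref = pos  # reference coordinate of the next base to be consumed
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # 'N' marks a skipped reference region, i.e. a putative intron.
            junctions.append((chrom, ref, ref + length - 1))
        if op in "MDN=X":
            ref += length  # only these operations consume reference bases
    return junctions


def collapse_junctions(sam_lines):
    """Count how many alignments support each distinct junction."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        counts.update(junctions_from_sam_line(line))
    return counts


# Hypothetical example: three reads support the same junction, one is unspliced.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t10M50N10M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t100\t60\t10M50N10M\t*\t0\t0\t*\t*",
    "r3\t0\tchr1\t105\t60\t5M50N15M\t*\t0\t0\t*\t*",
    "r4\t0\tchr2\t200\t60\t20M\t*\t0\t0\t*\t*",
]
junction_counts = collapse_junctions(example)
for junction, support in sorted(junction_counts.items()):
    print(junction, support)
```

In this toy input, three overlapping reads reduce to a single junction entry, which is the kind of redundancy a compact database can exploit.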

Documentation

Manual

Publications

Proteogenomic database construction driven from large scale RNA-seq data. Woo S, Cha SW, Merrihew G, He Y, Castellana N, Guest C, MacCoss M, Bafna V. J Proteome Res. 2014 Jan 3;13(1):21-8. doi: 10.1021/pr400294c. Epub 2013 Jul 17.