Software Tools & Datasets
SpliceDB: Tool web page, ProteoSAFe workflow (beta)
ENOSI: Tool web page, ProteoSAFe workflow (beta)
Key Publications
Discovery of Aberrant Cancer Genes and Revealing Antibody Repertoires
The central dogma of biology is that DNA contains the code for making proteins. Traditionally, genomic and computational (gene-finding) methods have been used to predict genes and their encoded proteins, while mass spectrometry has been used to validate and quantify the proteins expressed in samples. However, the human genome is at best an incomplete template for protein synthesis. Mutations change the amino-acid sequence of proteins. Exons splice together in different ways to make new protein products. Recombination, splicing, and non-templated insertions are used to make antibody proteins. Large structural variations delete, insert, translocate, and duplicate genomic fragments, changing gene structure and copy number.
CCMS is pioneering the use of proteogenomic techniques to identify expressed protein sequences: tandem mass spectra of expressed peptides are searched against customized databases built from genomic information, which serves as a partial template. CCMS-developed proteogenomic annotations are now routinely used to annotate model organisms. However, next-generation sequencing has greatly expanded the known landscape of proteome variation, both within a population and in diseases like cancer.
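To make the idea of a customized genomic search database concrete, here is a minimal sketch in Python. It assumes the simplest generic construction: six-frame translation of a DNA sequence followed by in silico tryptic digestion. This is an illustration only, not the SpliceDB or ENOSI method; all function names (`six_frame`, `tryptic_peptides`, `build_database`) and the `min_len` cutoff are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame(dna):
    """Translate all three reading frames on both strands."""
    dna = dna.upper()
    frames = []
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            frames.append(translate(strand[offset:]))
    return frames


def tryptic_peptides(protein, min_len=6):
    """Split at stop codons, then cleave after K/R unless followed by P."""
    peptides = []
    for orf in protein.split("*"):
        start = 0
        for i, aa in enumerate(orf):
            at_end = i == len(orf) - 1
            cleave = aa in "KR" and not at_end and orf[i + 1] != "P"
            if cleave or at_end:
                peptide = orf[start:i + 1]
                if len(peptide) >= min_len:
                    peptides.append(peptide)
                start = i + 1
    return peptides


def build_database(dna, min_len=6):
    """Collect the unique candidate peptides from all six frames."""
    return {
        peptide
        for frame in six_frame(dna)
        for peptide in tryptic_peptides(frame, min_len)
    }
```

A search engine would then score each tandem mass spectrum against the peptides in such a set. Real pipelines typically use more compact representations (for example, splice graphs or variant-aware databases), because a raw six-frame translation of a large genome inflates the search space considerably.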
Recent research has shown tremendous plasticity in cancer genomes. Sequencing projects such as The Cancer Genome Atlas (TCGA) have suggested large structural abnormalities in tumor genomes, with a consequent impact on the expressed transcriptome and proteome. Validating these aberrant (or still unannotated) genes remains challenging because of erroneous and variable data, big (petabyte-scale) data sets, and, importantly, the lack of direct protein-level confirmation. Our current focus is on mass spectrometric validation of these gene aberrations.