NeuroPedia: Neuropeptide database and spectra library
Downloads
Browse NeuroPedia spectral libraries
Search your data using NeuroPedia
Contact: Yoona Kim [yok002 (at) ucsd.edu], Nuno Bandeira [bandeira (at) ucsd.edu]
Summary
Neuropeptides are essential for cell-cell communication in neurological and endocrine physiological processes in health and disease. While many neuropeptides have been identified in previous studies, the resulting data has not been structured to facilitate further analysis by tandem mass spectrometry (MS/MS), the main technology for high-throughput neuropeptide identification. Many neuropeptides are difficult to identify when searching MS/MS spectra against large protein databases because of their atypical lengths (e.g., shorter or longer than common tryptic peptides) and their lack of tryptic residues to facilitate peptide ionization/fragmentation. NeuroPedia is a neuropeptide encyclopedia of peptide sequences (including genomic and taxonomic information) and spectral libraries of identified MS/MS spectra of homologous neuropeptides from multiple species. Searching neuropeptide MS/MS data against known NeuroPedia sequences will improve the sensitivity of database search tools. Moreover, the availability of neuropeptide spectral libraries will enable the use of spectral library search tools, which are known to further improve the sensitivity of peptide identification. The libraries will also reinforce confidence in peptide identifications by enabling visual comparisons between new and previously identified neuropeptide MS/MS spectra.
[Reference] NeuroPedia: Neuropeptide database and spectra library.
Y. Kim, S. Bark, V. Hook, and N. Bandeira
(In submission)
Neuropeptide sequence databases
Sequence, taxonomic, and genomic information for 847 neuropeptides: Download in Excel format
Match types of pairs of sequences (340,725 pairs in total):
Types | Number of pairs | |
---|---|---|
Identical | 531 | Download |
Overlapping | 5,020 | Download |
Homolog | 9,185 | Download |
FASTA files per species:    
Species | Number of sequences | |
---|---|---|
Human | 270 | Download |
Rat | 195 | Download |
Mouse | 188 | Download |
Bovine | 154 | Download |
Rhesus macaque | 20 | Download |
Chimpanzee | 17 | Download |
California sea hare | 2 | Download |
Leech | 1 | Download |
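Before using one of the per-species FASTA files as a search database, it can help to confirm that the number of records matches the count listed in the table above. The following is a minimal Python sketch of a FASTA reader, assuming the files follow the usual convention of a `>` header line followed by one or more sequence lines. The record names and sequences in the example are hypothetical placeholders, not real NeuroPedia entries, and the file name in the final comment is an assumption.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped; join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical placeholder records for illustration only.
example_lines = """>placeholder_peptide_1
ACDEFGHIK
>placeholder_peptide_2
LMNPQ
RSTVWY
""".splitlines()

records = list(read_fasta(example_lines))
print(len(records))  # 2

# With a downloaded file (the file name here is an assumption):
# with open("human.fasta") as handle:
#     print(sum(1 for _ in read_fasta(handle)))
```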
Neuropeptide spectra
All 3,401 identified neuropeptide spectra:
Species | Instrument | Enzyme | Number of spectra from NIST | Number of spectra from In-house | Total Number of spectra | Browse spectra | Spectral library | Decoy library |
---|---|---|---|---|---|---|---|---|
Human | IT | trypsin | 385a | 221 | 606 | Browse | Download | Download |
Human | IT | v8 | 0 | 91 | 91 | Browse | Download | Download |
Human | IT | none | 0 | 1,630 | 1,630 | Browse | Download | Download |
Human | QTOF | trypsin | 41b | 454 | 495 | Browse | Download | Download |
Human | QTOF | v8 | 0 | 202 | 202 | Browse | Download | Download |
Human | QTOF | none | 0 | 160 | 160 | Browse | Download | Download |
Mouse | IT | trypsin | 67c | 0 | 67 | Browse | Download | Download |
Rat | IT | trypsin | 4 | 0 | 4 | Browse | Download | Download |
Bovine | IT | none | 0 | 145 | 145 | Browse | Download | Download |
Leech | QTOF | none | 0 | 1e | 1 | Browse | Download | Download |
a, b, c NIST library spectra (NIST_human_IT_2010-01-14, NIST_human_QTOF_2010-03-02, NIST_mouse_IT_2009-12-10, and NIST_rat_IT_v3.0_2009-05-21)
d Gupta et al. (2010) Mass Spectrometry-Based Neuropeptidomics of Secretory Vesicles from Human Adrenal Medullary Pheochromocytoma Reveals Novel Peptide Products of Prohormone Processing. J. Proteome Res., 9, 5065-5075.
e Bruand et al. (2011) Automated Querying and Identification of Novel Peptides using MALDI Mass Spectrometric Imaging. J. Proteome Res., Article ASAP.
Neuropeptide spectral libraries
527 neuropeptide spectra for unique peptide/charge-state pairs.
The spectral libraries, and the spectral libraries combined with decoy libraries, can be downloaded in High Quality (HQ) and Low Quality (LQ) versions.
Species | Instrument | Enzyme | Number of spectra from NIST | Number of spectra from In-house | Total Number of spectra for unique peptide/charge-state pairs | Total Number of spectra in HQ spectral library | Browse HQ spectra | HQ Spectral library | HQ Spectral library + Decoy library | Total Number of spectra in LQ spectral library | Browse LQ spectra | LQ Spectral library | LQ Spectral library + Decoy library |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Human | IT | trypsin | 296 | 68 | 364 | 303 | Browse | Download | Download | 61 | Browse | Download | Download |
Human | IT | v8 | 0 | 53 | 53 | 24 | Browse | Download | Download | 29 | Browse | Download | Download |
Human | IT | none | 0 | 121 | 121 | 54 | Browse | Download | Download | 67 | Browse | Download | Download |
Human | QTOF | trypsin | 37 | 109 | 146 | 50 | Browse | Download | Download | 96 | Browse | Download | Download |
Human | QTOF | v8 | 0 | 69 | 69 | 14 | Browse | Download | Download | 55 | Browse | Download | Download |
Human | QTOF | none | 0 | 44 | 44 | 13 | Browse | Download | Download | 31 | Browse | Download | Download |
Mouse | IT | trypsin | 60 | 0 | 60 | 42 | Browse | Download | Download | 18 | Browse | Download | Download |
Rat | IT | trypsin | 4 | 0 | 4 | 2 | Browse | Download | Download | 2 | Browse | Download | Download |
Bovine | IT | none | 0 | 33 | 33 | 24 | Browse | Download | Download | 9 | Browse | Download | Download |
Leech | QTOF | none | 0 | 1 | 1 | 1 | Browse | Download | Download | 0 | Browse | Download | Download |
All | IT | All | 360 | 275 | 635 | 449 | Browse | Download | Download | 186 | Browse | Download | Download |
All | QTOF | All | 37 | 223 | 260 | 78 | Browse | Download | Download | 182 | Browse | Download | Download |
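The relationship between the columns of this table can be checked with a little arithmetic. The Python sketch below copies the per-species rows from the table (the two "All" rows are left out because they are sums) and tests two assumptions read off the column headers: that the NIST and in-house counts add up to the total, and that the HQ and LQ libraries partition that total. The function names are illustrative only.

```python
# Per-species rows copied from the table above:
# (species, instrument, enzyme, nist, in_house, total, hq, lq)
ROWS = [
    ("Human", "IT", "trypsin", 296, 68, 364, 303, 61),
    ("Human", "IT", "v8", 0, 53, 53, 24, 29),
    ("Human", "IT", "none", 0, 121, 121, 54, 67),
    ("Human", "QTOF", "trypsin", 37, 109, 146, 50, 96),
    ("Human", "QTOF", "v8", 0, 69, 69, 14, 55),
    ("Human", "QTOF", "none", 0, 44, 44, 13, 31),
    ("Mouse", "IT", "trypsin", 60, 0, 60, 42, 18),
    ("Rat", "IT", "trypsin", 4, 0, 4, 2, 2),
    ("Bovine", "IT", "none", 0, 33, 33, 24, 9),
    ("Leech", "QTOF", "none", 0, 1, 1, 1, 0),
]


def inconsistent_rows(rows):
    """Return rows where NIST + in-house or HQ + LQ differs from the total."""
    bad = []
    for row in rows:
        _, _, _, nist, in_house, total, hq, lq = row
        if nist + in_house != total or hq + lq != total:
            bad.append(row)
    return bad


def column_totals(rows):
    """Sum the total, HQ, and LQ columns over all rows."""
    return (
        sum(r[5] for r in rows),
        sum(r[6] for r in rows),
        sum(r[7] for r in rows),
    )


print(inconsistent_rows(ROWS))  # []
print(column_totals(ROWS))      # (895, 527, 368)
```

Under these assumptions every row is consistent. The HQ column sums to 527, which matches the figure quoted at the top of this section, while the HQ and LQ libraries together contain 895 spectra.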
Browsing NeuroPedia Spectral library

- a. Use the column headers at the top of the table ('Filename', 'Scan', 'Peptide', 'Protein', 'Charge', 'Score', 'ΔScore', 'FDR', 'PepFDR' and 'FPR') to:
- - Sort by the values in a column (alphabetically or numerically) using the small up and down triangles next to the column title.
- - Filter by specific values using the text boxes below the column headers.
- b. Use 'checked only' to show only the spectra selected in the first column of the table.


Searching your data using NeuroPedia
M-SPLIT search + FDR:
- Download the NeuroPedia spectral library + decoy library from the 'HQ Spectral library + Decoy library' or 'LQ Spectral library + Decoy library' column of the 'Neuropeptide spectral libraries' table above.
- Download the M-SPLIT package from here: [http://proteomics.ucsd.edu/Software/MSPLIT/#Downloads]
- Run M-SPLIT as follows: java -Xmx800M -jar MSPLIT_v1.0.jar < NeuroPedia library file > < Your spectrum file > < precursor mass tolerance > < output file >
- - This searches the library and finds the library spectra that best match each query spectrum.
- - < Your spectrum file > can be in mgf or mzXML format.
- - < precursor mass tolerance > is in Da. A relatively large tolerance such as 2 Da is usually advisable, even if the data was acquired with high-accuracy MS survey scans, to allow for 13C errors in the determination of monoisotopic masses.
- - < output file > is a text file. If this parameter is omitted, the output is written to standard output.
- After the search, M-SPLIT uses an SVM to determine whether a match is significant:
- - SVM classification is done using the svm-light package. The binaries can be obtained at http://svmlight.joachims.org/
- - Place the binary files (svm_learn and svm_classify) in the svm_light_linux or svm_light_window subdirectory of the M-SPLIT directory, depending on which operating system is being used.
- - Use the spectrumMatchClassify.pl script to score spectrum/spectra matches and to estimate FDR by the target/decoy method. Run the script as follows: ./spectrumMatchClassify.pl < search result file > < filteredOutputFile > < FDR >
- - < FDR > is a number between zero and one; e.g., to enforce 1% FDR, use 0.01.
- For more detailed information about M-SPLIT, please see: [http://proteomics.ucsd.edu/Software/MSPLIT/]
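The M-SPLIT steps above can be sketched in Python in two parts. The first function only assembles the command line given in this section; it does not run it, and the file names in the example are hypothetical. The second function illustrates the general idea behind target/decoy FDR estimation: rank matches by score and keep the longest prefix in which the ratio of decoy to target matches stays at or below the requested FDR. This is a generic illustration under the assumption that each match carries a score (higher is better) and a decoy flag. It is not the algorithm implemented in spectrumMatchClassify.pl, which relies on SVM classification, and it does not treat tied scores specially.

```python
def build_msplit_command(library, spectra, tolerance_da=2.0, output=None,
                         jar="MSPLIT_v1.0.jar"):
    """Assemble the M-SPLIT command line as a list of arguments."""
    cmd = ["java", "-Xmx800M", "-jar", jar, library, spectra, str(tolerance_da)]
    if output is not None:
        # Without an output file, M-SPLIT writes to standard output.
        cmd.append(output)
    return cmd


def filter_by_fdr(matches, max_fdr):
    """Keep target matches from the longest top-ranked prefix whose
    estimated FDR (decoys / targets) does not exceed max_fdr.

    matches: iterable of (score, is_decoy) tuples; higher score is better.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked matches to keep
    for index, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = index
    return [m for m in ranked[:cutoff] if not m[1]]


# Hypothetical file names, for illustration only.
print(build_msplit_command("neuropedia_library_file", "my_spectra.mgf",
                           2.0, "results.txt"))

# Hypothetical scores; True marks a match to a decoy library spectrum.
example = [(0.95, False), (0.90, False), (0.85, False),
           (0.80, True), (0.75, False), (0.60, True)]
print(len(filter_by_fdr(example, 0.0)))   # 3
print(len(filter_by_fdr(example, 0.25)))  # 4
```

The argument list returned by `build_msplit_command` can be passed to `subprocess.run` once the jar and input files are in place.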
InsPecT Search:
- Download the sequence FASTA files from the 'FASTA files per species' table in the 'Neuropeptide sequence databases' section above.
- Go to the CCMS website at http://proteomics.ucsd.edu/ProteoSAFe
- Log on to the website. If you don't have an account, registration is free and straightforward via the link at the top right.
- In 'Tool Selection':
- a. Select 'InsPecT' as the search tool.
- b. Upload spectrum and database files by clicking 'Select Input Files'.
- - In the 'Upload Files' tab, upload one or more of your spectrum files and the FASTA files downloaded from NeuroPedia.
- - In the 'Select Input Files' tab, select the spectrum files in the left panel and click the small 'Select Spectrum Files' button to add them to the 'Selected files' panel on the right.
- - Likewise, select the FASTA files in the left panel and click the small 'Select Sequence Files' button to add them to the right panel.
- - Click 'Finish Selection' when you have finished selecting all files for the search.
- c. Enter a short description of your search.
- d. Adjust your search parameters in 'Instrument', 'Cysteine protecting group', 'Protease', 'Parent mass tolerance', and 'Ion tolerance'.
- In 'Allowed Post-Translational Modifications', choose the post-translational modifications to consider.
- In 'More options', check 'Include common contaminants' and set 'Spectrum-Level FDR' to your desired false discovery rate.
- Click the 'Search' button at the bottom of the page.
- The page will continue to be updated automatically until the search completes. You will also receive an email notification at your registered email address once the search is finished or fails for some unexpected reason.


