java -jar GenoMS.jar

Required Parameters:
  -i [FILE] The input configuration file. More details can be found below.
  -o [FILE] The output file.

Optional Parameters:
  -r [DIR] The resource directory to look for things like DBs
  -k [DIR] Project directory to write all output
  -e [DIR] An execution directory where all executables can be found (e.g. InsPecT, MSGF.jar, PepNovo). This is only needed if the full path is not provided in the configuration file.
  -x Force a rerun of database searches
  -p Use a missing peak penalty when scoring HMM Match states
  -s Require all2all similarity for Multiple Spectrum Alignment
  -a Continue to add eligible match states, even after all spectra are added to HMM
  -w [NUM] FDR cutoff for peptide-spectrum matches
  -f Search database allowing mutations
  -l [FILE] Write output to a log file (default is to stdout)
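The options above can be assembled programmatically. The following is a minimal sketch, not part of GenoMS itself: the helper name `build_genoms_command` and the file names `run1.config` and `run1.out` are hypothetical, and only the flags listed in the usage text are used.

```python
import shlex

def build_genoms_command(config_file, output_file, *, resource_dir=None,
                         project_dir=None, exec_dir=None, fdr_cutoff=None,
                         log_file=None, switches=()):
    """Assemble the argument list for a GenoMS run (hypothetical helper)."""
    # Required parameters: -i (configuration file) and -o (output file).
    cmd = ["java", "-jar", "GenoMS.jar", "-i", config_file, "-o", output_file]
    # Optional parameters that take a value.
    valued = (("-r", resource_dir), ("-k", project_dir), ("-e", exec_dir),
              ("-w", fdr_cutoff), ("-l", log_file))
    for flag, value in valued:
        if value is not None:
            cmd += [flag, str(value)]
    # Optional switches that take no value.
    allowed = {"-x", "-p", "-s", "-a", "-f"}
    for flag in switches:
        if flag not in allowed:
            raise ValueError(f"unknown switch: {flag}")
        cmd.append(flag)
    return cmd

# Hypothetical file names, for illustration only.
cmd = build_genoms_command("run1.config", "run1.out",
                           fdr_cutoff=0.01, switches=["-x", "-f"])
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` from a directory in which `GenoMS.jar` is available.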
inspectspectra,SPECTRUM_FILE,INSPECT_INPUT_FILE OR spectra,SPECTRUM_FILE |
You may use either InsPecT or MSGF-DB to perform peptide identification. If you are using InsPecT, you must use the inspectspectra parameter and specify both the full path to the spectrum file and the path to the InsPecT input file for that spectrum file. See Preprocessing for more information on creating this file. If you are using MSGF-DB, use the spectra parameter; you need only specify the full path to the spectra. |
prms,PRM_FILE_NAME OR pepnovoexec,PEPNOVO_EXECUTABLE AND pepnovomodeldir,PEPNOVO_MODEL_DIR |
For each PRM file created by PepNovo, you must specify the full path to the file. For more information, see Preprocessing. Alternatively, you can specify the path to PepNovo, and GenoMS will generate the PRMs using the default parameters and those specified in the config file. Specifically, files are assumed to be from an LTQ machine, and the digest is inferred from the spectrum file name (must include 'tryp' or 'trypsin' to be identified as a tryptic digest). |
inspectpath,PATH_INSPECT_EXECUTABLE OR msgfdbpath,PATH_TO_MSGFDB_JAR_FILE |
If you plan to use InsPecT for peptide identification, you must compile InsPecT and provide the full path to the InsPecT executable. Alternatively, you may use MSGF-DB and must provide the full path to the MSGF-DB jar file. Depending on which database search tool you plan to use, the spectrum parameter will change. |
genomeseq,GENOME_FILE OR dbrootname,ROOT_NAME OR templateconstraintfile,FILE_NAME AND dbcombined,DB_FILE_NAME |
One of three types of database must be specified: a genome sequence, the root name of a set of sequence files for further construction, or a fully constructed template database file. If a fully constructed template database is provided, the constraint file is also needed. For more details about these database types, see Template Databases. |
contaminants,COMMON_CONTAMINANTS_FILE |
A database file (FASTA format) of common contaminants. |
fixedmod,AA,MASS |
A modification which occurs on all of the specified amino acids (e.g. fixedmod,C,57) |
msgfdir,DIR |
The directory containing the MSGF.jar file for rescoring using MSGF. MSGF is only used when running InsPecT. It should not be used with MSGF-DB. |
tolerance_pm,NUM |
The parent mass tolerance in Daltons of the mass spectra (Default is 3.0 Da). |
tolerance_peak,NUM |
The fragment ion tolerance in Daltons of the mass spectra (Default is 0.5 Da). |
digest,STRING |
Specifies the protease used for digestion. Accepted case-insensitive values are trypsin, chymotrypsin, other. If no digest is specified, then it is guessed from the spectrum file name. |
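As an illustration of the parameters in this table, the sketch below generates configuration entries for an MSGF-DB run with a genome database. It rests on several assumptions that the text above does not confirm: that each parameter goes on its own line, that the order of lines does not matter, and that the spectra parameter may be repeated once per file. The helper names and all paths are hypothetical. The `guess_digest` function only mimics the documented file-name rule ('tryp' in the name means a tryptic digest); matching case-insensitively and returning `other` for everything else are assumptions about how GenoMS behaves.

```python
def guess_digest(spectrum_file_name):
    """Mimic the documented rule: 'tryp' in the file name means tryptic.

    Assumption: the match is case-insensitive and anything else is 'other'.
    """
    return "trypsin" if "tryp" in spectrum_file_name.lower() else "other"

def config_lines(spectra, prm_files, msgfdb_jar, genome_fasta, *,
                 fixed_mods=(), tolerance_pm=3.0, tolerance_peak=0.5,
                 digest=None):
    """Build comma-separated config entries (assumed: one entry per line)."""
    lines = [f"spectra,{path}" for path in spectra]      # MSGF-DB style entry
    lines += [f"prms,{path}" for path in prm_files]      # one per PRM file
    lines.append(f"msgfdbpath,{msgfdb_jar}")
    lines.append(f"genomeseq,{genome_fasta}")
    lines += [f"fixedmod,{aa},{mass}" for aa, mass in fixed_mods]
    lines.append(f"tolerance_pm,{tolerance_pm}")         # parent mass, Da
    lines.append(f"tolerance_peak,{tolerance_peak}")     # fragment ions, Da
    if digest is not None:
        lines.append(f"digest,{digest}")
    return lines

# Hypothetical paths, for illustration only.
lines = config_lines(["/data/sample_tryp_01.mgf"],
                     ["/data/sample_tryp_01.prm"],
                     "/tools/MSGFDB.jar", "/db/genome.fasta",
                     fixed_mods=[("C", 57)])
print("\n".join(lines))
print(guess_digest("/data/sample_tryp_01.mgf"))
```

Leaving `digest` unset omits the entry, in which case GenoMS guesses the digest from the spectrum file name as described above.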
java -jar CreateConfigFile.jar
version 2010.12.07
Creates the config file to be input to GenoMS given the set of spectra.
  -s [FILE or DIR] Spectrum file(s) to be used in the experiment
  -x [DIR] Path to InsPecT executable
  -o [FILE] Config file to write

Must include one of the following template database options (See Template Databases):
  -c [FILE] Combined database name (either FASTA or trie) AND -t [FILE] Template constraint file
  -n [DBName] Prefix name of the DB to be created
  -g [FILE] FASTA file containing genomic data to create a 6-frame DB

Optional parameters:
  (-d [FILE] Contaminants file)
  (-m [STRING] Fixed modification of the form C+99, or M-16. Can be used multiple times to specify multiple fixed modifications. These are modifications which occur on all instances of the amino acid, such as a cysteine protecting group.)
  (-p [FILE or DIR] File or Directory containing the PRM files that have already been generated)
  (-i [FILE or DIR] Directory containing the InsPecT input files for the spectra)
  (-f [DIR] Model directory for PepNovo (Note: You must specify either the PepNovo directory and executable, or a directory of PRM spectra with -p))
  (-r [FILE] Executable for PepNovo)
  (-q [DIR] Directory containing MSGF jar file)
  (-a [NUM] Parent mass tolerance (Da) (Default 3.0 Da))
  (-e [NUM] Fragment mass tolerance (Da) (Default 0.5 Da))
  (-h [trypsin/chymotrypsin/other] Enzyme used for digestion (Default: infer tryptic/non-tryptic from spectrum file names))
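The -m option takes fixed modifications in the form C+99 or M-16, while the configuration file uses entries of the form fixedmod,AA,MASS. The sketch below shows one way such a string could be split into an amino acid and a signed mass. It is an illustration only: the helper names are hypothetical, and the exact conversion that CreateConfigFile performs (including how a negative mass is written) is an assumption.

```python
import re

# One uppercase amino acid letter, then a signed integer or decimal mass.
_FIXED_MOD = re.compile(r"^([A-Z])([+-]\d+(?:\.\d+)?)$")

def parse_fixed_mod(spec):
    """Split a string such as 'C+99' into ('C', 99.0). Hypothetical helper."""
    match = _FIXED_MOD.match(spec.strip())
    if match is None:
        raise ValueError(f"not of the form AA+MASS or AA-MASS: {spec!r}")
    return match.group(1), float(match.group(2))

def fixedmod_entry(spec):
    """Format a config entry; the sign handling here is an assumption."""
    amino_acid, mass = parse_fixed_mod(spec)
    return f"fixedmod,{amino_acid},{mass:g}"

for spec in ("C+99", "M-16"):
    print(fixedmod_entry(spec))
```

Rejecting malformed strings up front avoids writing a configuration entry that GenoMS might not be able to read.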
-r [DIR] |
The resource directory to look for things like DBs |
-x |
Force a rerun of the InsPecT searches. The default behavior is to re-run InsPecT only if the results files are missing. |
-p |
Use a missing peak penalty when scoring HMM Match states. The penalty is equal to the log likelihood of the average PRM score. |
-s |
Require all2all similarity for Multiple Spectrum Alignment. This will reduce the number of false positive extensions, but may significantly reduce the final predicted protein length. |
-a |
Continue to add eligible match states, even after all spectra are added to the HMM. This is a useful option if your spectral dataset is fairly small and you expect little overlap of peptides. |
-w [NUM] |
Peptide-spectrum match false discovery rate (FDR) cutoff for InsPecT results (or rescored MS-GF results). The default cutoff is 0.01. |
-f |
Search database allowing a single amino acid mutation per peptide |