Run OMSSA
Description
This script runs OMSSA on a given set of spectral files and databases.
Input files
- at least 1 spectra file (.dta | .mgf | .mzdata | .mzdata.bz2 | .mzdata.gz | .mzdata.zip | .mzml | .mzml.bz2 | .mzml.gz | .mzml.zip | .mzxml | .mzxml.bz2 | .mzxml.gz | .mzxml.zip | .xml | .xml.bz2 | .xml.gz | .xml.zip)
- at least 1 database file (.fa | .faa | .fas | .fasta | .phr | .pin | .psq)
Output files
- OMSSA results (omssa-results.csv)
Context
Synopsis
This script runs OMSSA, the Open Mass Spectrometry Search Algorithm. OMSSA produces peptide-spectrum matches (PSMs) from a set of input MS/MS spectra and an amino acid sequence database.
References
- Geer LY, Markey SP, Kowalak JA, Wagner L, Xu M, Maynard DM, Yang X, Shi W, Bryant SH. 2004. Open mass spectrometry search algorithm. J. Proteome Res. 3:958-964.
Masses and tolerance
- Precursor ion m/z tolerance
-
Default: 2.0 Da
- Product ion m/z tolerance
-
Default: 0.8 Da
- Precursor ion search type
-
Choices: monoisotopic (default), average, N15, exact
- Product ion search type
-
Choices: monoisotopic (default), average, N15, exact
- Precursor charge dependency
- Charge dependency of precursor mass tolerance
Choices: none (default), linear
- Neutron mass add threshold
- Threshold above which the mass of a neutron is added in an exact mass search
Default: 1446.94 Da
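The interaction between the precursor tolerance and the charge dependency setting can be sketched as follows. This is a minimal illustration, not OMSSA's own code: the function names are hypothetical, and it assumes that "linear" multiplies the tolerance by the precursor charge.

```python
def precursor_tolerance(base_tol_da, charge, dependency="none"):
    """Return the effective precursor tolerance in Da.

    Assumption: "linear" scales the base tolerance by the precursor charge,
    while "none" leaves it unchanged.
    """
    if dependency == "none":
        return base_tol_da
    if dependency == "linear":
        return base_tol_da * charge
    raise ValueError(f"unknown charge dependency: {dependency}")


def within_tolerance(observed_mass, theoretical_mass, tol_da):
    """True if the two masses differ by at most tol_da."""
    return abs(observed_mass - theoretical_mass) <= tol_da


# Defaults from this page: 2.0 Da tolerance, no charge dependency.
tol = precursor_tolerance(2.0, charge=3)
print(within_tolerance(1501.5, 1500.0, tol))  # True: 1.5 Da <= 2.0 Da
```

Under the same assumption, a linear dependency would widen the window for the triply charged precursor to 6.0 Da.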
Preprocessing
- Low intensity cutoff
- Low intensity cutoff as a fraction of max peak
Default: 0.0
- Single charge window
-
Default: 20 Da
- Double charge window
-
Default: 14 Da
- Min peaks in single charge window
-
Default: 2
- Min peaks in double charge window
-
Default: 2
- Eliminate charge reduced precursors
- Eliminate charge reduced precursors in spectra
Choices: no (default), yes
- Min peak count (spectrum)
- The minimum number of m/z values a spectrum must have to be searched
Default: 4
- Min precursor spectrum match count
- Minimum number of precursors that match a spectrum
Default: 1
- Proline exception
- Ion series to which the "no product ions at proline" rule is applied
Choices: a, b, c, x, y, z
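The low intensity cutoff and the minimum peak count can be pictured with the following sketch. It assumes that peaks are (m/z, intensity) pairs, that peaks strictly below the threshold are dropped, and that the peak count is checked after filtering. The function names and sample peaks are hypothetical and not taken from OMSSA.

```python
def apply_low_intensity_cutoff(peaks, cutoff_fraction=0.0):
    """Drop peaks whose intensity is below cutoff_fraction * max intensity.

    peaks is a list of (mz, intensity) tuples.
    """
    if not peaks:
        return []
    threshold = cutoff_fraction * max(intensity for _, intensity in peaks)
    return [(mz, intensity) for mz, intensity in peaks if intensity >= threshold]


def is_searchable(peaks, min_peak_count=4):
    """A spectrum needs at least min_peak_count m/z values to be searched."""
    return len(peaks) >= min_peak_count


# Made-up spectrum for illustration only.
spectrum = [(200.1, 8.0), (350.2, 100.0), (410.3, 2.0), (512.4, 40.0), (620.5, 12.0)]
filtered = apply_low_intensity_cutoff(spectrum, cutoff_fraction=0.05)
print(len(filtered), is_searchable(filtered))
```

With the default cutoff of 0.0, no peak is removed.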
Charge handling
- Min precursor ion charge
- Minimum precursor ion charge to search when not +1
Default: 1
- Max precursor ion charge
- Maximum precursor ion charge to search when not +1
Default: 3
- Min charge for considering multiply charged products
- Minimum precursor charge to start considering multiply charged products
Default: 3
- Peaks fraction below precursor for +1
- Fraction of peaks below precursor used to determine if spectrum is charge +1
Default: 0.95
- Determine charge +1
- Should charge +1 be determined algorithmically?
Choices: no, yes (default)
- Precursor charge determination
-
Choices: believe the input file, use a range (default)
- Max product ion charge
- Maximum product ion charge to search
Default: 2
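One way to read the "Peaks fraction below precursor for +1" setting together with the min and max precursor charge is sketched below. It assumes that a spectrum is called +1 when at least the given fraction of its peaks lies below the precursor m/z, and that the full charge range is tried otherwise. The function name is hypothetical, and OMSSA's actual procedure may differ in detail.

```python
def candidate_charges(peaks, precursor_mz, plus_one_fraction=0.95,
                      min_charge=1, max_charge=3):
    """Return the precursor charges to try for one spectrum.

    peaks is a list of (mz, intensity) tuples.
    Assumption: the spectrum is treated as +1 when at least
    plus_one_fraction of its peaks lie below the precursor m/z;
    otherwise every charge from min_charge to max_charge is tried.
    """
    if not peaks:
        return []
    below = sum(1 for mz, _ in peaks if mz < precursor_mz)
    if below / len(peaks) >= plus_one_fraction:
        return [1]
    return list(range(min_charge, max_charge + 1))


# Made-up peak lists for illustration only.
all_below = [(150.0, 10.0), (300.0, 20.0), (450.0, 5.0), (600.0, 8.0)]
mixed = [(150.0, 10.0), (300.0, 20.0), (900.0, 5.0), (1200.0, 8.0)]
print(candidate_charges(all_below, precursor_mz=700.0))  # [1]
print(candidate_charges(mixed, precursor_mz=700.0))      # [1, 2, 3]
```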
Enzyme specification
- Missed cleavages allowed
- Number of missed cleavages allowed.
Default: 1
- Enzyme
-
Choices: Trypsin; Arg-C; CNBr; Chymotrypsin; Formic Acid; Lys-C; Lys-C, no P rule; Pepsin A; Trypsin+CNBr; Trypsin+Chymotrypsin; Trypsin, no P rule; Whole protein; Asp-N; Glu-C; Asp-N+Glu-C; Top-Down; Semi-Tryptic; No Enzyme; Chymotrypsin, no P rule; Asp-N (DE); Glu-C (DE)
- Min peptide length for NE and ST searches
- Minimum length of peptides for no-enzyme and semi-tryptic searches
Default: 4
- Max peptide length for NE and ST searches
- Maximum length of peptides for no-enzyme and semi-tryptic searches (0: none)
Default: 40
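To illustrate what the "Missed cleavages allowed" setting means for the default enzyme, the sketch below performs a simple in-silico tryptic digest (cleavage after K or R, except before P) and joins adjacent fragments up to the allowed number of missed cleavages. The function names and the example sequence are hypothetical. This is not how OMSSA itself enumerates peptides.

```python
def cleave(protein):
    """Split a protein after K or R unless the next residue is P."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])
    return fragments


def tryptic_peptides(protein, missed_cleavages=1):
    """List all peptides that contain at most missed_cleavages missed sites."""
    fragments = cleave(protein)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


# Made-up sequence for illustration only.
print(tryptic_peptides("MKWVTFRPLLKAG", missed_cleavages=1))
```

Raising the number of missed cleavages enlarges the peptide list, and with it the search space.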
Ions to search
- Ions to search
- Ions to search
Choices: a, b, c, x, y, z
- Search c-term ions
- Should C-terminus ions be searched?
Choices: yes (default), no
- Max ions to search per series
- Max number of ions in each series being searched (0: all)
Default: 100
- Search first forward b1 product ions
- Should first forward (b1) product ions be in search?
Choices: yes, no (default)
Sequence library
- Taxonomy IDs to search
- Comma-delimited list of taxonomy IDs to search (0 = all)
Default: 0
- Cleave n-term methionine
-
Choices: yes (default), no
- Fixed modifications
-
Choices: 2-amino-3-oxo-butanoic acid T, Asparagine HexNAc, Asparagine dHexHexNAc, CAMthiopropanoyl K, ICAT heavy, ICAT light, M cleavage from protein n-term, MMTS on C, NEM C, NIPCAM, O18 on peptide n-term, PNGasF in O18 water, SeMet, Serine HexNAc, Threonine HexNAc, acetylation of K, acetylation of protein n-term, amidation of peptide c-term, arginine to ornithine, beta elimination of S, beta elimination of T, beta methythiolation of D, beta-methylthiolation of D, carbamidomethyl C, carbamylation of K, carbamylation of n-term peptide, carboxyamidomethylation of D, carboxyamidomethylation of E, carboxyamidomethylation of H, carboxyamidomethylation of K, carboxykynurenin of W, carboxymethyl C, citrullination of R, deamidation of N and Q, dehydro of S and T, di-O18 on peptide n-term, di-iodination of Y, di-methylation of K, di-methylation of R, di-methylation of peptide n-term, farnesylation of C, fluorophenylalanine, formylation of K, formylation of peptide n-term, formylation of protein n-term, gamma-carboxylation of D, gamma-carboxylation of E, geranyl-geranyl, glucuronylation of protein n-term, glutathione disulfide, guanidination of K, heavy arginine-13C6, heavy arginine-13C6-15N4, heavy lysine – 13C6 15N2, heavy lysine – 2H4, heavy lysine-13C6, homoserine, homoserine lactone, hydroxylation of Y, hydroxylation of D, hydroxylation of F, hydroxylation of K, hydroxylation of N, hydroxylation of P, iTRAQ114 on K, iTRAQ114 on Y, iTRAQ114 on nterm, iTRAQ115 on K, iTRAQ115 on Y, iTRAQ115 on nterm, iTRAQ116 on K, iTRAQ116 on Y, iTRAQ116 on nterm, iTRAQ117 on K, iTRAQ117 on Y, iTRAQ117 on nterm, iodination of Y, lipoyl K, methyl C, methyl H, methyl N, methyl R, methyl ester of D, methyl ester of E, methyl ester of S, methyl ester of Y, methyl ester of peptide c-term, methylation of D, methylation of E, methylation of K, methylation of Q, methylation of peptide c-term, methylation of peptide n-term, methylation of protein n-term, myristoleylation of G, myristoyl-4H of G, 
myristoylation of K, myristoylation of peptide n-term G, n-acyl diglyceride cysteine, n-formyl met addition, oxidation of C to cysteic acid, oxidation of C to sulfinic acid, oxidation of F to dihydroxyphenylalanine, oxidation of H, oxidation of H to D, oxidation of H to N, oxidation of M, oxidation of P to pyroglutamic acid, oxidation of W, oxidation of W to formylkynurenin, oxidation of W to hydroxykynurenin, oxidation of W to kynurenin, oxidation of W to nitro, oxidation of Y to nitro, palmitoylation of C, palmitoylation of K, palmitoylation of S, palmitoylation of T, phosphopantetheine S, phosphorylation of S, phosphorylation of S with ETD loss, phosphorylation of S with prompt loss, phosphorylation of T, phosphorylation of T with ETD loss, phosphorylation of T with prompt loss, phosphorylation of Y, phosphorylation with neutral loss on C, phosphorylation with neutral loss on D, phosphorylation with neutral loss on H, phosphorylation with neutral loss on S, phosphorylation with neutral loss on T, phosphorylation with prompt loss on Y, propionamide C, propionyl heavy K, propionyl heavy peptide n-term, propionyl light K, propionyl light on peptide n-term, pyridyl K, pyridyl peptide n-term, pyro-cmC, pyro-glu from n-term E, pyro-glu from n-term Q, s-pyridylethylation of C, sulfation of Y, sulphone of M, sumoylation of K, thioacylation of K, thioacylation of peptide n-term, tri-deuteromethylation of D, tri-deuteromethylation of E, tri-deuteromethylation of peptide c-term, tri-iodination of Y, tri-methylation of K, tri-methylation of R, tri-methylation of protein n-term, ubiquitinylation residue
- Variable modifications
-
Choices: 2-amino-3-oxo-butanoic acid T, Asparagine HexNAc, Asparagine dHexHexNAc, CAMthiopropanoyl K, ICAT heavy, ICAT light, M cleavage from protein n-term, MMTS on C, NEM C, NIPCAM, O18 on peptide n-term, PNGasF in O18 water, SeMet, Serine HexNAc, Threonine HexNAc, acetylation of K, acetylation of protein n-term, amidation of peptide c-term, arginine to ornithine, beta elimination of S, beta elimination of T, beta methythiolation of D, beta-methylthiolation of D, carbamidomethyl C, carbamylation of K, carbamylation of n-term peptide, carboxyamidomethylation of D, carboxyamidomethylation of E, carboxyamidomethylation of H, carboxyamidomethylation of K, carboxykynurenin of W, carboxymethyl C, citrullination of R, deamidation of N and Q, dehydro of S and T, di-O18 on peptide n-term, di-iodination of Y, di-methylation of K, di-methylation of R, di-methylation of peptide n-term, farnesylation of C, fluorophenylalanine, formylation of K, formylation of peptide n-term, formylation of protein n-term, gamma-carboxylation of D, gamma-carboxylation of E, geranyl-geranyl, glucuronylation of protein n-term, glutathione disulfide, guanidination of K, heavy arginine-13C6, heavy arginine-13C6-15N4, heavy lysine – 13C6 15N2, heavy lysine – 2H4, heavy lysine-13C6, homoserine, homoserine lactone, hydroxylation of Y, hydroxylation of D, hydroxylation of F, hydroxylation of K, hydroxylation of N, hydroxylation of P, iTRAQ114 on K, iTRAQ114 on Y, iTRAQ114 on nterm, iTRAQ115 on K, iTRAQ115 on Y, iTRAQ115 on nterm, iTRAQ116 on K, iTRAQ116 on Y, iTRAQ116 on nterm, iTRAQ117 on K, iTRAQ117 on Y, iTRAQ117 on nterm, iodination of Y, lipoyl K, methyl C, methyl H, methyl N, methyl R, methyl ester of D, methyl ester of E, methyl ester of S, methyl ester of Y, methyl ester of peptide c-term, methylation of D, methylation of E, methylation of K, methylation of Q, methylation of peptide c-term, methylation of peptide n-term, methylation of protein n-term, myristoleylation of G, myristoyl-4H of G, 
myristoylation of K, myristoylation of peptide n-term G, n-acyl diglyceride cysteine, n-formyl met addition, oxidation of C to cysteic acid, oxidation of C to sulfinic acid, oxidation of F to dihydroxyphenylalanine, oxidation of H, oxidation of H to D, oxidation of H to N, oxidation of M, oxidation of P to pyroglutamic acid, oxidation of W, oxidation of W to formylkynurenin, oxidation of W to hydroxykynurenin, oxidation of W to kynurenin, oxidation of W to nitro, oxidation of Y to nitro, palmitoylation of C, palmitoylation of K, palmitoylation of S, palmitoylation of T, phosphopantetheine S, phosphorylation of S, phosphorylation of S with ETD loss, phosphorylation of S with prompt loss, phosphorylation of T, phosphorylation of T with ETD loss, phosphorylation of T with prompt loss, phosphorylation of Y, phosphorylation with neutral loss on C, phosphorylation with neutral loss on D, phosphorylation with neutral loss on H, phosphorylation with neutral loss on S, phosphorylation with neutral loss on T, phosphorylation with prompt loss on Y, propionamide C, propionyl heavy K, propionyl heavy peptide n-term, propionyl light K, propionyl light on peptide n-term, pyridyl K, pyridyl peptide n-term, pyro-cmC, pyro-glu from n-term E, pyro-glu from n-term Q, s-pyridylethylation of C, sulfation of Y, sulphone of M, sumoylation of K, thioacylation of K, thioacylation of peptide n-term, tri-deuteromethylation of D, tri-deuteromethylation of E, tri-deuteromethylation of peptide c-term, tri-iodination of Y, tri-methylation of K, tri-methylation of R, tri-methylation of protein n-term, ubiquitinylation residue
Tweaks
- Min matching peaks (database)
- The minimum number of m/z matches a sequence library peptide must have for the hit to the peptide to be recorded
Default: 2
- Min matching peaks (spectrum)
- Number of m/z values corresponding to the most intense peaks that must include one match to the theoretical peptide
Default: 6
- Use memory mapped sequence libraries
-
Choices: no (default), yes
- Maximum hit list size
-
Default: 30
- Mass ladders per database peptide
- The maximum number of mass ladders to generate per database peptide
Default: 128
- Use correlation correction
-
Choices: yes, no
- Auto mass tolerance adjustment
- Automatic mass tolerance adjustment fraction
Default: 1.0
- Consecutive ion probability
- Probability of consecutive ion (used in correlation correction)
Default: 0.5
Result handling
- E-value cutoff
- The maximum E-value allowed in the hit list
Default: 1.0
- Report spectra and search parameters
- Include spectra and search parameters in search results
Choices: no (default), yes
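The E-value cutoff is applied during the search, but the same kind of filter can be applied afterwards to omssa-results.csv with a stricter threshold. The sketch below assumes the results file has a header row with a column named "E-value". That column name, the function name, and the sample rows are assumptions made for illustration only.

```python
import csv
import io


def filter_hits(csv_text, max_evalue=1.0, evalue_column="E-value"):
    """Keep only rows whose E-value does not exceed max_evalue.

    Assumption: the CSV has a header row containing evalue_column.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if float(row[evalue_column]) <= max_evalue]


# Made-up rows standing in for the contents of omssa-results.csv.
sample = """Spectrum number,Peptide,E-value
1,PEPTIDEK,0.002
2,SAMPLER,3.5
3,ANOTHERK,0.8
"""

for hit in filter_hits(sample, max_evalue=0.01):
    print(hit["Spectrum number"], hit["Peptide"], hit["E-value"])
```

For a real file, the text would be read from disk first and the column name checked against the actual header.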
Tweaks
- Spectrum batch size
-
Default: 2000
- Search threads
- Number of search threads to use. 0 means autodetect.
Default: 0
Source code
run-omssa.rb, run-omssa.yaml (GitHub)