Searching for the exact peptides in database provided

**Describe the question or problem**
Hi there, I wish to conduct a search using MSGF+ where the algorithm only considers EXACT matches to the peptides provided in the fasta database (with a static and a dynamic modification).

For example, there are two peptide entries in the fasta file:

> \>peptide_1
> MDFYAMIHAFWLIAVLYRR
> \>peptide_2
> MDFYAMIHAFWLIAVLYR

My samples were digested with trypsin, so in my database there are only tryptic peptides (with some miscleavages that I have already included).

I am using the following settings, which are the only ones I can think of that are relevant:
```
#Enzyme ID
#  0 means No enzyme used
#  1 means Trypsin (Default); use this along with NTT=0 for a no-enzyme search of a tryptically digested sample
#  2: Chymotrypsin, 3: Lys-C, 4: Lys-N, 5: Glu-C, 6: Arg-C, 7: Asp-N, 8: alphaLP, 9: No Enzyme (for peptidomics)
EnzymeID=9
#Number of tolerable termini
#  The number of peptide termini that must have been cleaved by the enzyme (default 1)
#  For trypsin, 2 means fully tryptic only, 1 means partially tryptic, and 0 means no-enzyme search
NTT=2
```

MSGF+ would return this result:
`sample.mzML     controllerType=0 controllerNumber=1 scan=65059  65059   HCD     537.5329        2       1.9908882       4       DFYAM+15.995IHAFWLIAVLYR   peptide_1(pre=M,post=R);peptide_2(pre=M,post=-) 101     42      2.5559657E-9    0.006229213`

My problem with this result is 2-fold:
1. DFYAMIHAFWLIAVLYR is not a peptide in the database, and I do not see an option in the config to TURN OFF N-terminal M cleavage (while I appreciate that MSGF+ probably just tried both possibilities, I still wish to turn it off so that it does not interfere with my FDR calculations).
2. The "protein" column of the PSM lists the names of all entries in the database that contain the peptide. In my understanding of "no enzyme" digestion, only exact matches to the peptides given in the database should be made, and even if other entries also contain the peptide (and fit the trypsin digestion pattern), they should not be listed under "protein" because they are not the fasta entry where the match was made. This is not really a problem, but it is a nuisance when parsing which exact entry the peptide match came from.
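
To make the behaviour I am after concrete, here is a minimal Python sketch of the filtering I would otherwise have to do after the search. It is not an MSGF+ feature, and the function names (`read_fasta`, `strip_mods`, `exact_entries`) are hypothetical helpers written for illustration. It assumes modifications are written as signed numeric masses inside the peptide string, as in the result above.

```python
import re


def read_fasta(text):
    """Parse FASTA text into a dict of {entry_name: sequence}."""
    entries, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the entry name.
            name = line[1:].split()[0]
            entries[name] = ""
        elif name is not None:
            entries[name] += line
    return entries


def strip_mods(peptide):
    """Remove signed numeric mass annotations such as +15.995 (assumed format)."""
    return re.sub(r"[+-]\d+(?:\.\d+)?", "", peptide)


def exact_entries(peptide, entries):
    """Return the names of entries whose full sequence equals the unmodified peptide."""
    seq = strip_mods(peptide)
    return [name for name, entry_seq in entries.items() if entry_seq == seq]


fasta_text = """>peptide_1
MDFYAMIHAFWLIAVLYRR
>peptide_2
MDFYAMIHAFWLIAVLYR
"""
entries = read_fasta(fasta_text)

# The reported peptide lacks the leading M, so it matches no entry exactly.
reported = exact_entries("DFYAM+15.995IHAFWLIAVLYR", entries)

# With the leading M retained, only peptide_2 is an exact match.
wanted = exact_entries("MDFYAM+15.995IHAFWLIAVLYR", entries)
```

With this filter the PSM above would be discarded, and a match to the full-length sequence would be attributed to `peptide_2` only. I would prefer to get this directly from the search, since filtering afterwards does not change which candidates were scored.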

Do you have any suggestions on how I could modify the params file to get cleaner results? Thanks.

Searching for the exact peptides in database provided #120
