A bioinformatic toolbox for postprocessing of MASCOT results and its application to the proteome of Halobacterium salinarum
Carolina Garcia-Rizo1, Cristian Klein2, Pfeiffer, Siedler, Oesterhelt
1rizo@biochem.mpg.de, Max-Planck Institute Biochemistry; 2klein@biochem.mpg.de, Max-Planck Institute Biochemistry
Proteome analysis validates that a predicted reading frame corresponds to a real protein. Such validation is required for Halobacterium salinarum, where severe overprediction of open reading frames (ORFs) occurs as a result of a scarcity of stop codons. As a consequence, in 87% of the genome more than one reading frame is open, giving rise to more than one candidate ORF. Identification by peptide fingerprint analysis is based on matching experimental and theoretical peptide masses. The reliability of an identification increases with the number of matching peptides.
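The matching step behind peptide fingerprint identification can be illustrated with a minimal Python sketch. It is not the MASCOT scoring procedure. It assumes a simplified tryptic digest (cleavage after K or R unless followed by P), singly protonated monoisotopic masses, and a relative mass tolerance in ppm. The function names and the default tolerance are hypothetical choices made for illustration.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum
PROTON = 1.00728   # mass of the charge-carrying proton


def tryptic_peptides(seq):
    """Cleave after K or R, except before P (assumed, simplified rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def mh_mass(peptide):
    """Theoretical [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def match_peaks(seq, observed, tol_ppm=100.0):
    """Return (observed mass, peptide) pairs that agree within tol_ppm."""
    theoretical = [(p, mh_mass(p)) for p in tryptic_peptides(seq)]
    matched = []
    for obs in observed:
        for pep, mass in theoretical:
            if abs(obs - mass) / mass * 1e6 <= tol_ppm:
                matched.append((obs, pep))
                break  # one assignment per observed peak
    return matched
```

The length of the list returned by `match_peaks` is the number of matching peptides for one candidate reading frame, which is the quantity that the post-processing described in this abstract tries to increase.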
A bioinformatic toolbox for post-processing the proteomic results was developed in order to detect additional matching peptides. It scans for (a) post-translational modifications, (b) incorrect start codon assignment and (c) missed cleavage sites.
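As a rough sketch of how such a scan might work (not the toolbox's actual implementation), the Python code below builds three families of hypothetical peptide masses for one protein and checks unassigned peaks against them. The modification list, the search window for alternative starts, the tolerance and all function names are assumptions. Only downstream methionines in the protein sequence are considered as alternative starts; upstream extensions would require the DNA sequence and are left out.

```python
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728
ACETYL = 42.01057  # N-terminal acetylation mass shift
# Illustrative modifications: name -> (mass shift, residues that can carry it)
MODS = {"oxidation": (15.99491, "M"), "phosphorylation": (79.96633, "STY")}


def cleave(seq):
    """Simplified tryptic digest: after K/R unless followed by P."""
    out, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            out.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        out.append(seq[start:])
    return out


def mh(peptide):
    """Theoretical [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def hypotheses(seq, max_missed=1, start_window=30):
    """List (label, peptide, [M+H]+) for the three kinds of hypothesis."""
    pieces = cleave(seq)
    hyp = []
    # (a) modified variants of the fully cleaved peptides
    for pep in pieces:
        for name, (delta, residues) in MODS.items():
            if any(aa in residues for aa in pep):
                hyp.append((name, pep, mh(pep) + delta))
    if pieces:
        hyp.append(("N-terminal acetylation", pieces[0], mh(pieces[0]) + ACETYL))
    # (b) alternative start at a downstream Met near the annotated start
    for pos in range(1, min(start_window, len(seq))):
        if seq[pos] == "M":
            first = cleave(seq[pos:])[0]
            label = f"alternative start at residue {pos + 1}"
            hyp.append((label, first, mh(first)))
            if len(first) > 1:  # same start with the initiator Met removed
                hyp.append((label + " (Met removed)", first[1:], mh(first[1:])))
    # (c) peptides spanning up to max_missed uncleaved sites
    for i in range(len(pieces)):
        for k in range(1, max_missed + 1):
            if i + k < len(pieces):
                pep = "".join(pieces[i:i + k + 1])
                hyp.append((f"missed cleavage x{k}", pep, mh(pep)))
    return hyp


def explain_peaks(seq, unmatched, tol_ppm=50.0):
    """Return (observed mass, label, peptide) for every agreeing hypothesis."""
    hyp = hypotheses(seq)
    return [(obs, label, pep)
            for obs in unmatched
            for label, pep, mass in hyp
            if abs(obs - mass) / mass * 1e6 <= tol_ppm]
```

Each peak explained this way counts as an additional matching peptide for the protein. A hit labelled as an alternative start also suggests that the annotated start codon of that reading frame may need revision.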
The same algorithm also makes it possible to focus on samples of interest within a huge data set. In addition to post-translational modifications, we focus on (a) low-score identifications and (b) N-terminal peptides. Peaks from these samples are submitted for PSD (post-source decay) analysis, which provides a sequence tag and thus validates the identification. PSD experiments can be started while the samples are still in the mass spectrometer.
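The selection of peaks for PSD can be pictured as a simple filter over the identification results. The Python sketch below is only an illustration: the record layout, the score threshold of 60 and the rule that a peptide starting at position 1 or 2 counts as N-terminal (allowing for removal of the initiator methionine) are all assumptions, as are the sample and protein names in the usage example.

```python
from dataclasses import dataclass, field


@dataclass
class PeptideMatch:
    mz: float               # observed peak
    start: int              # 1-based start position of the peptide in the protein
    modified: bool = False  # True if matched only with a modification


@dataclass
class Identification:
    sample_id: str
    protein: str
    score: float            # identification score (higher is better)
    matches: list = field(default_factory=list)


def psd_candidates(identifications, score_threshold=60.0):
    """Return (sample_id, mz, reasons) for peaks worth a PSD experiment."""
    picks = []
    for ident in identifications:
        low_score = ident.score < score_threshold
        for m in ident.matches:
            reasons = []
            if low_score:
                reasons.append("low score")
            if m.start <= 2:
                reasons.append("N-terminal peptide")
            if m.modified:
                reasons.append("modified peptide")
            if reasons:
                picks.append((ident.sample_id, m.mz, reasons))
    return picks


# Usage with made-up records (hypothetical sample and protein names)
results = [
    Identification("s1", "ORF_A", 35.0, [PeptideMatch(1200.5, 40)]),
    Identification("s2", "ORF_B", 120.0,
                   [PeptideMatch(900.4, 1), PeptideMatch(1500.7, 55)]),
]
for sample_id, mz, reasons in psd_candidates(results):
    print(sample_id, mz, ", ".join(reasons))
```

Because the filter needs only the identification results, it can run as soon as the fingerprint search finishes, which fits the aim of starting PSD while the samples are still in the instrument.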
This approach illustrates the close interconnection between experimental and bioinformatic procedures, with the objective of rapidly processing and enhancing proteomic data. Results are stored in HaloLex, a database of biological information for Halobacterium salinarum. The database makes it possible to view genomic, proteomic and biochemical data in an integrated way.