PeptideAtlas builds are available for download in three different file formats, each containing some information not found in the others.
The consensus spectral libraries are exported from SpectraST in sptxt format.
They are essentially compatible with the more widely known MSP format. If your spectral library software does not import sptxt files directly, rename the files with an .msp extension; your software will likely be able to import them.
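The renaming step can be scripted when a build ships many library files. The following is a minimal sketch, not part of PeptideAtlas or SpectraST: the function name `copy_sptxt_as_msp` is hypothetical, and it copies rather than renames so that the downloaded files stay untouched.

```python
import shutil
from pathlib import Path


def copy_sptxt_as_msp(directory):
    """Copy every *.sptxt file in `directory` to a sibling file ending in .msp.

    Copying (rather than renaming) keeps the original download intact.
    Returns the list of newly created .msp paths.
    """
    created = []
    for src in sorted(Path(directory).glob("*.sptxt")):
        dst = src.with_suffix(".msp")  # same name, different extension
        shutil.copyfile(src, dst)
        created.append(dst)
    return created
```

Whether the resulting .msp file imports cleanly still depends on the target software, since the two formats are only essentially compatible.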
A file of sequences from the PeptideAtlas build where the first line is ">" followed by the PeptideAtlas accession and the second line is the peptide amino-acid sequence.
>PAp0000001
The FASTA file that we map the peptide sequences to.
It includes the database (including decoys) that we use for the searches, and may include the Ensembl database if available.
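Both sequence files described above use the ">" header convention, so one small reader can handle either. This is a sketch under stated assumptions: the function name `read_fasta` is hypothetical, and it also tolerates sequences wrapped over several lines, which the protein file may use.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-style lines.

    The header is everything after ">" on the header line. Sequence lines
    are concatenated until the next header, so wrapped sequences also work.
    """
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Usage with made-up sequences (only the accession format comes from the text):
example = [">PAp0000001\n", "PEPTIDEK\n"]
for accession, sequence in read_fasta(example):
    print(accession, len(sequence))
```

To read a downloaded file, pass an open file handle in place of the `example` list.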
A file containing the peptide accession and the position of the peptide relative to the protein start (CDS coordinates).
Column # | Field |
---|---|
1. | PeptideAtlas accession |
2. | Sequence length |
3. | Protein accession |
4. | Length of sequence match |
5. | % Identity (=% match of sequence) |
6. | Start of sequence in protein CDS |
7. | End of sequence in protein CDS |
8. | Difference between sequence and matched sequence |
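The eight columns above map naturally onto named fields. The sketch below assumes the file is tab-delimited with no header row, which the description does not state; the field names and the function name `read_cds_coordinates` are hypothetical labels chosen for illustration.

```python
import csv

# Hypothetical field names, in the column order given in the table above.
COLUMNS = [
    "peptide_accession",
    "sequence_length",
    "protein_accession",
    "match_length",
    "percent_identity",
    "cds_start",
    "cds_end",
    "difference",
]


def read_cds_coordinates(lines, delimiter="\t"):
    """Yield one dict per row, keyed by COLUMNS.

    Assumes a delimited file without a header row (tab by default).
    Values are left as strings so that no numeric format is presumed.
    """
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        if len(row) != len(COLUMNS):
            raise ValueError(
                f"expected {len(COLUMNS)} columns, found {len(row)}"
            )
        yield dict(zip(COLUMNS, row))
```

If the downloaded file turns out to use a different delimiter, pass it through the `delimiter` argument.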
A file containing the peptide accession, the peptide's position within a protein relative to protein start (CDS coordinates), and its chromosomal coordinates.
Column # | Field |
---|---|
1. | PeptideAtlas accession |
2. | Sequence length |
3. | Protein accession |
4. | Length of sequence match |
5. | % Identity (=% match of sequence) |
6. | Start of sequence in protein CDS |
7. | End of sequence in protein CDS |
8. | Residue prior |
9. | Residue after |
10. | Difference between sequence and matched sequence |
11. | Chromosome |
12. | Chromosome start location |
13. | Chromosome end location |
14. | Transcript accession |
15. | Gene accession |
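One use of the chromosomal columns is to group peptides by chromosome. The sketch below relies on the column numbers in the table above and assumes, without confirmation from the description, that the file is tab-delimited, has no header row, and stores chromosome start and end as integers. The function name `peptides_by_chromosome` is hypothetical.

```python
from collections import defaultdict

# 1-based column numbers taken from the table above.
COL_PEPTIDE = 1
COL_CHROMOSOME = 11
COL_CHROM_START = 12
COL_CHROM_END = 13
EXPECTED_COLUMNS = 15


def peptides_by_chromosome(lines, delimiter="\t"):
    """Group (accession, start, end) tuples by chromosome name.

    Assumes 15 delimited columns per row and integer chromosome coordinates.
    """
    grouped = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        fields = line.split(delimiter)
        if len(fields) != EXPECTED_COLUMNS:
            raise ValueError(
                f"expected {EXPECTED_COLUMNS} columns, found {len(fields)}"
            )
        grouped[fields[COL_CHROMOSOME - 1]].append(
            (
                fields[COL_PEPTIDE - 1],
                int(fields[COL_CHROM_START - 1]),
                int(fields[COL_CHROM_END - 1]),
            )
        )
    return dict(grouped)
```

Keeping the column numbers as named constants makes it easy to adjust the sketch if a particular build lays out the file differently.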
This file contains the contents of the various tables in the PeptideAtlas schema for a specified build, exported in an XML format.
This format is suitable for loading into a preexisting PeptideAtlas schema using the SBEAMS DataImport.pl script. It can be loaded into MySQL or MS SQL Server databases, and could possibly be used with others.
PeptideAtlas schema and data for a given build, exported using the mysqldump utility.
The data can be loaded into an empty MySQL instance with the mysql command-line utility as follows:
mysql -u username -D database < PA_export.mysql
This greatly accelerates loading the data into a MySQL instance
relative to the XML format above, but would probably require some changes
to work with a database other than MySQL (SQL dialect issues).