Count | Item |
---|---|
115 | ProteomeXchange datasets |
369 | experiments |
70,470,125 | PSMs |
596,839 | distinct peptides |
18,267 | canonical core proteins |
The Arabidopsis PeptideAtlas provides a compendium of results from uniformly reprocessed mass spectrometry proteomics datasets.
Publicly available Arabidopsis thaliana Columbia-0 datasets were downloaded from ProteomeXchange and reprocessed from the raw files using the Trans-Proteomic Pipeline suite of tools. A publication describing the build is available.
Chromosome | Entries | Canonical | Canonical % | Uncertain | Uncertain % | Redundant | Redundant % | Not Observed | Not Observed % |
---|---|---|---|---|---|---|---|---|---|
M | 35 | 27 | 77.1% | 5 | 14.3% | 0 | 0.0% | 3 | 8.6% |
C | 79 | 63 | 79.7% | 12 | 15.2% | 0 | 0.0% | 4 | 5.1% |
1 | 7,156 | 4,730 | 66.1% | 502 | 7.0% | 384 | 5.4% | 1,540 | 21.5% |
2 | 4,317 | 2,762 | 64.0% | 290 | 6.7% | 240 | 5.6% | 1,025 | 23.7% |
3 | 5,460 | 3,630 | 66.5% | 353 | 6.5% | 296 | 5.4% | 1,181 | 21.6% |
4 | 4,180 | 2,788 | 66.7% | 282 | 6.7% | 247 | 5.9% | 863 | 20.6% |
5 | 6,332 | 4,267 | 67.4% | 412 | 6.5% | 373 | 5.9% | 1,280 | 20.2% |
2023-10 Total | 27,559 | 18,267 | 66.3% | 1,856 | 6.7% | 1,540 | 5.6% | 5,896 | 21.4% |
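
The percentage columns follow directly from the counts: each category count is divided by the number of entries in that row. The short Python sketch below recomputes the percentages for the "2023-10 Total" row and checks that the four categories add up to the core proteome size. The variable and function names are illustrative only and are not part of any PeptideAtlas tool.

```python
# Counts copied from the "2023-10 Total" row of the table above.
total_entries = 27_559
counts = {
    "Canonical": 18_267,
    "Uncertain": 1_856,
    "Redundant": 1_540,
    "Not Observed": 5_896,
}


def percent_of_entries(count: int, entries: int) -> float:
    """Return count as a percentage of entries, rounded to one decimal place."""
    return round(100.0 * count / entries, 1)


# The four categories partition the core proteome, so they must sum to the total.
assert sum(counts.values()) == total_entries

percentages = {name: percent_of_entries(n, total_entries) for name, n in counts.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The same calculation applies to each chromosome row by substituting that row's counts and entry total.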
Below are individual Arabidopsis thaliana PeptideAtlas builds available for download in various flat file formats. Note that not all files contain all information from the build. A build subtitled "PSM FDR=0.002" denotes that a PSM FDR threshold of 0.002 (0.2%) is applied to every sample in the build.
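
To illustrate what a per-sample PSM FDR threshold means in practice, here is a minimal Python sketch. It assumes each PSM carries a probability of being correct and estimates the FDR of an accepted set as the mean of (1 - probability), a common model-based estimate. This page does not state how the build computes its FDR, so the estimator, the function name `accepted_psm_count`, and the sample names are all assumptions made for illustration.

```python
def accepted_psm_count(probabilities, max_fdr=0.002):
    """Return how many top-ranked PSMs can be accepted at the given FDR.

    Assumption: FDR of the top-n PSMs is estimated as the mean of
    (1 - probability) over those n PSMs.
    """
    ranked = sorted(probabilities, reverse=True)
    expected_false = 0.0
    accepted = 0
    for n, p in enumerate(ranked, start=1):
        expected_false += 1.0 - p
        if expected_false / n <= max_fdr:
            accepted = n
    return accepted


# Hypothetical samples; the threshold is applied to each one independently.
samples = {
    "sample_A": [1.0, 0.999, 0.999, 0.998, 0.5],
    "sample_B": [0.9, 0.8],
}
accepted_per_sample = {name: accepted_psm_count(ps) for name, ps in samples.items()}
print(accepted_per_sample)
```

Because the threshold is evaluated per sample, a sample with weaker identifications contributes fewer PSMs rather than loosening the cutoff for the whole build.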
Complete description of each of the available download formats
Filename | Size | # Sequences | Description |
---|---|---|---|
revised_mito_plastid_edited.fasta | 48KB | 114 | Protein ids and sequences after application of editing for the 79 plastid-encoded and 35 mitochondrial-encoded proteins and pseudogenes; minor frequency edits are not applied but all high frequency edits are included |
revised_mito_plastid_edited.peff | 52KB | 114 | Protein ids and their amino acid sequences with all possible variants supplied in PEFF format. PEFF allows for encoding the variants in a compact way in the file. Comet and some other search engines support PEFF. |
revised_mito_plastid_pre-edit.fasta | 48KB | 114 | Protein ids and unedited sequences (with the exception of essential edits for start and stop codons that need to be applied to generate a protein) for the 79 plastid-encoded and 35 mitochondrial-encoded proteins and pseudogenes |
revised_mito_plastid_all-editing-permutations.fasta | 1.2MB | 3,818 | Protein ids and the >10,000 sequence variants (see Materials and Methods) to allow for a complete and exhaustive MS/MS database search of all possible edits and to allow for partial editing (similar to what we did in this study) |
Araport11_genes.201606.pep.2.fasta | 25MB | 48,359 | Araport11 2016-06 [ link ] |
TAIR10_label_pep_20101214.2.fasta | 20MB | 35,386 | TAIR10 20101214 [ link ] |
Refseq_GCF_000001735.4.protein.2.faa | 24MB | 48,265 | Refseq 000001735.4 [ link ] |
uniprot-rename-proteome_UP000006548.2.fasta | 21MB | 39,342 | Uniprot UP000006548 [ link ] |
araport11_pseudogene_3frame_newid.2.fasta | 1.1MB | 3,720 | Araport11 from Qi Sun, Cornell University, select transcripts |
crap_GFP.fasta | 4KB | 3 | Our custom GFP contaminants Klaas Van Wijk 09-2020 |
crap_CONTAM.2.fasta | 44KB | 116 | Our PeptideAtlas custom contaminants |
CONTRIB_LW_peptides.2.fasta | 1.3MB | 16,809 | Contributed LW peptides: 16,809 very short peptides [ link ] |
CONTRIB_SIPs_peptides.2.fasta | 44KB | 607 | Contributed SIPs peptides: 607 very short peptides [ link ] |
CONTRIB_sORFs_peptides.2.fasta | 568KB | 7,901 | Contributed sORFs peptides: 7,901 very short peptides [ link ] |
Arabidopsis_RNA_Edits.fasta | 20KB | 50 | RNA Edits from Joshua Heazlewood 02-2021 |
Mito_ORFS_new.fasta | 4KB | 3 | Mito ORFS from Philippe Giege |
refseq_var_sloan_30.fasta | 2.6MB | 8,672 | Sloan edits all possible permutations |
refseq_var_IS_30.fasta | 1.3MB | 3,099 | Permutations of RNA edits from Ian Small UWA |
CONTRIB_Iowa_peptides.2.fasta | 1.6MB | 7,481 | Contributed peptides from Iowa State University, Eve Wurtele |
Araport11_CORE.fasta | 14MB | 27,559 | Core proteome |
Arabidopsis_PeptideAtlas_search.fasta | 57MB | 176,064 | Full database used in the searches |
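
After downloading a file, the "# Sequences" column can be used as a quick integrity check. The Python sketch below counts FASTA records by counting header lines that start with `>`. The function name `count_fasta_sequences` and the demo file are hypothetical; to check a real download, pass its local path and compare the result with the table (for example, 27,559 for Araport11_CORE.fasta).

```python
import os
import tempfile


def count_fasta_sequences(path):
    """Count records in a FASTA file by counting lines that begin with '>'."""
    count = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


# Demo on a tiny made-up FASTA file (not real PeptideAtlas content).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.fasta")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write(">protein_1 example\nMKTAYIAK\nQRQISFVK\n>protein_2 example\nMSLLTEVE\n")
    demo_count = count_fasta_sequences(demo_path)
    print(demo_count)
```

Reading line by line keeps memory use low, which matters for the larger files such as the 57MB search database.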
We gratefully acknowledge the support for the Arabidopsis PeptideAtlas from NSF grant 1922871 “TRTech-PGR: A PeptideAtlas for Arabidopsis thaliana and other plant species; harnessing world-wide proteomics data and mining for biological features”.