
PeptideAtlas Tiered Human Integrated Search Proteome

To provide human proteomics MS/MS search databases that are well defined, comprehensive, and frequently updated, we have developed an automated system that integrates all of the major sources of human protein sequences into a set of search databases. These databases are tiered into several levels of complexity, from which researchers may choose depending on the goal of the experiment and the data processing resources available.

Description of the Databases

On the first of every month, all protein lists are pulled down from their original sources. If any of them have changed, they are integrated according to the description in Deutsch et al. (submitted) and released here. If none of the source databases have changed, there is no new release. Briefly, the individual levels are as follows:

Level 1: Includes only the core ~20,000 primary isoforms from Swiss-Prot, plus Universal Protein Contaminants.
Level 2: Level 1 plus all ~22,000 "varsplic" alternative splice isoforms from Swiss-Prot and immunoglobulin variable region sequences from Swiss-Prot and IMGT.
Level 3: Level 2 plus GENCODE, UniProt "UP000005640", and additional non-redundant sequences from other small sources, including microbes, external contributions, and additional RefSeq XP sequences.
Level 4: A "kitchen sink" database that includes Level 3 plus all other distinct sequences from UniProtKB/TrEMBL and RefSeq XP that are not already present in lower levels.
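The tiering described above can be illustrated with a minimal sketch. It assumes that "distinct" means exact sequence identity and that each level simply adds the sequences not already present at a lower level; the actual integration procedure is the one described in the THISP article, and the function name, identifiers, and toy sequences below are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Entry = Tuple[str, str]  # (identifier, amino-acid sequence)


def build_tiers(sources_by_level: List[Iterable[Entry]]) -> List[Dict[str, str]]:
    """Return one {sequence: identifier} mapping per level.

    Each level starts from the previous one and adds only sequences
    that have not been seen at a lower level (exact-match assumption).
    """
    tiers: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for entries in sources_by_level:
        for identifier, sequence in entries:
            # Keep the first identifier seen for each distinct sequence.
            current.setdefault(sequence, identifier)
        tiers.append(dict(current))  # snapshot of this level
    return tiers


# Toy data, not real protein sequences.
level1_sources = [("P1", "MKTAY"), ("CONT1", "GGSLK")]
level2_sources = [("P1-2", "MKTAYQ"), ("IG1", "EVQLV")]
level3_sources = [("ENSP1", "MKTAY"), ("MIC1", "MSTNP")]  # first is a duplicate

tiers = build_tiers([level1_sources, level2_sources, level3_sources])
print([len(t) for t in tiers])  # [2, 4, 5]
```

Because every level is a superset of the one below it, a peptide identified against a lower level remains searchable at every higher level.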

Listing of All Source Databases

Database               Date        # Entries  Level 1  Level 2  Level 3  Level 4
Swiss-Prot canonical   2024-11-01     20,320   20,320   20,320   20,320   20,320
Swiss-Prot + varsplic  2024-11-01     40,753   20,320   40,750   40,750   40,750
GENCODE                2024-11-01    112,218        -        -   61,505   61,505
UP000005640            2024-11-01    102,477   20,320   40,750   43,539   43,539
UniProtKB + TrEMBL     2024-11-01    227,049   20,320   40,750   43,539  141,677
NCBI RefSeq NP         2024-11-01     67,684        -        -   13,485   12,766
NCBI RefSeq XP         2024-11-01    131,349        -        -        -   50,941
IMGT                   2024-11-01        711        -      711      711      711
Microb                 2024-11-01      1,608        -        -    1,608    1,608
Contrib                2024-11-01    702,058        -        -  702,058  702,058
Contaminant            2024-04-19        499      299      299      299      299
# Entries                                      20,619   41,760  823,219  971,579
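As a quick consistency check, the Level 1 and Level 2 totals in the last row of the table above can be reproduced from the per-source counts. This sketch assumes that the "Swiss-Prot + varsplic" count at Level 2 already includes the canonical entries and that the listed sources do not overlap at these levels. Only Levels 1 and 2 are checked here.

```python
# Counts copied from the table above (2024-11-01 release).
level1_parts = {"Swiss-Prot canonical": 20_320, "Contaminant": 299}
level2_parts = {"Swiss-Prot + varsplic": 40_750, "IMGT": 711, "Contaminant": 299}

level1_total = sum(level1_parts.values())
level2_total = sum(level2_parts.values())

print(level1_total)  # 20619
print(level2_total)  # 41760
```

Both sums match the "# Entries" totals listed for Level 1 (20,619) and Level 2 (41,760).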

Download THISP Databases

Below are the monthly releases of the THISP databases available for download. The "Base" is the set of Level 1-4 FASTA files (target and target-decoy). The "Components" is the set of all individual source components (from neXtProt, RefSeq, IMGT, cRAP, etc.) used to make the FASTA files in "Base", as described in the THISP article.
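After downloading a "Base" FASTA file, it can be useful to confirm how many target and decoy entries it contains. The following is a minimal sketch only: the `DECOY_` header prefix and the toy headers are assumptions for illustration, not a documented THISP convention, so check the headers of the actual file and adjust the prefix accordingly.

```python
import io
from typing import Iterable, Tuple


def count_fasta_entries(lines: Iterable[str], decoy_prefix: str = "DECOY_") -> Tuple[int, int]:
    """Count (target, decoy) entries in FASTA-formatted lines.

    decoy_prefix is an assumed convention; inspect the downloaded
    file's headers and change it as needed.
    """
    targets = decoys = 0
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        if line[1:].startswith(decoy_prefix):
            decoys += 1
        else:
            targets += 1
    return targets, decoys


# Toy FASTA text standing in for a downloaded file.
example = io.StringIO(
    ">sp|P00001|TEST1_HUMAN example protein\n"
    "MKTAYIAKQR\n"
    ">DECOY_sp|P00001|TEST1_HUMAN example protein\n"
    "RQKAIYATKM\n"
)
print(count_fasta_entries(example))  # (1, 1)
```

For a real file, pass an open file handle instead of the `io.StringIO` object; the function reads line by line, so it does not load the whole database into memory.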

2024-11-01  2024-10-01  2024-09-01  2024-08-01  2024-07-01  2024-06-01  2024-05-01  2024-04-01
2024-02-01  2024-01-01  2023-12-01  2023-11-01  2023-10-01  2023-09-01  2023-08-01  2023-07-01
2023-06-01  2023-05-01  2023-04-01  2023-03-01  2023-02-01  2023-01-01  2022-12-01  2022-11-01
2022-10-01  2022-09-01  2022-08-01  2022-07-01  2022-06-01  2022-05-01  2022-04-01  2022-03-01
2022-02-01  2022-01-01  2021-12-01  2021-10-01  2021-09-01  2021-08-01  2021-07-01  2021-06-01
2021-05-01  2021-04-01  2021-03-01  2021-02-01  2021-01-01  2020-12-01  2020-11-01  2020-10-01
2020-09-01  2020-08-01  2020-07-01  2020-06-01  2020-05-01  2020-04-01  2020-03-01  2020-02-01
2020-01-01  2019-12-01  2019-10-01  2019-09-01  2019-08-01  2019-07-01  2019-06-01  2019-05-01
2019-04-01  2019-03-01  2019-02-01  2019-01-01  2018-12-01  2018-11-01  2018-10-01  2018-09-01
2018-08-01  2018-07-01  2018-06-01  2018-05-01  2018-04-01  2018-03-01  2018-02-01  2018-01-01
2017-12-01  2017-11-01  2017-10-01  2017-09-01  2017-08-01  2017-07-01  2017-06-01  2017-05-01
2017-04-01  2017-03-01  2017-02-01  2017-01-01  2016-12-01  2016-11-01  2016-10-01  2016-09-01
2016-08-01  2016-07-01  2016-06-01  2016-05-01  2016-04-06  2016-03-01  2016-02-01  2016-01-01
2015-12-01  2015-11-01  2015-10-01

Cite

If you use this database, please cite us:

General purpose citation: Deutsch et al., "Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics", J Proteome Res. Author manuscript; available in PMC 2016 Nov 4.