
PeptideAtlas Tiered Human Integrated Search Proteome

To provide human proteomics MS/MS search databases that are well defined, comprehensive, and frequently updated, we have developed an automated system that integrates all of the major sources of human protein sequences into a set of search databases. These databases are tiered into several levels of complexity, from which researchers may choose depending on the goal of the experiment and the data processing resources available.

Description of the Databases

On the first of every month, all protein lists are pulled down from their original sources. If any of them have changed, they are integrated according to the description in Deutsch et al. (submitted) and released here. If none of the source databases have changed, there is no new release. Briefly, the individual levels are as follows:

Level 1: Includes only the core ~20,000 primary isoforms from neXtProt (nP) and the Universal Protein Contaminants.
Level 2: Level 1 plus all ~22,000 "varsplic" alternative splice isoforms from neXtProt (nP) and immunoglobulin variable region sequences from Swiss-Prot and IMGT.
Level 3: Level 2 plus UniProt "UP000005640" and additional non-redundant sequences from other small sources, including microbes, external contributions, and additional RefSeq XP sequences.
Level 4: A "kitchen sink" database that includes Level 3 plus all other distinct sequences from UniProtKB/TrEMBL and RefSeq XP that are not already present in lower levels.
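The cumulative structure of the levels can be illustrated with a minimal Python sketch. It assumes that "distinct" means an exact match on the amino-acid sequence, which this page does not state explicitly; the function name `build_tiers` and all identifiers and sequences below are hypothetical toy data, not the actual THISP pipeline.

```python
from typing import Dict, Iterable, List, Tuple

Entry = Tuple[str, str]  # (identifier, amino-acid sequence)


def build_tiers(sources_per_level: List[Iterable[Entry]]) -> List[Dict[str, str]]:
    """Return one {sequence: identifier} mapping per level.

    Each level starts as a copy of the previous one and then adds only
    sequences that are not already present, so an entry from a lower
    level is never replaced by a later duplicate.
    """
    tiers: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for additions in sources_per_level:
        current = dict(current)  # copy so earlier levels stay unchanged
        for identifier, sequence in additions:
            # Assumption: redundancy is judged by exact sequence identity.
            current.setdefault(sequence, identifier)
        tiers.append(current)
    return tiers


# Toy data only; the identifiers and sequences are made up.
level_1 = [("nP_primary_1", "MKTAYIAK"), ("contaminant_1", "GSHMLEDP")]
level_2 = [("nP_isoform_2", "MKTAYIAR"), ("imgt_1", "QVQLVQSG")]
# up_1 repeats a Level 1 sequence and is therefore not added again.
level_3 = [("up_1", "MKTAYIAK"), ("microbe_1", "MSDNELTT")]

tiers = build_tiers([level_1, level_2, level_3])
sizes = [len(tier) for tier in tiers]
print(sizes)  # [2, 4, 5]
```

Because each level is a superset of the one below it, a search against a higher level can only add candidate sequences, never remove them.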

Listing of All Source Databases

Database               Date        # Entries  Level 1  Level 2  Level 3  Level 4
neXtProt               2023-10-02     42,382   20,389   42,382   42,382   42,382
Swiss-Prot canonical   2024-04-01     20,419        -       42       42       42
Swiss-Prot + varsplic  2024-04-01     42,499        -       42      129      129
UP000005640            2024-04-01    104,573        -       42   62,203   62,203
UniProtKB + TrEMBL     2024-04-01    226,143        -       42   62,203  183,773
NCBI RefSeq NP         2024-04-01     67,650        -        -   13,707   13,038
NCBI RefSeq XP         2024-04-01    131,640        -        -        -   51,127
IMGT                   2024-04-01        706        -      706      706      706
Microb                 2024-04-01      1,398        -        -    1,398    1,398
Contrib                2024-04-01    702,058        -        -  702,058  702,058
Contaminant            2023-03-21        499      299      299      299      299
# Entries              -                   -   20,688   43,429  822,767  994,795

Download THISP Databases

Below are the monthly releases of the THISP databases available for download. The "Base" is the set of Level 1-4 FASTA files (target and target-decoy). The "Components" is the set of all individual source components (from neXtProt, RefSeq, IMGT, cRAP, etc.) used to make the FASTA files in "Base", as described in the THISP article.
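After downloading a "Base" FASTA file, a quick sanity check is to count its target and decoy entries. The sketch below is a minimal, self-contained example. The decoy prefix `DECOY_` is an assumption (this page does not state which decoy naming convention the THISP files use), and the example records are made up.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_targets_and_decoys(handle: TextIO, decoy_prefix: str = "DECOY_") -> Tuple[int, int]:
    """Count entries whose header does or does not start with decoy_prefix.

    The default prefix is a hypothetical placeholder; check the headers
    of the downloaded file and adjust it.
    """
    targets = decoys = 0
    for header, _sequence in read_fasta(handle):
        if header.startswith(decoy_prefix):
            decoys += 1
        else:
            targets += 1
    return targets, decoys


# Made-up two-entry file standing in for a real target-decoy FASTA.
example = io.StringIO(
    ">protein_A example entry\nMKTAYIAK\nQRQISFVK\n"
    ">DECOY_protein_A\nKVFSIQRQ\nKAIYATKM\n"
)
counts = count_targets_and_decoys(example)
print(counts)  # (1, 1)
```

For a real file, replace the `io.StringIO` object with `open(path)` on the downloaded FASTA.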

2024-04-01 2024-02-01 2024-01-01 2023-12-01 2023-11-01 2023-10-01 2023-09-01 2023-08-01
2023-07-01 2023-06-01 2023-05-01 2023-04-01 2023-03-01 2023-02-01 2023-01-01 2022-12-01
2022-11-01 2022-10-01 2022-09-01 2022-08-01 2022-07-01 2022-06-01 2022-05-01 2022-04-01
2022-03-01 2022-02-01 2022-01-01 2021-12-01 2021-10-01 2021-09-01 2021-08-01 2021-07-01
2021-06-01 2021-05-01 2021-04-01 2021-03-01 2021-02-01 2021-01-01 2020-12-01 2020-11-01
2020-10-01 2020-09-01 2020-08-01 2020-07-01 2020-06-01 2020-05-01 2020-04-01 2020-03-01
2020-02-01 2020-01-01 2019-12-01 2019-10-01 2019-09-01 2019-08-01 2019-07-01 2019-06-01
2019-05-01 2019-04-01 2019-03-01 2019-02-01 2019-01-01 2018-12-01 2018-11-01 2018-10-01
2018-09-01 2018-08-01 2018-07-01 2018-06-01 2018-05-01 2018-04-01 2018-03-01 2018-02-01
2018-01-01 2017-12-01 2017-11-01 2017-10-01 2017-09-01 2017-08-01 2017-07-01 2017-06-01
2017-05-01 2017-04-01 2017-03-01 2017-02-01 2017-01-01 2016-12-01 2016-11-01 2016-10-01
2016-09-01 2016-08-01 2016-07-01 2016-06-01 2016-05-01 2016-04-06 2016-03-01 2016-02-01
2016-01-01 2015-12-01 2015-11-01 2015-10-01
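Because a release is published only when at least one source database has changed, the list above can skip months. The following minimal sketch finds those months in a list of release dates. The function name `skipped_months` is hypothetical, and the input is just the four most recent dates from the list.

```python
from datetime import date
from typing import Iterable, List


def skipped_months(release_dates: Iterable[str]) -> List[str]:
    """Return 'YYYY-MM' strings for months without a release,
    between the earliest and latest release in the input."""
    months = {(d.year, d.month) for d in map(date.fromisoformat, release_dates)}
    if not months:
        return []
    year, month = min(months)
    last = max(months)
    missing = []
    while (year, month) <= last:
        if (year, month) not in months:
            missing.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return missing


# The four most recent releases from the list above.
recent = ["2024-04-01", "2024-02-01", "2024-01-01", "2023-12-01"]
gaps = skipped_months(recent)
print(gaps)  # ['2024-03']
```

Releases are compared by year and month only, so a release dated mid-month (such as 2016-04-06) still counts for its month.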

Cite

If you use this database, please cite us:

General purpose citation: Deutsch et al., "Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics", J Proteome Res. Author manuscript; available in PMC 2016 Nov 4.