A method for independent estimation of false localisation rate for phosphoproteomics



Ramsbottom, Kerry ORCID: 0000-0002-7432-9293, Prakash, Ananth ORCID: 0000-0001-5799-9618, Riverol, Yasset Perez, Camacho, Oscar Martin, Martin, Maria, Vizcaíno, Juan Antonio, Deutsch, Eric and Jones, Andrew ORCID: 0000-0001-6118-9327
(2021) A method for independent estimation of false localisation rate for phosphoproteomics. 2021.10.18.464791.


Abstract

Phosphoproteomics methods are commonly employed in labs to identify and quantify the sites of phosphorylation on proteins. In recent years, various software tools have been developed that incorporate scores or statistics indicating whether a given phosphosite has been correctly identified, or that estimate the global false localisation rate (FLR) within a given data set for all sites reported. These scores have generally been calibrated using synthetic data sets, and their statistical reliability on real data sets is largely unknown. As a result, there is a considerable problem in the field with the reporting of incorrectly localised phosphosites, due to inadequate statistical control. In this work, we develop the concept of scoring and ranking modifications on a decoy amino acid, i.e. one that cannot be modified, to allow for independent estimation of global FLR. We test a variety of different amino acids as the decoy, on both synthetic and real data sets, demonstrating that the choice of amino acid can make a substantial difference to the estimated global FLR. We conclude that while several different amino acids might be appropriate, the most reliable FLR results were achieved using alanine and leucine as decoys, although we prefer alanine because of the risk of confusion between leucine and isoleucine. We propose that the phosphoproteomics field should adopt the use of a decoy amino acid, so that there is better control of false reporting in the literature and in public databases that redistribute the data. Data are available via ProteomeXchange with identifier PXD028840.
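
The decoy amino acid idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the `Site` record, the `estimate_flr` function, the `target_to_decoy_ratio` scaling parameter, and the choice of denominator (target sites at or above each rank) are all assumptions made for illustration. The sketch ranks reported sites by localisation score and treats every modification placed on the decoy residue (alanine here) as a known false localisation.

```python
from dataclasses import dataclass


@dataclass
class Site:
    residue: str  # amino acid reported as carrying the modification
    score: float  # localisation score; higher means more confident


def estimate_flr(sites, decoy="A", target_to_decoy_ratio=1.0):
    """Return (site, estimated global FLR) pairs in descending score order.

    Assumption: each decoy hit stands for `target_to_decoy_ratio` false
    target localisations, and the FLR at a rank is that estimate divided
    by the number of target sites seen so far. Ties are not handled.
    """
    ranked = sorted(sites, key=lambda s: s.score, reverse=True)
    results = []
    decoy_hits = 0
    for n, site in enumerate(ranked, start=1):
        if site.residue == decoy:
            decoy_hits += 1
        targets = n - decoy_hits
        if targets == 0:
            flr = 1.0  # only decoy hits so far; no trustworthy targets
        else:
            flr = min(1.0, target_to_decoy_ratio * decoy_hits / targets)
        results.append((site, flr))
    return results


# Hypothetical example data: three phospho-acceptor sites and one decoy hit.
example = [
    Site("S", 0.99),
    Site("T", 0.95),
    Site("A", 0.90),
    Site("S", 0.80),
]
example_flr = [flr for _, flr in estimate_flr(example)]
```

In a real analysis the scaling factor would need to reflect how often the decoy residue occurs relative to serine, threonine, and tyrosine in the identified peptides; a value of 1.0 is used here only to keep the example simple.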

Item Type: Article
Uncontrolled Keywords: Generic health relevance
Divisions: Faculty of Health and Life Sciences
Faculty of Health and Life Sciences > Institute of Systems, Molecular and Integrative Biology
Depositing User: Symplectic Admin
Date Deposited: 07 Jul 2022 14:00
Last Modified: 14 Mar 2024 18:37
DOI: 10.1101/2021.10.18.464791
Open Access URL: https://pubs.acs.org/doi/10.1021/acs.jproteome.1c0...
URI: https://livrepository.liverpool.ac.uk/id/eprint/3157948