Informatics tools for the analysis and assignment of phosphorylation status in proteomics

UoM administered thesis: Unknown

Authors: Dave Lee


Progress in the field of phosphoproteomics has been accelerated by mass spectrometry (MS). This is not surprising, owing not only to the accuracy, precision and high-throughput capabilities of MS but also to the support it receives from informaticians, who enable automated analysis and so make it relatively easy to go from a complex sample to a statistically satisfactory set of phosphopeptides and corresponding site positions. However, the process of identifying and subsequently pinpointing the phosphorylation moiety is not straightforward and remains a challenging task. Furthermore, it has been suggested that not all phosphorylation sites are of equal functional importance, to the extent that some may even lack function altogether. Clearly, such sites will confound efforts towards functional characterisation. The work in this thesis addresses these two issues: accurate site localisation and functional annotation. To address the first issue, I adopt a multi-tool approach for identification and site localisation, utilising the different underlying algorithms of each tool and thereby gaining orthogonal perspectives on the same tandem mass spectra. Doing so enhanced accuracy over any single tool by itself. The power of this multi-tool approach stemmed not from predicting more true positives but from removing false positives. For the second issue, I first investigated the hypothesis that sites of functional consequence exhibit stronger phosphorylation-characteristic features, such as the degree of conservation and disorder. Indeed, some features were found to be enriched in the functional group. More surprisingly, some were also enriched in the less-functional group, suggesting that their incorporation into a prediction algorithm would hinder functional prediction.
With this in mind, I trained and optimised several machine-learning algorithms, using different combinations of features, in an attempt to improve general phosphorylation prediction and functional prediction separately.
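
The multi-tool idea described above can be illustrated with a minimal sketch. It is not the thesis's actual pipeline: the tool names, the peptides, and the simple "at least two tools agree" voting rule are all assumptions made for illustration. It shows only how requiring agreement between tools removes calls made by a single tool, which is one way false positives could be reduced without adding true positives.

```python
from collections import Counter


def consensus_sites(tool_calls, min_tools=2):
    """Return the (peptide, site) calls reported by at least `min_tools` tools.

    `tool_calls` maps a tool name to the list of (peptide, site position)
    pairs that tool reported. The voting rule is an illustrative assumption.
    """
    counts = Counter()
    for calls in tool_calls.values():
        # Deduplicate so each tool votes at most once per call.
        counts.update(set(calls))
    return {call for call, n in counts.items() if n >= min_tools}


# Hypothetical tools and site calls, for illustration only.
tool_calls = {
    "tool_a": [("AASPEK", 3), ("TLYR", 1)],
    "tool_b": [("AASPEK", 3), ("TLYR", 3)],
    "tool_c": [("AASPEK", 3), ("TLYR", 1), ("GSK", 2)],
}

agreed = consensus_sites(tool_calls, min_tools=2)
print(sorted(agreed))
```

Here the calls made by only one tool, ("TLYR", 3) and ("GSK", 2), are discarded, while calls supported by two or more tools are kept.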


Original language: English
Awarding Institution
Award date: 1 Aug 2015