A computational investigation of solubility, functionality and the adaptation in subcellular compartments of proteins

UoM administered thesis: Doctoral Thesis

  • Authors:
  • Pedro Chan


A cell is considered the smallest unit of life. It carries out a variety of biochemical reactions through the activities of proteins and enzymes. To perform their functions, proteins must be in their native folded state and under the correct environmental conditions. A slight change in pH or temperature can disrupt the electrostatic interactions within a protein, leading to conformational change and loss of activity. Studies have shown that solubility can be enhanced by increasing the number of charges on the protein surface, and studies of extremophiles suggest that the presence of non-polar aromatic residues could be key to thermostable proteins. Thus, charges are important in determining the function and adaptation of proteins.
Over the decades, a large amount of protein sequence and structure information relating to molecular biology has been produced. By employing algorithms and computational and statistical techniques, it is possible to analyse these data to solve biological problems. Often these investigations are based mainly on sequences, since their numbers outstrip the number of available structures. However, adding structures allows us to investigate problems such as the relationship between charges, sequence, structure and function, which is the aim of this study.
In this thesis, the relationships between proteins and function were examined using various electrostatic features derived from charges, together with geometric properties derived from structures. One interesting finding is that the average pH of maximum stability of proteins within a subcellular location was highly correlated with the pH of that subcellular compartment, which was due to the pKas (of histidines) and their locations on the proteins. We also found that the size of the largest non-charged patch on the protein surface correlates with solubility and provides a predictor with a maximum accuracy of 76%.
The use of novel charge-based methods showed little improvement in distinguishing between enzymes and non-enzymes. However, the method of using real charges with a grid size of 1 angstrom has paved the way for using patterns of charges and dipoles from enzyme active sites to distinguish between different enzymes. Finally, a web tool for displaying conserved residues on 3D protein structures has been made available to the public for identifying residues that may be of functional importance.


Original language: English
Awarding Institution
Award date: 1 Aug 2012