Improved Descriptors for the QSAR Modeling of Peptides and Proteins

Research output: Contribution to journal › Article


Modeling the activity of a protein with Quantitative Structure-Activity Relationships (QSAR) requires descriptors for the 20 naturally coded amino acids. In this work we show that, by modifying some established descriptors, we were able to model the activity data of 140 mutants of the enzyme epoxide hydrolase with improved accuracy. These new descriptors (referred to as Physical descriptors) also gave very good results when tested against a series of four dipeptide datasets. The Physical descriptors encode the amino acids using only two orthogonal scales: the first is strongly linked to hydrophilicity/hydrophobicity, and the second to the volume of the amino acid residue. The use of these new amino acid descriptors should result in simpler and more readily interpretable models for the enzyme activity (and potentially other functions of interest, e.g., secondary and tertiary structure) of peptides and proteins.
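To illustrate what a two-scale encoding looks like in practice, the following is a minimal sketch that turns a peptide sequence into a flat feature vector with two numbers per residue. It does not reproduce the Physical descriptors, whose values are not given in this abstract. As stand-ins it assumes the Kyte-Doolittle hydropathy index for the first scale and approximate residue volumes (in cubic angstroms, as commonly tabulated) for the second; these two stand-in scales are not orthogonal, and the names `STAND_IN_SCALES` and `encode_peptide` are hypothetical.

```python
# Stand-in two-scale table: residue -> (hydropathy, volume).
# ASSUMPTION: these are NOT the paper's Physical descriptors.
# Hydropathy: Kyte-Doolittle index. Volume: approximate residue
# volumes in cubic angstroms; verify against a primary table before use.
STAND_IN_SCALES = {
    "A": (1.8, 88.6),   "R": (-4.5, 173.4), "N": (-3.5, 114.1),
    "D": (-3.5, 111.1), "C": (2.5, 108.5),  "Q": (-3.5, 143.8),
    "E": (-3.5, 138.4), "G": (-0.4, 60.1),  "H": (-3.2, 153.2),
    "I": (4.5, 166.7),  "L": (3.8, 166.7),  "K": (-3.9, 168.6),
    "M": (1.9, 162.9),  "F": (2.8, 189.9),  "P": (-1.6, 112.7),
    "S": (-0.8, 89.0),  "T": (-0.7, 116.1), "W": (-0.9, 227.8),
    "Y": (-1.3, 193.6), "V": (4.2, 140.0),
}


def encode_peptide(sequence):
    """Return a flat list with two descriptor values per residue."""
    features = []
    for residue in sequence.upper():
        if residue not in STAND_IN_SCALES:
            raise ValueError(f"Unknown amino acid code: {residue!r}")
        features.extend(STAND_IN_SCALES[residue])
    return features


if __name__ == "__main__":
    # A dipeptide becomes a 4-element vector: two scales x two positions.
    print(encode_peptide("AG"))
```

With only two values per position, a dipeptide is described by four features, so each coefficient of a regression model fitted on such vectors maps directly to the hydrophobicity or size of one position. That is the kind of interpretability the abstract refers to.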

Bibliographical metadata

Original language: English
Journal: Journal of Chemical Information and Modeling
Early online date: 16 Jan 2018
Publication status: Published - 2018