Information Extraction from Pharmaceutical Literature

UoM administered thesis: PhD


With the constantly growing amount of biomedical literature, methods for automatically distilling information from unstructured data, collectively known as information extraction, have become indispensable. Whilst most biomedical information extraction efforts in the last decade have focussed on the identification of gene products and interactions between them, the biomedical text mining community has recently extended its scope to capture associations between biomedical and chemical entities with the aim of supporting applications in drug discovery. This thesis is the first comprehensive study focussing on information extraction from pharmaceutical chemistry literature. In this research, we describe our work on (1) recognising names of chemical compounds and drugs, facilitated by the incorporation of domain knowledge; (2) exploring different coreference resolution paradigms in order to recognise co-referring expressions within a full-text article; and (3) defining drug-target interactions as events and distilling them from pharmaceutical chemistry literature using event extraction methods.


Original language: English
Awarding Institution
Award date: 1 Aug 2014