Graphical and Deep Learning Models for Natural Language Labelling Tasks with Multiple Annotators

UoM administered thesis: PhD

  • Authors: Maolin Li


In many natural language processing applications, the performance of supervised machine learning models depends on the quantity and quality of the training data. Labels are usually collected from multiple expert annotators. However, manually annotating a large-scale dataset is often time-consuming and costly, especially in low-resource situations and domains. A rapid and cost-effective alternative is to obtain labels through crowdsourcing, in which each example is presented to multiple non-expert annotators for labelling. However, labels collected in this manner can be noisy, since some annotators can produce a significant number of incorrect labels. Moreover, in both scenarios, annotators labelling the same examples may produce conflicting labels, which makes it harder to train a model.

To address these problems, the use of active learning was first investigated as a way to reduce the number of labelled examples required to train high-performance supervised models for sequence labelling tasks. It is not necessary to ask annotators to label the whole unlabelled dataset, because active learning methods present only representative and informative examples to annotators. These methods can dramatically reduce the annotation cost, in terms of both annotator fees and time. Various active learning criteria were explored in order to select the most representative and informative unlabelled examples. The best criterion was then applied to build a corpus of neuroscience literature.

Secondly, to enhance the original active learning method, a proactive learning strategy was introduced that takes into account annotators' varying levels of expertise and reliability. In proactive learning, the most appropriate (e.g., cost-effective) annotators are chosen to label each unlabelled example that is selected by active learning. If there is a high probability that certain annotators will provide the correct label for an unlabelled example at a lower cost than the others, then proactive learning sends this example to those annotators. The aim is to save cost while maintaining the quality of the data. The strategy was evaluated on three corpora from different domains. The results demonstrated that the annotator selection strategy provided by the proactive learning method can help to build a high-quality dataset at a reasonable cost.

Various natural language labelling tasks were further explored in a more complex and challenging situation, i.e., crowdsourcing, where the annotators are usually non-experts and have significantly different levels of reliability. Two problems remain in this setting: how to automatically estimate the reliability of annotators without human intervention, and how to handle noisy and conflicting labels. To address these problems, several novel methods were proposed which involved probabilistic graphical and deep learning models. In these methods, the reliability of annotators was estimated in an unsupervised manner, and the noisy and conflicting labels were aggregated to predict the best label for each example. The effectiveness of these methods was demonstrated by applying them to four different and important natural language labelling tasks, i.e., text classification, natural language inference, sequence labelling, and coreference resolution.
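
As a rough illustration of the general idea of estimating annotator reliability without supervision while aggregating conflicting labels, the following minimal Python sketch may help. It is not one of the graphical or deep learning models proposed in the thesis: it substitutes a much simpler iterative weighted-voting scheme, and the function name `aggregate_labels`, the `(example_id, annotator_id, label)` input format, the add-one smoothing, and the toy data are all assumptions made for illustration only.

```python
import math
from collections import defaultdict


def aggregate_labels(annotations, n_iter=10):
    """Aggregate noisy labels with reliability-weighted voting.

    annotations: list of (example_id, annotator_id, label) triples
    (a hypothetical input format chosen for this sketch).
    Returns (consensus, reliability) dictionaries.
    """
    labels = sorted({lab for _, _, lab in annotations})
    n_classes = max(len(labels), 2)
    # Start from plain majority voting: every annotator has the same weight.
    weight = {ann: 1.0 for _, ann, _ in annotations}
    consensus, reliability = {}, {}

    for _ in range(n_iter):
        # Step 1: weighted vote for each example.
        scores = defaultdict(lambda: defaultdict(float))
        for ex, ann, lab in annotations:
            scores[ex][lab] += weight[ann]
        new_consensus = {
            ex: max(labels, key=lambda lab: votes[lab])
            for ex, votes in scores.items()
        }
        if new_consensus == consensus:
            break  # the aggregated labels have stopped changing
        consensus = new_consensus

        # Step 2: re-estimate reliability as smoothed agreement with consensus.
        agree, total = defaultdict(int), defaultdict(int)
        for ex, ann, lab in annotations:
            total[ann] += 1
            agree[ann] += int(lab == consensus[ex])
        for ann in total:
            p = (agree[ann] + 1) / (total[ann] + 2)  # add-one smoothing (assumed)
            reliability[ann] = p
            # Log-odds weight: positive above chance, negative below it.
            weight[ann] = math.log(p * (n_classes - 1) / (1 - p))

    return consensus, reliability


# Toy data (invented): annotators "a" and "b" agree, "c" always disagrees.
toy = [
    ("e1", "a", "pos"), ("e1", "b", "pos"), ("e1", "c", "neg"),
    ("e2", "a", "neg"), ("e2", "b", "neg"), ("e2", "c", "pos"),
]
consensus, reliability = aggregate_labels(toy)
print(consensus, reliability)
```

No gold labels are used at any point: reliability is inferred only from how often each annotator agrees with the current aggregated labels, which is the sense in which such a scheme is unsupervised. A full probabilistic model would typically replace the single accuracy value per annotator with a richer parameterisation.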


Original language: English
Awarding Institution
Award date: 1 Aug 2021