Searches for pulsars over the past fifty years have been characterised by two problems that make discovery difficult: i) an increasing volume of data to be searched, and ii) an increasing number of `candidate' pulsar detections arising from that data, all of which require analysis. Almost all candidates are caused by noise or interference, yet these are often indistinguishable from real pulsar detections. Deciding which candidates should be studied is therefore difficult. Indeed, it has become known as the `candidate selection problem'. This thesis presents an interdisciplinary study of the selection problem, with the aim of developing a new method able to mitigate it, specifically for future pulsar surveys undertaken with the Square Kilometre Array (SKA). Through a combination of critical literature evaluations, theoretical modelling exercises, and empirical investigations, the selection problem is described in depth here for the first time. It is shown to be characterised by the dominance of Gaussian-distributed noise signals, a factor that no existing selection method accounts for. The study also reveals a significant trend in survey data rates, which suggests that candidate selection is transitioning from an off-line processing procedure to an on-line, real-time decision-making process. In response, a new real-time, machine-learning-based method, the GH-VFDT, is introduced in this thesis. The results presented here show that a significant improvement in selection performance can be achieved using the GH-VFDT, which utilises a learning procedure optimised for data characterised by skewed class distributions. The principled development of new numerical features that maximise the separation between pulsars and Gaussian noise has also greatly improved GH-VFDT pulsar recall.
It is therefore concluded that the sub-optimal performance of existing selection systems is due to a combination of poor feature design, insensitivity to noise, and an inability to deal with skewed class distributions.