Robust solutions to image classification is a challenging task since approaches should be able to successfully discriminate between the different classes, whilst being able to generalise across a large amount of intra-class variations. In an extension of image classification to the temporal domain, video classification aims to assign accurate human action labels to video sequences. Recently deep learning and in particular the convolutional neural network (CNN) has made great strides in many computer vision and machine learning tasks. CNNs implicitly learn data-specific hierachies of salient features with multiple levels of abstraction. However, the increased capacity of these convolutional networks require vast labelled datasets in order to optimise their parameters. Unsupervised learning offers potential solutions to this problem as it doesn't require labels and can simply learn the structure of data. In this work, the current state-of-the-art convolutional networks for both image and video classification and alternative strategies for feature learning using unsupervised learning are investigated. In particular the use of the self organising map (SOM) to learn unsupervised features, to be used independently or in conjunction with other supervised feature learning methods, in the application of image and video classification, is explored. Firstly, the versatile nature of SOMs is exploited to extend and improve a simple multi-layer unsupervised architecture inspired by PCANet named SOMNet; secondly SOMNet is further extended via the proposal of novel unsupervised feature aggregation layers; thirdly SOMs are used as fixed lower layer weights of CNNs in a novel approach to deep learning pre-training. Comprehensive experiments are conducted on a wide range of datasets and SOM-based filters are found to maintain or improve classification performance in the majority of cases even when labelled data is scarce. The wide variety of uses and applications explored in this work demonstrates the robust and versatile nature of simple unsupervised SOM-based approaches and warrants their continued relevance in feature learning, even in the age of deep learning.