The ultimate goal of machine vision is image understanding: the ability not only to recover image structure but also to know what it represents. By definition, this involves the use of models which describe and label the expected structure of the world. Over the past decade, model-based vision has been applied successfully to images of man-made objects. It has proved much more difficult to develop model-based approaches to the interpretation of images of complex and variable structures such as faces or the internal organs of the human body (as visualized in medical images). In such cases it has been problematic even to recover image structure reliably without a model to organize the often noisy and incomplete image evidence. The key problem is that of variability. To be useful, a model needs to be specific, that is, capable of representing only 'legal' examples of the modelled object(s). It has proved difficult to achieve this whilst allowing for natural variability. Recent developments have overcome this problem: it has been shown that specific patterns of variability in shape and grey-level appearance can be captured by statistical models that can be used directly in image interpretation. The details of the approach are outlined, and practical examples from medical image interpretation and face recognition are used to illustrate how previously intractable problems can now be tackled successfully. It is also interesting to ask whether these results provide any insights into natural vision. For example, we show that the apparent changes in shape which result from viewing three-dimensional objects from different viewpoints can be modelled quite well in two dimensions; this may lend some support to the 'characteristic views' model of natural vision.