Active appearance models (AAMs) are widely used to fit statistical models of shape and appearance to images, with applications in the segmentation, tracking, and classification of structures. A limitation of AAMs is that they are not robust to a large set of gross outliers. A robust kernel can help, but determining the correct kernel scaling parameters can be problematic. We describe a method for learning two sets of scaling parameters during AAM training: a coarse-scale set and a fine-scale set. Our algorithm initially applies the coarse scaling and then uses a form of deterministic annealing to reduce it to the fine outlier-rejection scaling as the AAM converges. The algorithm was assessed on two large datasets: a set of face images and a medical dataset of spine images. A significant improvement in accuracy and robustness was observed in cases that were difficult for a standard AAM.
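
The abstract gives no formulas, so the following is only a toy illustration of the coarse-to-fine idea on a simple line-fitting problem, not the AAM algorithm itself. It assumes a Geman-McClure-style robust kernel, a geometric annealing schedule, and hand-picked values for `sigma_coarse` and `sigma_fine`, whereas the method described above learns its two scale sets during AAM training. All function names here are hypothetical.

```python
import numpy as np


def robust_weights(residuals, sigma):
    """Geman-McClure-style reweighting (assumed kernel, not from the paper).

    Weights are close to 1 for residuals much smaller than sigma and fall
    towards 0 for residuals much larger than sigma.
    """
    return 1.0 / (1.0 + (residuals / sigma) ** 2) ** 2


def annealed_fit(A, b, sigma_coarse, sigma_fine, n_iters=20):
    """Iteratively reweighted least squares with an annealed kernel scale.

    The scale starts at sigma_coarse (tolerant, so a poor starting estimate
    does not reject good data) and shrinks geometrically to sigma_fine
    (strict outlier rejection) as the estimate converges.
    """
    sigmas = np.geomspace(sigma_coarse, sigma_fine, n_iters)  # assumed schedule
    x = np.linalg.lstsq(A, b, rcond=None)[0]  # non-robust starting estimate
    for sigma in sigmas:
        residuals = A @ x - b
        sw = np.sqrt(robust_weights(residuals, sigma))
        # Weighted least squares: scale each row by the square root of its weight.
        x = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)[0]
    return x


def make_toy_data(n=90, seed=0):
    """Noisy samples of the line b = 2 t + 1 with gross outliers added."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 10.0, n)
    b = 2.0 * t + 1.0 + rng.normal(0.0, 0.1, n)
    b[::3] += 30.0  # every third sample becomes a gross outlier
    A = np.column_stack([t, np.ones(n)])
    return A, b
```

Calling `annealed_fit(*make_toy_data(), sigma_coarse=10.0, sigma_fine=0.5)` returns the estimated slope and intercept. Starting with the coarse scale keeps the inliers in play while the estimate is still biased by the outliers, and the fine scale then suppresses the outliers almost entirely once the fit is close.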