My primary research interest is estimating the effect of treatment from observational studies (i.e. studies of subjects who receive different treatments for clinical reasons, rather than as part of a randomised clinical trial). In a randomised trial, treatment is allocated by chance, so whether a given person is treated does not depend on how ill they are. In an observational study, however, subjects with more active disease are more likely to receive treatment, and are also more likely to have a poor outcome. This can lead to the treatment correlating with poor outcomes, even if it does benefit the patient. This is called "confounding by indication" (confounding is when two variables are correlated not because one causes the other, but because they are both caused by a third variable).
To find out what effect the treatment has, we need to compare the outcome in treated subjects to the outcome they would have had if they had not been treated. Finding ways to estimate their expected outcome if they had not been treated is the focus of my research. Most established methods for doing this revolve around the propensity score (the probability that a given person will receive treatment, calculated from all of the potential confounders that were measured). Provided that all of the relevant confounders were measured, two subjects with the same propensity score can be expected to have, on average, the same outcome if they were not treated, so comparing treated and untreated subjects with the same propensity score can give an unbiased estimate of the effect of treatment.
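As a minimal sketch of this idea, the simulation below generates data with confounding by indication and compares a naive treated-versus-untreated difference with one stratified on the estimated propensity score. Everything in it is an illustrative assumption rather than anything taken from a real study: the function names, the single confounder (a disease severity level from 0 to 4), the treatment probabilities, and the true treatment effect of -1 (treatment improves the outcome, where higher values are worse). Because the confounder is discrete, the propensity score is estimated simply as the proportion treated at each severity level; with continuous confounders a model such as logistic regression would normally be used instead.

```python
import random


def simulate(n=20000, true_effect=-1.0, seed=0):
    """Simulate (severity, treated, outcome) rows with confounding by indication."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severity = rng.randint(0, 4)            # confounder: disease activity level
        p_treat = 0.1 + 0.2 * severity          # sicker subjects are treated more often
        treated = 1 if rng.random() < p_treat else 0
        # Higher outcome = worse. Severity worsens it, treatment improves it.
        outcome = 2.0 * severity + true_effect * treated + rng.gauss(0, 1)
        rows.append((severity, treated, outcome))
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_difference(rows):
    """Mean outcome in treated minus mean outcome in untreated, ignoring confounding."""
    treated = [y for _, t, y in rows if t == 1]
    untreated = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(untreated)


def propensity_scores(rows):
    """Estimate P(treated | severity) as the proportion treated at each severity level."""
    counts = {}
    for severity, treated, _ in rows:
        n, k = counts.get(severity, (0, 0))
        counts[severity] = (n + 1, k + treated)
    return {severity: k / n for severity, (n, k) in counts.items()}


def stratified_difference(rows):
    """Compare treated and untreated subjects who share the same propensity score,
    then average the within-stratum differences weighted by stratum size."""
    scores = propensity_scores(rows)
    strata = {}
    for severity, treated, outcome in rows:
        key = round(scores[severity], 6)
        strata.setdefault(key, {0: [], 1: []})[treated].append(outcome)
    weighted_sum = 0.0
    total = 0
    for groups in strata.values():
        if groups[0] and groups[1]:             # need both groups to compare
            size = len(groups[0]) + len(groups[1])
            weighted_sum += size * (mean(groups[1]) - mean(groups[0]))
            total += size
    return weighted_sum / total


if __name__ == "__main__":
    data = simulate()
    print("naive difference:     ", round(naive_difference(data), 2))
    print("stratified difference:", round(stratified_difference(data), 2))
```

Under these assumptions the naive difference should come out positive, making a beneficial treatment look harmful, while the stratified difference should land close to the simulated effect of -1. The sketch only works because the single confounder is fully measured; it says nothing about unmeasured confounding.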
Although the performance of various propensity-score-based estimators is known when the predictor variables follow known distributions, less is known about how the estimates degrade when the predictors do not follow the assumed distribution. Since real data is likely to deviate from the theoretical distribution to some extent, it would be useful to know how these deviations affect different estimators, and to compare the robustness of estimators.
Most of my work revolves around estimating the effect of some kind of exposure (often a treatment of some description) on an outcome. Generally, there will be some confounding (i.e. the exposure and the outcome are both affected by a third variable, such as age or disease activity). Methods used to get around the confounding include generalised linear models, propensity score methods and instrumental variables. In addition, most studies have some missing data, so I am interested in methods that can reduce the bias that can arise when the probability that data is missing depends on the outcome, exposure or confounding variables.