In vitro fertilisation (IVF) comprises a sequence of interventions delivered to the treated woman and her embryos. Typically, IVF begins with a period of stimulation, in which the patient's ovaries are stimulated with drugs. This encourages the growth of follicles, which contain eggs. When the follicles are sufficiently developed, a trigger is given which causes eggs to be released. These are collected in a clinical procedure and fertilised with sperm to produce embryos. The embryos are graded on the basis of morphological features, and an embryologist selects the best to be transferred to the patient's uterus. Once embryos are transferred, the hope is that the patient will have a successful pregnancy, culminating in the birth of one or more children. The multistage treatment structure complicates measurement and modelling of IVF data. First, since patient responses to each of these interventions can be measured, outcome reporting is complicated by the sheer variety of outcome measures on offer. Second, since each intervention influences not only the immediate patient response but also responses to interventions delivered subsequently, it is difficult to untangle the causal web underlying the IVF process. This in turn obscures the mechanisms by which IVF interventions ultimately influence the birth outcome, representing a barrier to the design of new treatment strategies. Routine statistical models are not capable of addressing this challenge. Bespoke approaches are required. Our aims were to address methodological issues relating to the measurement and modelling of multistage IVF data. After reviewing the existing literature, we investigated outcome reporting practices on IVF clinic websites and in randomised controlled trials (RCTs). This highlighted the multiplicity of measures in use. We identified 815 distinct outcomes in use in IVF RCTs and 51 on clinic websites.
In relation to trials, this represents a barrier to both data synthesis and comparison between treatments. In relation to clinic websites, there is a concern that prospective patients will struggle to interpret the different measures, rendering truly informed decision-making impossible. Selective reporting is another inevitable consequence of outcome heterogeneity, common to both research and advertising of IVF. While recognising that different measures are suitable for different purposes, we argue for greater standardisation of outcome reporting. National reporting schemes offer one route to clear and consistent reporting for consumers. Next, we adapted and extended joint modelling approaches used in econometrics, education, and toxicity research, to develop methods for the joint analysis of multistage outcomes. These methods can accommodate mixed response types (e.g. count, ordinal, binary) measured at different levels of a multilevel data structure (e.g. women and their embryos). We represented each response variable by a standard regression submodel, and linked the submodels by specifying an underlying multivariate latent structure. Finding that this did not yield useful estimates of the effects of upstream events on downstream responses, we extended the approach by introducing upstream response variables as covariates in downstream submodels. Throughout, we emphasise real datasets and research questions. We conclude by building a model to estimate the effects of ovarian stimulation on uterine receptivity, a task complicated by the fact that stimulation also contributes to the pool of embryos available for transfer. Our results suggest deleterious effects of stimulation on embryo implantation and live birth.
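
As a rough illustration of the joint modelling structure described above, and not the specification actually used in this work, consider a woman $i$ with covariates $\mathbf{x}_i$, an upstream count response $y_{1i}$ (for example, the number of eggs collected) and a downstream binary response $y_{2i}$ (for example, live birth). The symbols, the restriction to two woman-level responses, and the choice of link functions are all assumptions made for this sketch.

```latex
% Hypothetical two-response joint model; all notation is illustrative.
\begin{align}
y_{1i} \mid u_{1i} &\sim \mathrm{Poisson}(\mu_i),
  & \log \mu_i &= \mathbf{x}_i^\top \boldsymbol{\beta}_1 + u_{1i}, \\
y_{2i} \mid u_{2i},\, y_{1i} &\sim \mathrm{Bernoulli}(\pi_i),
  & \operatorname{logit} \pi_i &= \mathbf{x}_i^\top \boldsymbol{\beta}_2 + \gamma\, y_{1i} + u_{2i}, \\
(u_{1i}, u_{2i})^\top &\sim \mathcal{N}_2(\mathbf{0}, \boldsymbol{\Sigma}).
\end{align}
```

In this sketch, setting $\gamma = 0$ leaves the two submodels linked only through the correlated latent terms, with the off-diagonal element of $\boldsymbol{\Sigma}$ capturing residual dependence between stages. Allowing $\gamma \neq 0$ corresponds to entering the upstream response as a covariate in the downstream submodel, so that $\gamma$ describes the association between the upstream event and the downstream response on the logit scale.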