Learning to discover: expressive Gaussian mixture models for multi-dimensional simulation and parameter inference in the physical sciences

Research output: Contribution to journalArticlepeer-review


We show that density models describing multiple observables with (i) hard boundaries and (ii) dependence on external parameters may be created using an auto-regressive Gaussian mixture model. The model is designed to capture how observable spectra are deformed by hypothesis variations, and is made more expressive by projecting data onto a configurable latent space. It may be used as a statistical model for scientific discovery in interpreting experimental observations, for example when constraining the parameters of a physical model or tuning simulation parameters according to calibration data. The model may also be sampled for use within a Monte Carlo simulation chain, or used to estimate likelihood ratios for event classification. The method is demonstrated on simulated high-energy particle physics data considering the anomalous electroweak production of a $Z$ boson in association with a dijet system at the Large Hadron Collider, and the accuracy of inference is tested using a realistic toy example. The developed methods are domain agnostic; they may be used within any field to perform simulation or inference where a dataset consisting of many real-valued observables has conditional dependence on external parameters.

Bibliographical metadata

Original languageEnglish
JournalMachine Learning: Science and Technology
Early online date11 Jan 2022
Publication statusPublished - 11 Jan 2022