## Encapsulating models and approximate inference programs in probabilistic modules

We introduce the ‘probabilistic module’ interface, which allows encapsulation of complex probabilistic models with latent variables alongside custom stochastic approximate inference machinery, and provides a platform-agnostic abstraction barrier separating the model internals from the host probabilistic inference system. The interface can be seen as a stochastic generalization of a standard simulation and density interface for probabilistic primitives. We show that sound approximate inference algorithms can be constructed for networks of probabilistic modules, and we demonstrate that the interface can be implemented using learned stochastic inference networks and MCMC and SMC approximate inference programs.

The interface has two procedures, ‘simulate’ and ‘regenerate’:

- ‘Simulate’ runs the original stochastic computation being encapsulated and returns the output value along with a harmonic mean estimate of the output density at the returned value.
- ‘Regenerate’ proposes an execution history of the computation that is responsible for a given output, and returns an importance sampling estimate of the output density at the given value.
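
The two procedures can be sketched on a toy computation. This is only an illustrative sketch, not the authors' implementation: the class name `GaussianModule`, the model (latent `u ~ N(0, 1)`, output `z ~ N(u, 1)`), the Gaussian regenerator centred at `z / 2`, and the choice to return log-weights rather than raw density estimates are all assumptions made for the example.

```python
import math
import random


def normal_logpdf(x, mean, std):
    """Log density of a univariate normal distribution."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((x - mean) / std) ** 2)


class GaussianModule:
    """Hypothetical module wrapping u ~ N(0, 1), z | u ~ N(u, 1).

    The execution history is the latent u; the output is z.
    The regenerator q(u; z) = N(z / 2, regen_std) is an assumption of this
    sketch. With regen_std = sqrt(0.5) it equals the exact posterior.
    """

    def __init__(self, regen_std=math.sqrt(0.5), rng=None):
        self.regen_std = regen_std
        self.rng = rng if rng is not None else random.Random()

    def _log_joint(self, u, z):
        return normal_logpdf(u, 0.0, 1.0) + normal_logpdf(z, u, 1.0)

    def _log_regenerator(self, u, z):
        return normal_logpdf(u, z / 2.0, self.regen_std)

    def simulate(self):
        """Run the computation forward; return z and a log density estimate.

        The reciprocal of exp(log_w) is an unbiased estimate of 1 / p(z)
        (the harmonic-mean-style estimate).
        """
        u = self.rng.gauss(0.0, 1.0)
        z = self.rng.gauss(u, 1.0)
        log_w = self._log_joint(u, z) - self._log_regenerator(u, z)
        return z, log_w

    def regenerate(self, z):
        """Propose a history u for a given z; return u and a log density estimate.

        exp(log_w) is an unbiased importance sampling estimate of p(z).
        """
        u = self.rng.gauss(z / 2.0, self.regen_std)
        log_w = self._log_joint(u, z) - self._log_regenerator(u, z)
        return u, log_w
```

In this toy model the output density is known in closed form, `z ~ N(0, sqrt(2))`, so the estimates can be checked directly. With the default `regen_std` the regenerator is exact and both procedures return the true log density. With any other value they return noisy estimates.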

The two estimators must be defined using the same auxiliary probabilistic computation, which samples execution histories of the original computation given the output. This proposal program is called the ‘regenerator’. The optimal regenerator samples from the conditional distribution on execution histories given the output. In general this distribution is intractable, so we resort to suboptimal, approximate regenerators. Regenerators can be learned from data or constructed from stochastic approximate inference algorithms such as sequential Monte Carlo.
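
One way to write the two estimators down is given here. The notation is an assumption of this note rather than taken from the text above: $u$ is the execution history, $z$ the output, $p(u, z)$ the joint density of the original computation, and $q(u; z)$ the regenerator density, assumed to share its support with $p(\cdot \mid z)$.

$$
\begin{aligned}
\text{regenerate:}\quad & u \sim q(\cdot\,; z), \qquad \hat{p}(z) = \frac{p(u, z)}{q(u; z)}, \qquad \mathbb{E}_{u \sim q(\cdot; z)}\big[\hat{p}(z)\big] = p(z), \\
\text{simulate:}\quad & (u, z) \sim p, \qquad \check{p}(z) = \frac{p(u, z)}{q(u; z)}, \qquad \mathbb{E}_{u \sim p(\cdot \mid z)}\!\left[\frac{1}{\check{p}(z)}\right] = \int \frac{p(u, z)}{p(z)} \, \frac{q(u; z)}{p(u, z)} \, du = \frac{1}{p(z)}.
\end{aligned}
$$

When $q(u; z) = p(u \mid z)$, both ratios equal $p(z)$ exactly, which is the sense in which that regenerator is optimal.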

Encapsulating a stochastic computation such as a simulator for a generative probabilistic model with an accurate regenerator serves to ‘approximately collapse out’ the latent variables in the computation. As the accuracy of the regenerator increases, the module behaves more like a collapsed computation with an available density.
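
The effect of regenerator accuracy can be illustrated numerically. The sketch below is a toy example built on assumptions, not an experiment from the paper: the model is `u ~ N(0, 1)`, `z | u ~ N(u, 1)`, the regenerator is `N(z / 2, regen_std)`, and the helper names are hypothetical. It measures the spread of the log density estimates returned by a regenerate-style procedure for a fixed output.

```python
import math
import random
import statistics


def normal_logpdf(x, mean, std):
    """Log density of a univariate normal distribution."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((x - mean) / std) ** 2)


def regenerate_log_weights(z, regen_std, n, seed=0):
    """Draw n log importance weights log p(u, z) - log q(u; z), u ~ q(.; z)."""
    rng = random.Random(seed)
    log_weights = []
    for _ in range(n):
        u = rng.gauss(z / 2.0, regen_std)
        log_joint = normal_logpdf(u, 0.0, 1.0) + normal_logpdf(z, u, 1.0)
        log_q = normal_logpdf(u, z / 2.0, regen_std)
        log_weights.append(log_joint - log_q)
    return log_weights


def log_weight_spread(z, regen_std, n=5000, seed=0):
    """Standard deviation of the log density estimates for a fixed output z."""
    return statistics.pstdev(regenerate_log_weights(z, regen_std, n, seed))


if __name__ == "__main__":
    z = 1.3
    # sqrt(0.5) is the exact posterior standard deviation in this toy model.
    for regen_std in (math.sqrt(0.5), 0.8, 1.2):
        print(f"regen_std={regen_std:.3f}  spread={log_weight_spread(z, regen_std):.4f}")
```

In this toy setting the spread shrinks toward zero as `regen_std` approaches the exact posterior standard deviation, so the module's estimates become indistinguishable from evaluating the collapsed density directly.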

Poster [pdf]

Paper [pdf]

In related work, we show that the same interface is sufficient for estimating bounds on the KL divergence between the output distributions of stochastic computations, and we apply this to estimate the approximation error of SMC approximate inference samplers, including those using MCMC transitions.

The interface is also related to the ‘stochastic procedure’ interface used in the Venture probabilistic programming platform. There are also connections to pseudo-marginal MCMC and particle MCMC.

Authors: Marco Cusumano-Towner, Vikash Mansinghka