Stan: Probabilistic Modeling and Bayesian Inference

Coding Club
January 2019

Summary


Stan is a probabilistic programming language for specifying statistical models. Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling.

Penalized maximum likelihood estimates are calculated using optimization methods such as the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm.

Stan can be called through R using the rstan package, and through Python using the pystan package. Both interfaces support sampling and optimization-based inference with diagnostics and posterior analysis.

This talk gives a brief overview of the main properties of Stan, together with two examples: the first a simple Bernoulli model, and the second a Lotka-Volterra model based on ordinary differential equations.
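Before fitting the second example with Stan, it helps to see what the Lotka-Volterra predator-prey system actually computes. The sketch below integrates the ODEs with a fixed-step fourth-order Runge-Kutta scheme; the parameter values and initial populations are hypothetical, chosen only for illustration.

```python
# Lotka-Volterra predator-prey ODEs:
#   du/dt = alpha*u - beta*u*v   (prey)
#   dv/dt = -gamma*v + delta*u*v (predator)
# Parameter values below are illustrative, not from the talk.

def lotka_volterra(state, alpha=1.0, beta=0.1, gamma=1.5, delta=0.075):
    u, v = state  # prey, predator populations
    return (alpha * u - beta * u * v,
            -gamma * v + delta * u * v)

def rk4_step(f, state, dt):
    # One classical fourth-order Runge-Kutta step.
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

state = (10.0, 5.0)            # initial prey/predator counts (hypothetical)
trajectory = [state]
for _ in range(1000):          # integrate 10 time units with dt = 0.01
    state = rk4_step(lotka_volterra, state, 0.01)
    trajectory.append(state)
```

In the Stan version of this example, the same right-hand side would be written in Stan's `functions` block and solved with Stan's built-in ODE integrator, with the parameters treated as unknowns to be inferred.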

What is Stan?


  • Stan is named in honor of Stanislaw Ulam (1909-1984): Co-inventor of the Monte Carlo method.

  • Stan is an imperative probabilistic programming language.

  • A Stan program defines a probability model.

  • It declares data and (constrained) parameter variables.

  • It defines log posterior (or penalized likelihood).

  • Stan inference: fits model to data and makes predictions.

  • It can use Markov chain Monte Carlo (MCMC) for full Bayesian inference.

  • Or variational Bayes (VB) for approximate Bayesian inference.

  • Or maximum likelihood estimation (MLE) for penalized point estimates.
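As a sketch of how these pieces fit together, here is a minimal Bernoulli model in the style of Stan's standard example, held in a Python string as it would be passed to pystan; the data values are illustrative.

```python
# A minimal Stan program for the Bernoulli example: it declares data,
# a constrained parameter, and the model (prior + likelihood).
bernoulli_code = """
data {
  int<lower=0> N;                 // number of trials
  int<lower=0, upper=1> y[N];     // binary outcomes
}
parameters {
  real<lower=0, upper=1> theta;   // chance of success
}
model {
  theta ~ beta(1, 1);             // uniform prior
  y ~ bernoulli(theta);           // likelihood
}
"""

# With pystan 2.x installed, the model would be compiled and sampled roughly as
# (not run here; data values are hypothetical):
# import pystan
# fit = pystan.StanModel(model_code=bernoulli_code).sampling(
#     data={"N": 10, "y": [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]},
#     iter=2000, chains=4)
```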

What does Stan compute?


  • Draws from posterior distributions

  • Stan performs Markov chain Monte Carlo sampling

  • Produces a sequence of draws θ(1), θ(2), …, θ(M)

  • where each draw θ(i) is marginally distributed according to the posterior p(θ|y)

  • Draws characterize posterior distributions

  • Plot with histograms, kernel density estimates, etc.

  • See

    https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started
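The way draws characterize a posterior can be illustrated without running Stan: for the Bernoulli model with a uniform prior and, say, 2 successes in N = 10 trials, the posterior is the conjugate Beta(3, 9), so direct Beta draws can stand in for MCMC output.

```python
import numpy as np

# Simulated stand-in for Stan's posterior draws: Beta(3, 9) is the exact
# posterior for a Bernoulli model with a uniform prior and 2/10 successes
# (illustrative data, not from the talk).
rng = np.random.default_rng(42)
draws = rng.beta(3, 9, size=4000)   # "M = 4000 draws"

# The draws characterize the posterior: summarize, histogram, or smooth them.
print("posterior mean ~", draws.mean())          # analytic value: 3/12 = 0.25
print("posterior sd   ~", draws.std(ddof=1))
print("80% interval   ~", np.quantile(draws, [0.1, 0.9]))
```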

Obtained Results


  • Rhat near 1 signals convergence; n_eff is effective sample size

  • 10%, … posterior quantiles; e.g., P[α ∈ (0.46, 0.64) | y] = 0.8

  • posterior mean is Bayesian point estimate: α=0.55

  • standard error in posterior mean estimate is 0 (with rounding)

  • posterior standard deviation of α estimated as 0.07
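The quantities in Stan's summary table can be sketched directly from draws. Below, four chains of simulated Beta draws (illustrative, not reproducing the talk's numbers) are summarized with the posterior mean, standard deviation, quantiles, and the classic Gelman-Rubin Rhat, which compares between- and within-chain variance and is near 1 when the chains agree.

```python
import numpy as np

# Four chains of simulated draws standing in for MCMC output.
rng = np.random.default_rng(0)
chains = rng.beta(3, 9, size=(4, 1000))    # shape: (n_chains, n_draws)

draws = chains.ravel()
mean = draws.mean()                         # Bayesian point estimate
sd = draws.std(ddof=1)                      # posterior standard deviation
q10, q90 = np.quantile(draws, [0.1, 0.9])   # 80% central interval

# Classic (non-split) Gelman-Rubin Rhat.
m, n = chains.shape
W = chains.var(axis=1, ddof=1).mean()       # within-chain variance
B = n * chains.mean(axis=1).var(ddof=1)     # between-chain variance
var_hat = (n - 1) / n * W + B / n           # pooled variance estimate
rhat = np.sqrt(var_hat / W)

print(f"mean={mean:.2f} sd={sd:.2f} "
      f"80% CI=({q10:.2f}, {q90:.2f}) Rhat={rhat:.3f}")
```

rstan and pystan report a refined (split, rank-normalized in recent versions) variant of this statistic, along with n_eff.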

Other references and examples of Stan


https://andrewgelman.com/2018/10/12/stan-on-the-web-for-free-thanks-to-rstudio

https://rstudio.cloud/project/56157

  • But I had problems running this code in the Cloud: RStudio Cloud is an alpha version :-(

  • Anyway, all the examples from his blog can be downloaded from

https://github.com/stan-dev/example-models/archive/master.zip


Stan Development Team


Andrew Gelman, Bob Carpenter, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Allen Riddell, Marco Inacio, Jeffrey Arnold, Mitzi Morris, Rob Trangucci, Rob Goedman, Brian Lau, Jonah Sol Gabry, Robert L. Grant, Krzysztof Sakrejda, Aki Vehtari, Rayleigh Lei, Sebastian Weber, Charles Margossian, Vincent Picaud, Imad Ali, Sean Talts, Ben Bales, Ari Hartikainen, Matthijs Vakar, Andrew Johnson, Dan Simpson