Short introduction to Bayesian statistics


  • The likelihood gives the probability of the data conditional on the model parameters.
  • The prior encodes beliefs about the model parameters before the data are taken into account.
  • The posterior quantifies the probability of parameter values conditional on the data.
  • The posterior is a compromise between the data and prior. The less data available, the greater the impact of the prior.
  • The grid approximation is a method for inferring the (approximate) posterior distribution.
  • Posterior information can be summarized with point estimates and posterior intervals.
  • The marginal posterior of a parameter is obtained by integrating the joint posterior over the nuisance parameters.
  • Usually, Bayesian models are fitted using methods that generate samples from the posterior.
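
The grid approximation and posterior summaries above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the binomial model, the flat default prior, the data (6 successes in 9 trials) and all function names are hypothetical choices, not part of the original material.

```python
def grid_posterior(successes, trials, grid_size=1001, prior=lambda theta: 1.0):
    """Approximate the posterior of a binomial success probability on a grid."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalized posterior = likelihood * prior, evaluated at each grid point.
    # The binomial coefficient is constant in theta, so it is left out.
    unnormalized = [
        theta ** successes * (1 - theta) ** (trials - successes) * prior(theta)
        for theta in grid
    ]
    total = sum(unnormalized)
    weights = [u / total for u in unnormalized]  # weights sum to 1
    return grid, weights


def posterior_mean(grid, weights):
    """Point estimate: the posterior mean."""
    return sum(t * w for t, w in zip(grid, weights))


def central_interval(grid, weights, mass=0.9):
    """Central posterior interval read off the cumulative grid weights."""
    lower_tail = (1 - mass) / 2
    upper_tail = 1 - lower_tail
    cumulative = 0.0
    lower = upper = None
    for t, w in zip(grid, weights):
        cumulative += w
        if lower is None and cumulative >= lower_tail:
            lower = t
        if upper is None and cumulative >= upper_tail:
            upper = t
            break
    return lower, upper


# Hypothetical data: 6 successes in 9 trials, flat prior on [0, 1].
grid, weights = grid_posterior(successes=6, trials=9)
mean = posterior_mean(grid, weights)
low, high = central_interval(grid, weights, mass=0.9)
print(f"posterior mean {mean:.3f}, 90% interval ({low:.3f}, {high:.3f})")
```

Passing a different `prior` function, or changing the amount of data, shows how the posterior moves between the prior and the likelihood.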

Stan


  • Stan is a tool for efficiently generating samples from a posterior distribution.
  • A Stan program is specified in a separate text file that consists of code blocks, with the data, parameters, and model blocks being the most crucial ones.
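
As an illustration of the three blocks named above, the sketch below holds a small Stan program in a Python string and writes it to a separate text file. The Bernoulli model, the file name `bernoulli.stan` and the variable names are hypothetical, and the `array[N]` declaration assumes a recent Stan release. Compiling and sampling would be done with a Stan interface of your choice and is not shown here.

```python
import tempfile
from pathlib import Path

# A minimal Stan program (assumed example): Bernoulli outcomes with a
# uniform Beta(1, 1) prior on the success probability.
STAN_PROGRAM = """\
data {
  int<lower=0> N;                      // number of observations
  array[N] int<lower=0, upper=1> y;    // binary outcomes
}
parameters {
  real<lower=0, upper=1> theta;        // success probability
}
model {
  theta ~ beta(1, 1);                  // prior
  y ~ bernoulli(theta);                // likelihood
}
"""

# Stan programs live in their own text file, conventionally with a .stan suffix.
stan_file = Path(tempfile.mkdtemp()) / "bernoulli.stan"
stan_file.write_text(STAN_PROGRAM)
print(f"Stan program written to {stan_file}")
```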

Markov chain Monte Carlo


  • Markov chain Monte Carlo methods can be used to generate samples from a posterior distribution.
  • Each candidate value of the chain is drawn from a proposal distribution.
  • Proposals that move toward regions of higher target density are accepted with higher probability.
  • MCMC convergence should always be monitored.
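
The points above can be illustrated with a random-walk Metropolis sampler, one simple MCMC method (Stan itself uses a more advanced algorithm). The sketch below is a toy under stated assumptions: the standard normal target, the step size, the starting points, the warm-up length and the basic (non-split) R-hat formula are all illustrative choices.

```python
import math
import random
import statistics


def log_target(x):
    """Unnormalized log density of the target (assumed: standard normal)."""
    return -0.5 * x * x


def metropolis(log_density, start, n_steps, step_size, rng):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    chain = [start]
    current, current_lp = start, log_density(start)
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        # Moves toward higher density are always accepted.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


def r_hat(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


rng = random.Random(1)
warmup = 1000
# Several chains from dispersed starting points; warm-up draws are discarded.
chains = [
    metropolis(log_target, start, n_steps=5000, step_size=1.0, rng=rng)[warmup:]
    for start in (-10.0, -3.0, 3.0, 10.0)
]
draws = [x for chain in chains for x in chain]
print(f"mean {statistics.fmean(draws):.2f}, "
      f"variance {statistics.variance(draws):.2f}, R-hat {r_hat(chains):.3f}")
```

An R-hat close to 1 suggests that the chains have mixed; values clearly above 1 are a signal to run longer or to revisit the model.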

Hierarchical models


  • Hierarchical models are appropriate for scenarios where the study population naturally divides into subgroups.
  • Hierarchical models borrow statistical strength across the population groups.
  • Population distributions hold information about the variation of the model parameters over the whole population.
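
One way to see how strength is borrowed across groups is the normal-normal model, where each group mean is drawn from a normal population distribution. The sketch below assumes the population mean and standard deviation are known, which a full hierarchical model would instead infer from the data; the group values and all names are hypothetical.

```python
def partial_pooling_mean(group_mean, group_se, pop_mean, pop_sd):
    """Posterior mean of a group parameter in a normal-normal model.

    The estimate is a precision-weighted average of the group's own
    sample mean and the population mean.
    """
    precision_data = 1.0 / group_se ** 2
    precision_pop = 1.0 / pop_sd ** 2
    return (
        (precision_data * group_mean + precision_pop * pop_mean)
        / (precision_data + precision_pop)
    )


# Hypothetical groups: (observed mean, standard error of that mean).
groups = {"A": (12.0, 1.0), "B": (12.0, 4.0), "C": (2.0, 4.0)}
pop_mean, pop_sd = 6.0, 3.0  # assumed known for this illustration

estimates = {
    name: partial_pooling_mean(mean, se, pop_mean, pop_sd)
    for name, (mean, se) in groups.items()
}
for name, estimate in estimates.items():
    print(f"group {name}: observed {groups[name][0]:.1f}, pooled estimate {estimate:.2f}")
```

Groups A and B have the same observed mean, but B is measured less precisely, so its estimate is pulled further toward the population mean.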

Model comparison


  • Bayesian model comparison can be performed (for example) with posterior predictive checks, information criteria, and cross-validation.
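
As a small illustration of cross-validation, the sketch below compares two models for binary data by exact leave-one-out predictive density. It relies on the closed-form posterior predictive of the beta-Bernoulli model, so no sampling is needed. The data and both models are hypothetical.

```python
import math


def loo_elpd_fixed(y, theta=0.5):
    """LOO log predictive density for a model with no free parameters."""
    return sum(math.log(theta if yi == 1 else 1.0 - theta) for yi in y)


def loo_elpd_beta_bernoulli(y, a=1.0, b=1.0):
    """Exact LOO log predictive density for a Bernoulli model with a Beta(a, b) prior."""
    n, successes = len(y), sum(y)
    total = 0.0
    for yi in y:
        # Posterior predictive for the held-out point given the other n - 1 points.
        p_one = (a + successes - yi) / (a + b + n - 1)
        total += math.log(p_one if yi == 1 else 1.0 - p_one)
    return total


# Hypothetical data: 8 successes and 2 failures.
y = [1] * 8 + [0] * 2
elpd_fixed = loo_elpd_fixed(y)
elpd_beta = loo_elpd_beta_bernoulli(y)
print(f"fixed theta = 0.5: {elpd_fixed:.2f}")
print(f"beta-Bernoulli:    {elpd_beta:.2f}")
```

The model with the higher value predicts held-out observations better. For models without a closed-form predictive distribution, the leave-one-out quantity is usually approximated from posterior samples instead.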

Gaussian processes


  • GPs provide a means for non-parametric regression.
  • A GP is specified by two functions: a mean function and a covariance function.
  • GPs can be used as part of more complex models.
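
To make the mean and covariance concrete, the sketch below draws one function from a GP prior at a handful of input points. The zero mean function, the squared exponential kernel, its hyperparameter values, the jitter term and the input grid are all assumptions chosen for illustration. A small pure-Python Cholesky factorization stands in for a linear algebra library.

```python
import math
import random


def mean_function(x):
    """Assumed prior mean: zero everywhere."""
    return 0.0


def squared_exponential(x1, x2, magnitude=1.0, length_scale=1.0):
    """Squared exponential covariance between two inputs."""
    return magnitude ** 2 * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def covariance_matrix(xs, kernel, jitter=1e-8):
    """Covariance matrix over the inputs, with a small diagonal jitter for stability."""
    n = len(xs)
    return [
        [kernel(xs[i], xs[j]) + (jitter if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


def cholesky(matrix):
    """Lower-triangular factor L with L L^T equal to the input matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_gp_prior(xs, rng):
    """One draw f = m(x) + L z, where z is a vector of standard normal values."""
    lower = cholesky(covariance_matrix(xs, squared_exponential))
    z = [rng.gauss(0.0, 1.0) for _ in xs]
    return [
        mean_function(x) + sum(lower[i][k] * z[k] for k in range(i + 1))
        for i, x in enumerate(xs)
    ]


xs = [float(i) for i in range(8)]
sample = sample_gp_prior(xs, random.Random(0))
print([round(value, 2) for value in sample])
```

A shorter `length_scale` gives draws that vary more quickly between neighbouring inputs, while `magnitude` controls their overall amplitude.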

Stan extensions


  • There are several R packages that provide more user-friendly ways of using Stan.
  • The brms package can be used to fit a wide range of Bayesian models.
  • The bayesplot package provides plotting tools for Bayesian models.
  • Approximate leave-one-out cross-validation can be performed with the loo package.

Exercises


  1. Basics
  2. Stan
  3. MCMC
  4. Hierarchical models
  5. Model comparison
  6. Gaussian processes
  7. Other topics