Short introduction to Bayesian statistics
- The likelihood gives the probability of the data conditional on the model parameters.
- The prior encodes beliefs about the model parameters before the data are considered.
- The posterior quantifies the probability of parameter values conditional on the data.
- The posterior is a compromise between the data and prior. The less data available, the greater the impact of the prior.
- The grid approximation is a method for inferring the (approximate) posterior distribution.
- Posterior information can be summarized with point estimates and posterior intervals.
- The marginal posterior is obtained by integrating out the nuisance parameters.
- Usually, Bayesian models are fitted using methods that generate samples from the posterior.
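The grid approximation and the posterior summaries listed above can be sketched in a few lines of Python. This is a minimal illustration, not a general-purpose implementation: the binomial model, the flat prior, the data (6 successes in 9 trials), and all function names are assumptions made for the example.

```python
from math import comb


def grid_posterior(successes, trials, grid_size=101):
    """Grid-approximate the posterior of a binomial success probability."""
    # Evenly spaced candidate parameter values on [0, 1].
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Flat prior (an assumption): every grid value is equally plausible a priori.
    prior = [1.0] * grid_size
    # Binomial likelihood of the observed data at each grid value.
    likelihood = [
        comb(trials, successes) * t**successes * (1 - t) ** (trials - successes)
        for t in grid
    ]
    # Posterior is proportional to likelihood times prior; normalize over the grid.
    unnormalized = [lik * pri for lik, pri in zip(likelihood, prior)]
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]


def central_interval(grid, posterior, mass=0.9):
    """Equal-tailed posterior interval read off the cumulative grid probabilities."""
    tail = (1 - mass) / 2
    cumulative = 0.0
    lower = upper = None
    for t, p in zip(grid, posterior):
        cumulative += p
        if lower is None and cumulative >= tail:
            lower = t
        if upper is None and cumulative >= 1 - tail:
            upper = t
    return lower, upper


# Hypothetical data: 6 successes in 9 trials.
grid, posterior = grid_posterior(successes=6, trials=9)

# Point estimates: posterior mean and the grid value with the highest posterior mass.
posterior_mean = sum(t * p for t, p in zip(grid, posterior))
map_estimate = grid[max(range(len(grid)), key=posterior.__getitem__)]

# 90% equal-tailed posterior interval.
lower, upper = central_interval(grid, posterior, mass=0.9)
```

With so little data, replacing the flat prior with a more informative one would shift the posterior noticeably, which illustrates the point about the prior's impact.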
Stan
- Stan is a tool for efficiently generating samples from the posterior distribution.
- A Stan program is specified in a separate text file that consists of code blocks, with the data, parameters, and model blocks being the most crucial ones.
Markov chain Monte Carlo
- Markov chain Monte Carlo methods can be used to generate samples from a posterior distribution.
- Candidate values for the chain are drawn from a proposal distribution.
- Proposals towards higher-density regions of the target distribution are accepted with higher probability.
- MCMC convergence should always be monitored.
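The points above can be illustrated with a random-walk Metropolis sampler, chosen here as one simple MCMC variant. This is a sketch under stated assumptions: the standard normal target, the step size, the warm-up length, and the function names are all hypothetical choices for the example, and the convergence check is a basic form of the potential scale reduction statistic computed across chains.

```python
import math
import random
import statistics


def log_target(theta):
    """Unnormalized log density of the target; a standard normal is assumed here."""
    return -0.5 * theta * theta


def metropolis(log_density, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis: propose a move, then accept or reject it."""
    rng = random.Random(seed)
    current = start
    current_logp = log_density(current)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        # Draw a candidate from a symmetric Gaussian proposal around the current value.
        proposal = current + rng.gauss(0.0, step)
        proposal_logp = log_density(proposal)
        # Moves to higher density are always accepted; moves to lower density
        # are accepted with probability target(proposal) / target(current).
        log_ratio = proposal_logp - current_logp
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, current_logp = proposal, proposal_logp
            accepted += 1
        samples.append(current)
    return samples, accepted / n_samples


def r_hat(chains):
    """Basic potential scale reduction statistic; values near 1 suggest convergence."""
    n = len(chains[0])
    chain_means = [statistics.mean(chain) for chain in chains]
    within = statistics.mean([statistics.variance(chain) for chain in chains])
    between = n * statistics.variance(chain_means)
    pooled_variance = (n - 1) / n * within + between / n
    return math.sqrt(pooled_variance / within)


# Run several chains from dispersed starting points and discard the warm-up draws.
starts = [-5.0, 0.0, 5.0]
warmup = 1000
chains = []
for seed, start in enumerate(starts):
    samples, acceptance_rate = metropolis(
        log_target, start, n_samples=5000, step=2.0, seed=seed
    )
    chains.append(samples[warmup:])

r_hat_value = r_hat(chains)
pooled_draws = [draw for chain in chains for draw in chain]
```

Running multiple chains from different starting values is what makes the comparison of within-chain and between-chain variance informative; a single chain cannot reveal that it is stuck in one region.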
Hierarchical models
- Hierarchical models are appropriate for scenarios where the study population naturally divides into subgroups.
- Hierarchical models borrow statistical strength across the population groups.
- Population distributions hold information about the variation of the model parameters over the whole population.
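Borrowing of strength can be illustrated with the normal-normal case, where the posterior mean of a group-level parameter is a precision-weighted average of the group's own sample mean and the population mean. The sketch below assumes, purely for illustration, that the population mean `mu`, the population standard deviation `tau`, and the observation noise `sigma` are known; in a full hierarchical model these would be given priors and inferred from the data. The group data are hypothetical.

```python
def partial_pooling_mean(group_mean, n_obs, sigma, mu, tau):
    """Posterior mean of a group parameter in a normal-normal model.

    Assumes known observation noise `sigma`, population mean `mu`,
    and population standard deviation `tau`.
    """
    data_precision = n_obs / sigma**2  # information from the group's own data
    population_precision = 1 / tau**2  # information from the population distribution
    return (data_precision * group_mean + population_precision * mu) / (
        data_precision + population_precision
    )


# Hypothetical groups with the same sample mean but very different sample sizes.
groups = {"large_group": (2.0, 50), "small_group": (2.0, 2)}

estimates = {
    name: partial_pooling_mean(mean, n, sigma=1.0, mu=0.0, tau=1.0)
    for name, (mean, n) in groups.items()
}
```

The small group's estimate is pulled much further towards the population mean than the large group's, because its own data carry less information.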
Model comparison
- Bayesian model comparison can be performed (for example) with posterior predictive checks, information criteria, and cross-validation.
Gaussian processes
- GPs provide a means for non-parametric regression.
- A GP is specified by two functions: a mean function and a covariance function.
- GPs can be used as part of more complex models.
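A minimal sketch of GP regression shows how the mean and covariance functions determine predictions. Everything specific here is an assumption for the example: a constant zero mean, a squared exponential covariance with fixed magnitude and length scale, a small fixed noise level, and noise-free sine data as the training set. The linear system is solved with plain Gaussian elimination to keep the example dependency-free.

```python
import math


def sq_exp_kernel(x1, x2, magnitude=1.0, length_scale=1.0):
    """Squared exponential covariance between two scalar inputs."""
    return magnitude**2 * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def gp_posterior_mean(x_train, y_train, x_new, noise_sd=0.01, mean=0.0):
    """Posterior predictive mean of a GP with constant mean and Gaussian noise."""
    # Covariance of the training outputs: kernel matrix plus noise on the diagonal.
    cov = [
        [sq_exp_kernel(a, b) + (noise_sd**2 if i == j else 0.0)
         for j, b in enumerate(x_train)]
        for i, a in enumerate(x_train)
    ]
    # Weights solve cov @ w = (y - mean); predictions are kernel-weighted sums.
    weights = solve(cov, [y - mean for y in y_train])
    return [
        mean + sum(sq_exp_kernel(xs, xt) * w for xt, w in zip(x_train, weights))
        for xs in x_new
    ]


# Hypothetical training data: a sine curve observed at five inputs.
x_train = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_train = [math.sin(x) for x in x_train]

# Predict between the training points and far away from them.
predictions = gp_posterior_mean(x_train, y_train, [0.5, 100.0])
```

Far from the training inputs the covariance with the data vanishes, so the prediction reverts to the mean function; close to the data, the covariance function controls how smoothly the fit interpolates.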
Stan extensions
- There are several R packages that provide more user-friendly ways of using Stan.
- The brms package can be used to fit a vast array of different Bayesian models.
- The bayesplot package is a library for various plotting tools.
- Approximate leave-one-out cross-validation can be performed with the loo package.