In our moose sighting example, we could determine the posterior distribution using calculus.
In many cases, however, there will be no closed-form solution to:
\[p(\theta | y) = \frac{L(y | \theta)\pi(\theta)}{\int_{-\infty}^{\infty}L(y | \theta)\pi(\theta)\,d\theta}\]
MCMC (Markov chain Monte Carlo) is a way to draw a sequence of random variables that converges in distribution to \(p(\theta | y)\)
Goal: generate samples that we can use to summarize the posterior distribution, \(p(\theta | y)\)
There are a variety of MCMC algorithms and samplers. We are going to consider one approach to gain some general insight into how these methods work.
\[p(\theta | y) = \frac{L(y | \theta)\pi(\theta)}{\int_{-\infty}^{\infty}L(y | \theta)\pi(\theta)\,d\theta}\]
Consider two possible values of \(\theta\), \(\theta_1\) and \(\theta_2\). Without the denominator, we cannot evaluate \(p(\theta_1 | y)\) or \(p(\theta_2 | y)\).
We can, however, evaluate the relative posterior support for \(\theta_1\) and \(\theta_2\), because the unknown denominator cancels:
\[R = \frac{p(\theta_2 | y)}{p(\theta_1 | y)} = \frac{L(y | \theta_2)\pi(\theta_2)}{L(y | \theta_1)\pi(\theta_1)}\]
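For instance (an entirely made-up example, not necessarily the moose model's actual likelihood and prior): with Poisson counts and a Gamma prior on \(\theta\), the intractable denominator cancels from \(R\), so the ratio can be computed directly:

```r
# Hypothetical example: Poisson counts y with a Gamma(2, 1) prior on theta.
# The normalizing constant cancels, so the unnormalized posterior suffices.
y <- c(3, 1, 4, 2)                       # made-up count data
unnorm_post <- function(theta) {
  prod(dpois(y, lambda = theta)) * dgamma(theta, shape = 2, rate = 1)
}
R <- unnorm_post(3.0) / unnorm_post(1.5) # relative support for theta = 3 vs 1.5
R
```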
Step 1: Initiate the Markov chain with an initial starting value, \(\theta_0\).
Step 2: Generate a new, proposed value of \(\theta\) from a symmetric distribution centered on \(\theta_0\) (e.g., \(\theta^{\prime} = \theta_0 + \) rnorm(1, mean = 0, sd = 1)).
Step 3: Decide whether to accept or reject \(\theta^{\prime}\):
If \(R = \frac{p(\theta^{\prime} | y)}{p(\theta_0 | y)} > 1\): accept and set \(\theta_1 = \theta^{\prime}\).
If \(R < 1\): accept \(\theta^{\prime}\) with probability \(R\); otherwise (i.e., if rejected), set \(\theta_1 = \theta_0\).
Step 4: Return to step 2 and repeat (a minimal sampler implementing these steps is sketched below).
Continue to sample until:
The distribution of \(\theta_1, \theta_2, \ldots, \theta_M\) appears to have reached a steady state (i.e., the chain has converged).
The MCMC sample, \(\theta_1, \theta_2, \ldots, \theta_M\), is sufficiently large to summarize \(p(\theta | y)\).
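Putting the steps together, here is a minimal random-walk Metropolis sketch in R, continuing the made-up Poisson/Gamma example above; the proposal standard deviation, starting value, and number of iterations are arbitrary illustrative choices:

```r
# A minimal random-walk Metropolis sketch (toy Poisson/Gamma example;
# all tuning choices below are arbitrary assumptions).
set.seed(1)
y <- c(3, 1, 4, 2)
log_unnorm_post <- function(theta) {
  if (theta <= 0) return(-Inf)           # theta must be positive
  sum(dpois(y, lambda = theta, log = TRUE)) +
    dgamma(theta, shape = 2, rate = 1, log = TRUE)
}
M <- 5000
theta <- numeric(M)
theta[1] <- 1                                          # step 1: initial value
for (m in 2:M) {
  prop <- theta[m - 1] + rnorm(1, mean = 0, sd = 0.5)  # step 2: symmetric proposal
  logR <- log_unnorm_post(prop) - log_unnorm_post(theta[m - 1])
  # steps 3-4: accept with probability min(1, R); otherwise keep current value
  theta[m] <- if (log(runif(1)) < logR) prop else theta[m - 1]
}
hist(theta[-(1:1000)])   # drop burn-in, then summarize the posterior draws
```

Working on the log scale avoids numerical underflow when the likelihood is a product of many small densities.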
There are no foolproof methods for detecting convergence. Some things that we can and will do:
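For example: run multiple chains from different starting values and compare them, both visually (trace plots) and numerically (the Gelman-Rubin statistic, effective sample size). A sketch with the coda package, where theta and theta2 are assumed to hold the draws from two such chains (e.g., the sampler above rerun with a different starting value):

```r
# Sketch: standard convergence checks with the coda package. Assumes theta
# and theta2 are vectors of draws from two chains started at different values.
library(coda)
chains <- mcmc.list(mcmc(theta), mcmc(theta2))
traceplot(chains)       # overlapping, "fuzzy caterpillar" traces suggest convergence
gelman.diag(chains)     # Gelman-Rubin statistic; values near 1 are reassuring
effectiveSize(chains)   # effective number of (roughly) independent draws
```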
JAGS will attempt to determine how best to sample once we give it a likelihood and a set of prior distributions (one for each parameter).
Steps:
1. Write the model (the likelihood and a prior for each parameter) in the JAGS language.
2. Bundle the data, choose initial values, and choose which parameters to monitor.
3. Run the MCMC, check convergence, and then summarize the posterior draws.
Mandible lengths in mm:
Do males and females have, on average, different mandible lengths?
\[H_0: \mu_m = \mu_f \text{ versus } H_a: \mu_m \neq \mu_f\]
Likelihood: \(\text{males}_i \sim N(\mu_m, \sigma^2)\), \(i = 1, \ldots, n_m\); \(\text{females}_j \sim N(\mu_f, \sigma^2)\), \(j = 1, \ldots, n_f\) (assuming a common \(\sigma\)).
Priors: vague priors on \(\mu_m\) and \(\mu_f\) (e.g., normal with very large variance) and on \(\sigma\) (e.g., uniform over a wide range).
Notes:
Good question. I tried to make sure:
It is a good idea to check whether:
JAGS/BUGS (hereafter JAGS) code looks just like R code, but with some differences:
There are 6 types of objects:
Modeled data: defined with a \(\sim\) (“distributed as”). For example, y \(\sim\) followed by a probability distribution. The variable y here is the response in our regression model.
Unmodeled data: objects that are not assigned probability distributions. Examples include predictors, constants, and index variables.
Modeled parameters: these are given informative “priors” that themselves depend on parameters called hyperparameters. These are what a frequentist would call random effects. We won’t consider these until later in the course.
Unmodeled parameters: these are given uninformative priors. [So in truth all parameters are modeled].
Derived quantities: these objects are typically defined with the assignment arrow, <-
Looping indexes: i, j, etc.
Types of objects for JAW example
Modeled data = males, females
Unmodeled data = nmales, nfemales
Modeled parameters (none in this example)
Unmodeled parameters = mu.male, mu.female, sigma
Derived quantities = tau, mu.diff
Looping indexes: i (used twice)
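Here is a sketch of what the jaw model might look like in the JAGS language (held in an R string), with the object types above labeled in comments. The specific priors are assumptions for illustration; 12-BayesMCMC.R may use different ones.

```r
# Sketch of the jaw model in the JAGS language, stored as an R string.
# The priors (and the upper bound on sigma) are illustrative assumptions.
jaw_model <- "
model {
  # Unmodeled parameters: vague priors
  mu.male   ~ dnorm(0, 0.0001)
  mu.female ~ dnorm(0, 0.0001)
  sigma     ~ dunif(0, 100)

  # Derived quantities: defined with the assignment arrow
  tau     <- 1 / (sigma * sigma)   # JAGS's dnorm uses precision, not variance
  mu.diff <- mu.male - mu.female

  # Modeled data (males, females); unmodeled data (nmales, nfemales)
  for (i in 1:nmales) {            # looping index i
    males[i] ~ dnorm(mu.male, tau)
  }
  for (i in 1:nfemales) {          # looping index i, used a second time
    females[i] ~ dnorm(mu.female, tau)
  }
}
"
```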
Start simple, then build up.
Lots of good tricks and tips in the Appendix of Kéry’s Introduction to WinBUGS for Ecologists, especially:
Numbers 2, 3, 4, 9, 11, 12 (use %T% in JAGS), 14, 16, 17, 20, 23, 24, 25, 26, 27
Googling error messages is often useful for diagnosing problems.
Work with 12-BayesMCMC.R to fit your first model in JAGS
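For orientation, a minimal sketch of what the fitting code could look like with the rjags package; the file name, data objects, and MCMC settings are assumptions, and 12-BayesMCMC.R may organize things differently:

```r
# A minimal sketch of fitting the jaw model with the rjags package.
# Assumes: jaw_model is the model string from the sketch above, and
# males/females are numeric vectors of mandible lengths (in mm).
library(rjags)

writeLines(jaw_model, "jaw_model.txt")   # JAGS reads the model from a file
jags_data <- list(males = males, females = females,
                  nmales = length(males), nfemales = length(females))

jm <- jags.model("jaw_model.txt", data = jags_data,
                 n.chains = 3, n.adapt = 1000)
update(jm, n.iter = 1000)                # burn-in
post <- coda.samples(jm,
                     variable.names = c("mu.male", "mu.female",
                                        "sigma", "mu.diff"),
                     n.iter = 5000)
summary(post)
plot(post)                               # trace plots and posterior densities
```

Running 3 chains makes the convergence checks above (trace plots, Gelman-Rubin statistic) directly applicable to the returned samples.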