This document describes possible Bayesian approaches to fitting some of the models discussed in detail in the book, using existing functions within R. For an excellent practical resource on these Bayesian multilevel modeling approaches, readers should also consider Finch et al. (2014).

1. CHAPTER 3: THE RAT PUP DATA

First, we consider the Rat Pup data from Chapter 3. We assume that the data have already been read into an R data frame object. The existing function that implements MCMC methods for Bayesian analysis of multilevel models is MCMCglmm(), which is located within the MCMCglmm package. We first install (from a CRAN mirror) and load this package in R:

library(MCMCglmm)

We then fit the final model from Chapter 3 as follows, using the default non-informative MCMCglmm prior distributions for each of the possible parameters in an LMM [see help(MCMCglmm)]:

ratpup$sex1[ratpup$sex == "Female"] <- 1
ratpup$sex1[ratpup$sex == "Male"] <- 0
ratpup$trtgrp[ratpup$treatment == "Control"] <- 1
ratpup$trtgrp[ratpup$treatment == "Low" | ratpup$treatment == "High"] <- 2

model3.3.bayes <- MCMCglmm(weight ~ sex1 + litsize + treatment, random = ~litter, data = ratpup, rcov = ~idh(treatment):units)

we note that this model is slightly different from the final model in Chapter 3, in that heterogeneous error variances are defined by the three-category treatment factor, rather than the recoded trtgrp factor. When using the idh() function in MCMCglmm to specify heterogeneous error variances as a function of a factor, the levels of that factor need to be consistent with a fixed factor in the model.

By default, MCMCglmm uses 13,000 MCMC iterations, resulting in several thousand (10,000) simulated draws of the parameters from their posterior distributions based on the specified model and the non-informative prior distributions. A default of 3,000 burn-in draws are used for the MCMC algorithm. We can plot the post burn-in draws of these parameters using a simple plot function:

plot(model3.3.bayes)

Users can hit ENTER to cycle through the plots, which show the sequence of draws for each parameter (which should stabilize over iterations, indicating convergence of the sequence to random draws around an overall mean) and the density function defined by all of the posterior draws of each parameter. The density functions provide an initial sense of whether the distribution of the draws of a given parameter from the posterior distribution for all of the parameters in the model includes 0 as a plausible value.

The default "thinning" rate for MCMCglmm, indicating the rate at which posterior draws will be sampled to generate posterior inferences, is 10 (i.e., every 10th draw from the algorithm is actually used to compute credible sets and medians of the posterior distribution). We need to make sure that these draws have zero autocorrelation (or something close to it) for a given thinning rate (or lag). The MCMCglmm package provides a convenient function for assessing these autocorrelations at different choices of the thinning parameter:

autocorr(model3.3.bayes$VCV) 

Assessment of these autocorrelations for the Rat Pup example at lag 10 suggests no major issues with this rate of thinning, and that posterior inferences based on every 10th draw (i.e., 1,000 of the simulated draws) would be reasonable.

To generate inferences, we can apply the summary function to the object generated by MCMCglmm():

summary(model3.3.bayes)

 Iterations = 3001:12991
 Thinning interval  = 10
 Sample size  = 1000 

 DIC: 313.9606 

 G-structure:  ~litter

       post.mean l-95% CI u-95% CI eff.samp
litter    0.1066  0.04921   0.1813      884

 R-structure:  ~idh(treatment):units

              post.mean l-95% CI u-95% CI eff.samp
Control.units    0.2694  0.20757   0.3388     1000
High.units       0.1116  0.07475   0.1586     1000
Low.units        0.0861  0.06453   0.1080     1000

 Location effects: weight ~ sex1 + litsize + treatment 

              post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)     8.32723  7.78502  8.91543     1000 <0.001 ***
sex1           -0.34294 -0.43046 -0.26315     1000 <0.001 ***
litsize        -0.13026 -0.16666 -0.09339     1000 <0.001 ***
treatmentHigh  -0.85954 -1.23652 -0.47429     1000 <0.001 ***
treatmentLow   -0.43554 -0.72088 -0.12940     1000  0.006 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


The DIC criterion can be used to compare competing models, similar to the AIC and BIC criteria. We see means of the 1,000 draws of each parameter from the posterior distribution, along with 95% credible sets for the parameters. The Bayesian approach is especially convenient for inference about variance components, where we can say, for example, that 95% of values of the variance of the random litter effects fall between 0.049 and 0.181. The key difference of the Bayesian approach is that parameters are considered random, and inferences are based on updating prior beliefs about the distributions of these random parameters. Here we see sufficient evidence that the distribution of the variance of the random litter effects does not include zero. we also note the same differences in the error variances that were found in Chapter 3, with the control group having noticeably higher error variance. Finally, considering the fixed effects, we arrive at the same general conclusions found in Chapter 3.

Finally, if a researcher wishes to use informative priors for the parameters in a model, rather than relying on the default non-informative prior distributions, the prior argument in MCMCglmm() can be used. The prior argument can be an object defined by up to three lists: the mean vector and variance-covariance matrix for the normally distributed fixed effects (B), and the expected covariances and "degree of belief" parameter (nu) for the inverse-Wishart distribution, for variance components. For example, if prior analyses have suggested that the intercept and treatment effect in a model with a single indicator for treatment are 10 and 5, with variances 1 and 1, respectively (and no covariance), the object for the prior argument can be defined as follows:

var <- matrix(c(1,0,0,1), nrow = 2, ncol = 2)
prior.dist <- list(B = list(mu = c(10,5), V = var))
model.example <- MCMCglmm(dv ~ treatment, random = ~cluster, data = example, prior = prior.dist)

Many other examples of specified prior distributions for MCMCglmm() can be found online. If sample sizes are large, the contributions of the likelihood function to the posterior distribution will generally dominate the contributions of informative priors, making the specification of informative priors unnecessary. Informative priors may play an important role in increasing the efficiency of inferences for smaller samples (IF prior data are available).

2. CHAPTER 4: THE CLASSROOM DATA

[FORTHCOMING]


REFERENCES

Finch, W.H., Bolin, J.E., and Kelley, K. (2014). Multilevel Modeling using R. Chapman & Hall / CRC Press.