An Introduction to Bayesian VAR (BVAR) Models

with tags r bvar var bayes bayesian-var bvartools gibbs-sampler -

Bayesian methods have significantly gained in popularity during the last decades as computers have become more powerful and new software has been developed. Their flexibility and other advantageous features have made these methods also more popular in econometrics. This post gives a brief introduction to Bayesian VAR (BVAR) models and provides the code to set up and estimate a basic model with the bvartools package.

BVAR models

Bayesian VAR (BVAR) models have the same mathematical form as any other VAR model, i.e.

\[ y_t = c + \sum_{l=i}^{p} A_i y_{t-i} + \epsilon_t,\] where \(y_t\) is a \(K \times 1\) vector of endogenous variables in period \(t\), \(A_i\) is the cofficient matrix corresponding to the \(i\)th lag of \(y_t\), \(c\) is a constant deterministic term and \(\epsilon\) is an error term with zero mean and variance-covariance \(\Sigma\).

The only difference between usual VAR models and BVAR models is the way parameter estimates are obtained and interpreted. VAR models are usually estimated by OLS, which is a simple and computationally fast estimator. By contrast, Bayesian estimators are slightly more complicated and more burdensome in terms of algebra and calculation power. The coefficients obtained by so-called frequentist estimators like OLS are interpreted based on the concept of the sampling distribution. In Bayesian inference, the coefficients are assumed to have their own distribution. A more detailed treatment of the difference between frequentist and Bayesian inference can be found in Kennedy (2008, ch. 14), which provides a short introduction to the Bayesian approach and a series of references for interested readers. Koop and Korobilis (2010) provide a very good introduction to Bayesian VAR estimators.

As already mentioned, Bayesian inference can be algebraically demanding. However, Bayesian estimators for linear VAR models can be implemented in a straightforward manner. A standard implementation is a so-called Gibbs sampler, which belongs to the family of Markov-Chain-Monte-Carlo (MCMC) methods. A detailed treatment of this method is beyond the scope of this post, but Wikipedia might be a good start to become familiar with it. Personally, I like to think of the Gibbs sampler as throwing a bunch of random numbers at a model and see what sticks. The remainer of this text provides the code to set up and estimate a basic BVAR model with the bvartools package.

Model and data

For this illustration the dataset E1 from Lütkepohl (2007) is used. It contains data on West German fixed investment, disposable income and consumption expenditures in billions of DM from 1960Q1 to 1982Q4. Following Lütkepohl (2007) the VAR model has two lags, i.e. \(p = 2\) and only the first 73 observations are used for inference.


data("e1") # Load the data
e1 <- diff(log(e1)) # Calculate first-log-differences

plot(e1) # Plot the series

To assist with the set-up of the model the gen_var function produces the inputs y and x for the estimator, where y is a matrix of dependent variables and x is the matrix of regressors for the model

\[y_t = A x_t + u_t,\] with \(u_t \sim N(0, \Sigma)\). This is a more compact form of the model above, where the lags of the endogenous variables and the constant are already included in \(x_t\).

data <- gen_var(e1, p = 2, deterministic = "const")

y <- data$Y[, 1:73]
x <- data$Z[, 1:73]


Frequentist estimator

We calculate frequentist VAR estimates using the standard formula \(y x' (x x')^{-1}\) to obtain a benchmark for the Bayesian estimator. The parameters are obtained by OLS:

A_freq <- tcrossprod(y, x) %*% solve(tcrossprod(x)) # Calculate estimates
round(A_freq, 3) # Round estimates and print
##        invest.1 income.1 cons.1 invest.2 income.2 cons.2  const
## invest   -0.320    0.146  0.961   -0.161    0.115  0.934 -0.017
## income    0.044   -0.153  0.289    0.050    0.019 -0.010  0.016
## cons     -0.002    0.225 -0.264    0.034    0.355 -0.022  0.013

And \(\Sigma\) is calculated by

u_freq <- y - A_freq %*% x
u_sigma_freq <- tcrossprod(u_freq) / (ncol(y) - nrow(x))
round(u_sigma_freq * 10^4, 2)
##        invest income cons
## invest  21.30   0.72 1.23
## income   0.72   1.37 0.61
## cons     1.23   0.61 0.89

These are the same values as in Lütkepohl (2007).

Bayesian estimator

The following code is a Gibbs sampler for a simple VAR model with non-informative priors.

# Reset random number generator for reproducibility

iter <- 30000 # Number of iterations of the Gibbs sampler
burnin <- 15000 # Number of burn-in draws
store <- iter - burnin

t <- ncol(y) # Number of observations
k <- nrow(y) # Number of endogenous variables
m <- k * nrow(x) # Number of estimated coefficients

# Set (uninformative) priors
a_mu_prior <- matrix(0, m) # Vector of prior parameter means
a_v_i_prior <- diag(0, m) # Inverse of the prior covariance matrix

u_sigma_df_prior <- 0 # Prior degrees of freedom
u_sigma_scale_prior <- diag(0, k) # Prior covariance matrix
u_sigma_df_post <- t + u_sigma_df_prior # Posterior degrees of freedom

# Initial values
u_sigma_i <- diag(.00001, k)
u_sigma <- solve(u_sigma_i)

# Data containers for posterior draws
draws_a <- matrix(NA, m, store)
draws_sigma <- matrix(NA, k^2, store)

# Start Gibbs sampler
for (draw in 1:iter) {
  # Draw conditional mean parameters
  a <- post_normal(y, x, u_sigma_i, a_mu_prior, a_v_i_prior)

  # Draw variance-covariance matrix
  u <- y - matrix(a, k) %*% x # Obtain residuals
  u_sigma_scale_post <- solve(u_sigma_scale_prior + tcrossprod(u))
  u_sigma_i <- matrix(rWishart(1, u_sigma_df_post, u_sigma_scale_post)[,, 1], k)
  u_sigma <- solve(u_sigma_i) # Invert Sigma_i to obtain Sigma

  # Store draws
  if (draw > burnin) {
    draws_a[, draw - burnin] <- a
    draws_sigma[, draw - burnin] <- u_sigma

After the Gibbs sampler has finished, point estimates for the coefficient matrix can be obtained as, e.g., the mean of the posterior draws:

A <- rowMeans(draws_a) # Obtain means for every row
A <- matrix(A, k) # Transform mean vector into a matrix
A <- round(A, 3) # Round values
dimnames(A) <- list(dimnames(y)[[1]], dimnames(x)[[1]]) # Rename matrix dimensions

A # Print
##        invest.1 income.1 cons.1 invest.2 income.2 cons.2  const
## invest   -0.318    0.152  0.960   -0.161    0.116  0.930 -0.017
## income    0.044   -0.152  0.287    0.050    0.020 -0.012  0.016
## cons     -0.002    0.225 -0.266    0.034    0.356 -0.023  0.013

Point estimates for the covariance matrix can also be obtained by calculating the means of the posterior draws.

Sigma <- rowMeans(draws_sigma) # Obtain means for every row
Sigma <- matrix(Sigma, k) # Transform mean vector into a matrix
Sigma <- round(Sigma * 10^4, 2) # Round values
dimnames(Sigma) <- list(dimnames(y)[[1]], dimnames(y)[[1]]) # Rename matrix dimensions

Sigma # Print
##        invest income cons
## invest  22.71   0.76 1.31
## income   0.76   1.46 0.65
## cons     1.31   0.65 0.95

The means of the coefficient draws are very close to the results of the frequentist estimatior, which would be expected with non-informative priors.

bvar objects

The bvar function can be used to collect relevant output of the Gibbs sampler into a standardised object, which can be used by further functions such as predict to obtain forecasts or irf for impulse respons analysis.

bvar_est <- bvar(y = y, x = x, A = draws_a[1:18,],
                 C = draws_a[19:21, ], Sigma = draws_sigma)

Posterior draws can be thinned with the function thin:

bvar_est <- thin(bvar_est, thin = 15)


Forecasts with credible bands can be obtained with the function predict. If the model contains deterministic terms, new values can be provided in the argument new_D. If no values are provided, the function sets them to zero. The number of rows of new_D must be the same as the argument n.ahead.

bvar_pred <- predict(bvar_est, n.ahead = 10, new_D = rep(1, 10))


Impulse response analysis

Currently, bvartools supports forecast error, orthogonalised, and generalised impulse response functions.

Forecast error impulse response

FEIR <- irf(bvar_est, impulse = "income", response = "cons", n.ahead = 8)

plot(FEIR, main = "Forecast Error Impulse Response", xlab = "Period", ylab = "Response")

Orthogonalised impulse response

OIR <- irf(bvar_est, impulse = "income", response = "cons", n.ahead = 8, type = "oir")

plot(OIR, main = "Orthogonalised Impulse Response", xlab = "Period", ylab = "Response")

Generalised impulse response

GIR <- irf(bvar_est, impulse = "income", response = "cons", n.ahead = 8, type = "gir")

plot(GIR, main = "Generalised Impulse Response", xlab = "Period", ylab = "Response")

Forecast error variance decomposition

Default forecast error variance decomposition (FEVD) is based on orthogonalised impulse responses (OIR).

bvar_fevd_oir <- fevd(bvar_est, response = "cons")

plot(bvar_fevd_oir, main = "OIR-based FEVD of consumption")

It is also possible to calculate FEVDs, which are based on generalised impulse responses (GIR). Note that these do not automatically add up to unity.

bvar_fevd_gir <- fevd(bvar_est, response = "cons", type = "gir")

plot(bvar_fevd_gir, main = "GIR-based FEVD of consumption")


Koop, G., & Korobilis, D. (2010). Bayesian multivariate time series methods for empirical macroeconomics. Foundations and trends in econometrics, 3(4), 267-358.

Koop, G., Pesaran, M. H., & Potter, S.M. (1996). Impulse response analysis in nonlinear multivariate models. Journal of Econometrics 74(1), 119-147.

Lütkepohl, H. (2007). New introduction to multiple time series analysis (2nd ed.). Berlin: Springer.

Kennedy, P. (2008). A guide to econometrics (6th ed.) Malden, MA: Blackwell.

Pesaran, H. H., & Shin, Y. (1998). Generalized impulse response analysis in linear multivariate models. Economics Letters, 58(1), 17-29.