Causality in Structural Vector Autoregressions: Science or Sorcery?

Links: Paper, Raw Data, Processed Data (CSV, dta), Stata Code, R Code

The structural vector autoregression (SVAR) model has become the staple method for generating causal estimates from time series, but many applied economists remain skeptical of it. This paper aims to demystify SVARs for applied microeconomists. First, we show a close connection between SVARs and the linear instrumental variables (IV) model. Then, we present an SVAR analysis of global supply and demand of agricultural commodities.

SVAR vs IV
The top panel shows how a yield shock affects both price and quantity in an SVAR.  The bottom panel shows how an instrumental variable (yield) can be used to identify the slope of the demand curve.  The slope of the demand curve from IV equals the ratio of the price and quantity responses in the SVAR.

Serial correlation complicates causal inference because it implies that treatments and responses persist for multiple periods. If a serially correlated treatment variable jumps above its mean one period and remains above the mean for several periods, then we expect economic agents to respond as though they received a single treatment that lasted multiple periods rather than a sequence of independent treatments. Put differently, we expect them to respond to the treatment path. In addition to the treatment potentially lasting for multiple periods, the responses to treatment may also play out over multiple periods.  

For example, in response to a crop price increase (treatment), farmers may convert pasture to cropland if they expect prices to remain high for a long period, but they will not do so if they expect the price increase to be short-lived. Thus, the response of agricultural supply to price varies depending on the persistence of the price change. Moreover, some producers may respond to a persistent price change by converting land immediately; others will wait and convert later.

The impulse response function in an SVAR estimates the dynamic responses to treatment paths. We show that, under certain conditions, the population analogue of the Wald IV estimator is identical to a ratio of two impulse responses from an SVAR (see the SVAR vs IV figure).
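The equivalence between the Wald IV estimator and a ratio of impulse responses can be illustrated in a static special case with no lags, where the impact response to a yield shock is simply a regression slope. The sketch below is not the paper's code: the slopes `beta` and `gamma`, the sample size, and the simulated shocks are all hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta, gamma = -0.5, 0.2   # hypothetical demand and supply slopes (dq/dp)

z = rng.normal(size=n)    # yield (supply) shock, used as the instrument
u = rng.normal(size=n)    # demand shock, independent of z by construction

# Market equilibrium of demand q = beta*p + u and supply q = gamma*p + z
p = (z - u) / (beta - gamma)
q = beta * p + u

def slope(y, x):
    """OLS slope from regressing y on x (with an intercept)."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]

# Impact responses of price and quantity to a unit yield shock
resp_p = slope(p, z)
resp_q = slope(q, z)

# Wald IV estimate of the demand slope, and the ratio of the two responses
wald_iv = np.cov(q, z)[0, 1] / np.cov(p, z)[0, 1]
irf_ratio = resp_q / resp_p

print(wald_iv, irf_ratio)
```

In this static setting the two numbers coincide by construction, because both divide the same covariances, and both approximate `beta` when the instrument is independent of the demand shock. The dynamic case with serially correlated variables requires the additional conditions discussed in the paper.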

To illustrate the framework, we present an SVAR analysis of global demand and supply of agricultural commodities.  Following Roberts and Schlenker (2013), the variables in our SVAR are calorie-weighted aggregates of global yield, acreage, inventory and price for corn, wheat, rice and soybeans, which constitute about 75% of calories consumed by humans. 

Our SVAR identification strategy exploits the natural sequence of events in the agricultural growing season: Farmers plant crops at the beginning of the growing season, then weather events affect yields, which subsequently influence wholesale traders' inventory decisions and result in an equilibrium price.  From the four observed variables, our SVAR extracts two supply shocks and two demand shocks.  
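One common way to encode a sequence-of-events argument like this is a recursive (lower-triangular) ordering, in which each variable responds within the period only to shocks that come earlier in the ordering. The sketch below assumes that reading and should not be taken as the paper's exact specification: the ordering acreage, yield, inventory, price, the VAR(1) lag length, and every coefficient in `A` and `B` are hypothetical. It simulates data, estimates the reduced form by OLS, and recovers the impact matrix with a Cholesky factorization.

```python
import numpy as np

rng = np.random.default_rng(1)
names = ["acreage", "yield", "inventory", "price"]  # assumed recursive ordering
k, T = 4, 20_000

A = 0.5 * np.eye(k)  # hypothetical VAR(1) coefficient matrix
B = np.array([       # hypothetical lower-triangular impact matrix
    [1.0, 0.0, 0.0, 0.0],
    [0.2, 1.0, 0.0, 0.0],
    [0.3, 0.5, 1.0, 0.0],
    [-0.4, -0.8, -0.6, 1.0],
])

# Simulate y_t = A y_{t-1} + B eps_t with unit-variance structural shocks
eps = rng.normal(size=(T, k))
y = np.zeros((T, k))
for t in range(1, T):
    y[t] = A @ y[t - 1] + B @ eps[t]

# Reduced-form VAR(1) by OLS: Y ≈ X @ coef, so the coefficient matrix is coef.T
X, Y = y[:-1], y[1:]
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
A_hat = coef.T
resid = Y - X @ coef
sigma = resid.T @ resid / (len(Y) - k)

# Recursive identification: lower-triangular Cholesky factor of the
# residual covariance serves as the estimated impact matrix
B_hat = np.linalg.cholesky(sigma)

def irf(A, B, horizons):
    """Stack responses; out[h][i, j] is variable i's response to shock j at horizon h."""
    out = [B]
    for _ in range(horizons):
        out.append(A @ out[-1])
    return np.stack(out)

responses = irf(A_hat, B_hat, horizons=5)
print(np.round(responses[0], 2))
```

Under this ordering, column `j` of `responses[h]` traces how all four variables respond `h` periods after shock `j`, which is the object an impulse response plot displays.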

We plot the impulse response functions below. The second row shows responses to yield shocks; this is the same variation that Roberts and Schlenker (2013) use to identify supply and demand elasticities in their static IV model. 

The impulse responses show that the yield shocks are short-lived, which raises the concern that the IV estimates of demand elasticity may not reflect consumer response to long-lived shocks such as those caused by climate change or changes in government policy. This concern is alleviated, however, by the similarity between the demand elasticities identified from weather shocks and those identified from the longer-lived acreage shock (the estimates are reported in the paper). This similarity suggests that consumer response is not affected by the horizon of the shock. On the other hand, our estimated supply elasticities do vary with the persistence of the shocks used to identify them, because producers may respond to long-lived shocks by making capital investments to increase production. They are less likely to make such investments if a price shock is expected to last only a single year.

Our main points carry over to different identification schemes, model specifications, and estimators. Time series settings typically contain multiple continuous variables that are serially correlated and potentially mutually dependent. Causal analysis of such data requires the analyst to consider the persistence of the "treatments" (i.e., identify treatment paths) and to estimate the dynamic effects of these treatments. These points also extend to panel data settings, especially those with a long time series dimension.

Citation: Ghanem, D., and Smith, A. (2020). Causality in Structural Vector Autoregressions: Science or Sorcery? Working Paper.