class: center, middle, inverse, title-slide

.title[
# Econometrics - Lecture 8
]
.subtitle[
## Estimation of Dynamic Causal Effects
]
.author[
### Jonas Björnerstedt
]
.date[
### 2023-11-22
]

---
## Lecture Content

- Ch. 16. HAC standard errors
- Ch. 15.6 Nonstationarity I: Stochastic trends
- Ch. 15.7 Nonstationarity II: Breaks
- _In the spring_

---
class: inverse, center, middle

# HAC errors

---
## [HAC Standard Errors<sup> 🔗 </sup>](http://192.121.208.72:3939/time_series08-figs.Rmd#section-random-walk)

Let `\(Y_t = \beta_0 + \beta_1 X_t + u_t\)` with autocorrelated `\(u_t\)`

- Section 15.4 HAC errors
- Estimation of `\(\hat\beta_0\)` and `\(\hat\beta_1\)` is unbiased and consistent
- Estimation of `\(Y_t = \beta_0 + \beta_1 \color{red}{Y_{t-1}} + u_t\)` is _biased_ but consistent (last lecture)
- Estimation of the variance depends on the autocorrelation
    - Similar to the discussion of heteroscedasticity
    - But the solution is much more complicated
- High positive autocorrelation leads to underestimated standard errors
    - The true variability is higher than with independent errors

---
## Variance of `\(\bar Y\)` with independent draws

- What is the variance of the mean `\(\bar Y\)` of two observations with independent random sampling? (repetition). Assume that `\(E(Y_i)=0\)`. Then

`$$\bar Y = \frac{Y_1+Y_2}{2}$$`

`$$E[\bar Y] = E[ \frac{Y_1+Y_2}{2}]= \frac{1}{2}(E[Y_1+Y_2] ) = \frac{1}{2}(E[Y_1]+E[Y_2] ) = E[Y]$$`

$$Var(\frac{Y_1+Y_2}{2}) = \frac{1}{4} Var(Y_1+Y_2) = \frac{1}{4} \left( Var(Y_1)+Var(Y_2)\right) $$

* Thus with independent sampling we have:

`$$Var(\bar Y) = Var(\frac{Y_1+Y_2}{2}) = \frac{1}{2} Var(Y_i)$$`

* With `\(n\)` observations we have:

`$$Var(\bar Y) = Var \left( \frac{Y_1+Y_2+ \ldots +Y_n}{n} \right) = \frac{1}{n} Var(Y_i)$$`

---
## Variance of `\(\bar Y\)` with autocorrelation

Let `\(Y_t = \beta_0 + u_t\)`. With two observations and independent `\(u_t\)` we have `\(Var(\bar Y) = \frac{\sigma_u^2}{2}\)`. When `\(u_t\)` and thus `\(Y_t\)` are autocorrelated we have

$$
`\begin{aligned}
Var(\bar Y) &= Var(\frac{Y_1+Y_2}{2}) = \frac{1}{4} E\left[(Y_1+Y_2 - 2\beta_0)^2\right] = \frac{1}{4} E\left[(u_1+u_2)^2\right] \\
&= \frac{1}{4} E\left[(u_1+u_2)(u_1+u_2)\right] = \frac{1}{4} E\left[u_1^2+u_2^2 + 2u_1 u_2\right] \\
&= \frac{1}{4} \left( Var(u_1) + Var(u_2) + 2Cov(u_1,u_2) \right) \\
&= \frac{1}{4} \left( \sigma_u^2 + \sigma_u^2 + 2 \frac{Cov(u_1,u_2)}{\sigma_u^2} \sigma_u^2 \right)\\
&= \frac{\sigma_u^2}{2}(1 + \rho_1)
\end{aligned}`
$$

* The variance of `\(\bar Y\)` is greater if there is positive autocorrelation `\(\rho_1>0\)`, with correction factor `\(f = 1+\rho_1\)`

---
## Variance and autocorrelation with 3 obs

* With the mean of 3 observations, both the first and second autocorrelations enter

$$
`\begin{aligned}
Var(\bar Y) &= Var(\frac{Y_1+Y_2+Y_3}{3}) \\
&= \frac{1}{3^2} E\left[(u_1+u_2+u_3)(u_1+u_2+u_3)\right] \\
&= \frac{1}{3^2} E\left[\color{red}{u_1^2+u_2^2+u_3^2} + \color{blue}{2u_1 u_2 + 2u_2 u_3} +\color{green}{ 2u_1 u_3}\right] \\
&= \frac{1}{3^2} \left(\color{red}{3 \sigma_u^2} + \color{blue}{2 \cdot 2 \rho_1 \sigma_u^2} + \color{green}{2 \rho_2 \sigma_u^2} \right)\\
&= \frac{\sigma_u^2}{3}(1 + \frac{4}{3}\rho_1 + \frac{2}{3}\rho_2)
\end{aligned}`
$$

* The variance depends on __all__ autocorrelations (simulation sketch on the next slide). For the mean of `\(10\)` observations:

$$ Var(\bar Y) = \frac{\sigma_u^2}{10}(1 + 2(\frac{9}{10}\rho_1 + \frac{8}{10}\rho_2 + \ldots + \frac{1}{10}\rho_9))$$
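
---
## Variance of `\(\bar Y\)` - simulation sketch

A minimal simulation sketch (not from the textbook): with AR(1) errors with `\(\rho_1 = 0.5\)`, the variance of `\(\bar Y\)` exceeds the iid case by roughly the correction factor `\(f\)` derived above. All names and parameter values below are illustrative.

```r
set.seed(1)
n <- 10; reps <- 10000
# Mean of n iid N(0,1) draws vs n AR(1) draws scaled to the same marginal variance
ybar_iid <- replicate(reps, mean(rnorm(n)))
ybar_ar1 <- replicate(reps, mean(arima.sim(list(ar = 0.5), n, sd = sqrt(1 - 0.5^2))))
# Theoretical correction factor f for rho_j = 0.5^j
j <- 1:(n - 1)
f <- 1 + 2 * sum(((n - j) / n) * 0.5^j)
c(simulated = var(ybar_ar1) / var(ybar_iid), theoretical = f)
```

Both numbers should come out around 2.6: the variance of the mean is more than twice what the independent-sampling formula suggests.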

---
## Variance of `\(\hat\beta_1\)` with autocorrelation (advanced)

- With a regressor `\(X_t\)`, the corresponding calculation holds for `\(\beta_1\)`
- We have

`$$\hat\beta_1 - \beta_1 = \frac{\sum_i X_i u_i }{\sum_i X_i^2} = \frac{X_1 u_1+\ldots+X_n u_n }{\sum_i X_i^2}$$`

- With independent draws the cross terms vanish in expectation, so we had

`$$Var(\hat\beta_1) = E[(\hat\beta_1 - \beta_1)^2] = E\left[\frac{\sum_i X_i u_i }{\sum_i X_i^2}\frac{\sum_i X_i u_i }{\sum_i X_i^2}\right] = E\left[\frac{\sum_i X_i^2 u_i^2 }{\left(\sum_i X_i^2\right)^2}\right]$$`

- If `\(u_t\)` and `\(u_s\)` are correlated, the variance formula is more complicated

`$$Var(\hat\beta_1) = E\left[\frac{\left(X_1 u_1+\ldots+X_n u_n\right)\left(X_1 u_1+\ldots+X_n u_n\right) }{\left(\sum_i X_i^2\right)^2}\right]$$`

- All autocorrelations of `\(u_t\)` are important!

---
## HAC error formula

In a regression on a single variable `\(X\)` with autocorrelated `\(u_t\)`, the variance `\(\mathrm{Var}(\hat\beta_1)\)` has to be modified:

$$ \mathrm{Var}(\tilde\beta_1) = \mathrm{Var}(\hat\beta_1) f_T$$

where `\(f_T\)` depends on **all** the correlations of `\(u_t\)` and `\(X_t\)`

`$$f_T = 1 + 2 \sum_{j=1}^{T-1} (\frac{T-j}{T}) \rho_j$$`

- Depends on _all_ the autocorrelations in the sample
- But there will be few observations for high-order autocorrelations
    - Only **one** observation for `\(\rho_{T-1}\)`
- Truncate the calculation at some lag `\(m<T\)`

---
## Newey-West variance estimator

- Uses `\(m \ll T\)` lags, the **truncation parameter**
- Takes a weighted sum (_kernel_) over `\(m\)` lags
- A guideline for `\(m\)` is

`$$m = 0.75T^{1/3}$$`

    - With 1000 observations this gives `\(m = 7.5\)`, rounded to an integer in practice
- Same principles with multiple regressors
- Alternative estimators use different weights

---
## HAC errors

- Various methods to choose how to calculate the adjustment
    - Newey-West method
    - `vcovHAC` in R uses Andrews' method
- Correlation over long lags can be important
    - But there are few observations of correlation over long lags
    - Using these can potentially result in calculating a _negative_ variance!
- Restrict attention to a weighted sum over shorter lags (sketch on the next slide)
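
---
## Truncated correction factor - sketch

A minimal sketch (not from the textbook) of the truncated weighted sum behind the Newey-West estimator, applied to the sample autocorrelations of a simulated AR(1) error series; the kernel weights `\((m-j)/m\)` replace `\((T-j)/T\)` in `\(f_T\)`. All names and values are illustrative.

```r
set.seed(42)
T_obs <- 200
u <- arima.sim(list(ar = 0.5), T_obs)                       # AR(1) errors
m <- floor(0.75 * T_obs^(1/3))                              # rule of thumb: m = 4
rho <- drop(acf(u, lag.max = m - 1, plot = FALSE)$acf)[-1]  # autocorrelations 1..m-1
j <- 1:(m - 1)
f_hat <- 1 + 2 * sum(((m - j) / m) * rho)                   # truncated weighted sum
f_hat
```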

---
## Newey-West estimation

```r
library(sandwich)
library(lmtest)
library(huxtable)
est = lm(mpg ~ gear + I(gear^2), data = mtcars)
est_hac = coeftest(est, vcov=NeweyWest)
huxreg(est, est_hac)
```

|             |        (1) |        (2) |
|:------------|-----------:|-----------:|
| (Intercept) | -78.653 ** |    -78.653 |
|             |   (26.888) |   (38.800) |
| gear        |  48.957 ** |   48.957 * |
|             |   (14.228) |   (20.973) |
| I(gear^2)   |  -5.790 ** |   -5.790 * |
|             |    (1.823) |    (2.690) |
| N           |         32 |         32 |
| R2          |      0.429 |            |
| logLik      |    -93.408 |    -93.408 |
| AIC         |    194.815 |    194.815 |

*** p < 0.001; ** p < 0.01; * p < 0.05.
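
---
## Newey-West estimation - choosing the lag

By default `NeweyWest` chooses the truncation lag automatically; it can also be set explicitly with the `lag` argument. A minimal sketch reusing `est` from the previous slide; imposing the rule-of-thumb lag and `prewhite = FALSE` (to match the plain truncated formula) are our illustrative choices, not package defaults.

```r
m <- floor(0.75 * nobs(est)^(1/3))   # 32 observations gives m = 2
coeftest(est, vcov = NeweyWest(est, lag = m, prewhite = FALSE))
```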

---
## Detecting autocorrelation

- Consider the model

`$$Y_t = \beta_0 + \beta_1 X_t + u_t$$`

- Assume that the errors are autocorrelated with an AR(p) structure

`$$u_t = \phi_1 u_{t-1} + \ldots +\phi_p u_{t-p} + v_t$$`

    - Persistent unobservables that affect `\(Y_t\)`
    - `\(v_t\)` are uncorrelated

---
## Detect autocorrelation - steps

1. Estimate the model and generate residuals `\(\hat u_t\)`
    * Unbiased and consistent estimate
2. Estimate the autocorrelation of the residuals `\(\hat\phi\)`
    * Check joint significance of the estimated `\(\hat\phi_1 \ldots \hat\phi_p\)`
        * _Breusch-Godfrey test_ (example on the next slide)
    * Or look at the autocorrelation function (ACF)
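
---
## Detecting autocorrelation - example

A minimal sketch (not from the textbook) of both steps on simulated data with AR(1) errors; the data and parameter values are illustrative. The Breusch-Godfrey test should reject the null of no autocorrelation here.

```r
library(lmtest)
set.seed(1)
x <- rnorm(200)
u <- arima.sim(list(ar = 0.7), n = 200)   # autocorrelated errors
y <- 1 + 2 * x + u
fit <- lm(y ~ x)
bgtest(fit, order = 4)   # Breusch-Godfrey: H0 of no autocorrelation up to lag 4
acf(resid(fit))          # or inspect the autocorrelation function
```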

---
## Consequences of autocorrelation

- Effects
    - Unbiased estimates of the parameters `\(\beta\)`
    - Standard errors are inconsistent
- Solutions
    - Change the model
        - Add lags of variables
    - Model the error term
        - Time series analysis
    - Adjust the estimate of `\(\sigma^{2}\)`
        - Use Newey-West standard errors

---
class: inverse, center, middle

# Nonstationarity I: stochastic trends

---
## Nonstationarity I: stochastic trends

### Two problems

1. Bias of the AR(1) estimator if `\(\beta_1 \approx 1\)`
2. Estimating a model with nonstationary `\(Y_t\)` and `\(X_t\)` finds correlations even when there is no relationship

### Solutions?

* How do we detect nonstationarity?
* How do we deal with nonstationarity if it is present?

---
## Problem 1: Bias of AR(1) estimator

What happens when we estimate

`$$Y_t = \beta_0 + \beta_1 Y_{t-1}+ u_t$$`

when `\(\beta_1\)` is close to 1?

* Biased estimates
* The bias disappears as the number of observations increases
    * OLS is a consistent estimator
* If `\(\beta_1\)` is close to 1, many observations can be required

---
## [Problem 2: _Spurious regression_ <sup> 🔗 </sup>](http://192.121.208.72:3939/time_series08-figs.Rmd#section-spurious-relationship)

What happens if we estimate the model

`$$Y_t = \beta_0 + \beta_1 X_{t}+ u_t$$`

when `\(X_t\)` and `\(Y_t\)` are nonstationary and `\(\beta_1 = 0\)`?

* No relationship between `\(X_t\)` and `\(Y_t\)` in the population
* A significant relationship will appear in the data much more often than expected
    * Due to stochastic trends (simulation sketch on the next slide)
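
---
## Spurious regression - simulation sketch

A minimal sketch (not from the textbook): regressing two independent random walks on each other. Although `\(\beta_1 = 0\)` in the population, the t-statistic is often large. The seed and sample size are illustrative; rerun with different seeds to see how often 'significance' appears.

```r
set.seed(7)
n <- 200
y <- cumsum(rnorm(n))   # random walk
x <- cumsum(rnorm(n))   # independent random walk
summary(lm(y ~ x))      # slope often 'significant' despite no relationship
```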

---
## Detecting stochastic trends

- Testing for a _unit root_
- Dickey-Fuller test of the AR(1) model
    - `\(H_0: \beta_1 = 1\)` with alternative `\(H_1: \beta_1 < 1\)` in

`$$Y_t = \beta_0 + \beta_1 Y_{t-1}+ u_t$$`

#### Rewritten equation

- To get a null hypothesis testing whether a parameter is zero, we rewrite

`$$Y_t - Y_{t-1} = \beta_0 + (\beta_1 - 1 ) Y_{t-1} + u_t$$`

or

`$$\Delta Y_t = \beta_0 + \delta_1 Y_{t-1} + u_t$$`

* We have: `\(\delta_1 \le 0\)`

---
## Hypothesis tests for a single coefficient

- Test the null hypothesis `\(H_0\)` that the coefficient `\(\beta_1\)` is 1
- Compute

`$$t^{act}= \frac{\hat\beta_1-1}{SE(\hat\beta_1)}$$`

- Reject at the 5 percent level if `\(t^{act}\)` is below the critical value
    - `\(t^{act} < -1.645\)`
    - One-tailed test, as `\(\beta_1\)` cannot be greater than 1
- Assumes that the estimate `\(\hat\beta_1\)` is approximately normally distributed
    - This is not the case if `\(\beta_1 = 1\)`, however

---
## Dickey-Fuller (DF) test for AR(1)

- Estimate

`$$\Delta Y_t = \beta_0 + \delta Y_{t-1} + u_t$$`

- As the regressor `\(Y_{t-1}\)` does not have a stationary distribution, the estimate `\(\hat\delta\)` will not have a normal distribution
- _Dickey-Fuller statistic_
    - Find the t-value of `\(\hat\delta\)`, _do not_ use robust estimation
    - The t-value does _not_ follow the standard normal distribution
    - The t-value will _not_ have the usual critical values
- Check if the t-value is below the critical value

---
## Dickey-Fuller critical values

.pull-left[
- The critical values for the DF statistic are larger in absolute value (more negative)
- The 5% (or 10%) area is under the lower end of the distribution
- One-tailed test, as `\(\delta \le 0\)`
]
.pull-right[
![](figures/DF.png)
]

---
## Dickey-Fuller test - urca package

<style type="text/css">
.tiny .remark-code { /*Change made here*/
  font-size: 50% !important;
}
</style>

.tiny[

```r
library(urca)
summary(ur.df(USMacro$LogGDP, type = 'drift'))
```

```
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression drift 

Call:
lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)

Residuals:
       Min         1Q     Median         3Q        Max 
-0.0272634 -0.0043211  0.0001499  0.0048233  0.0310763 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.030636   0.011393   2.689  0.00777 ** 
z.lag.1     -0.002831   0.001255  -2.256  0.02514 *  
z.diff.lag   0.310092   0.067178   4.616 7.01e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.007839 on 199 degrees of freedom
Multiple R-squared:  0.1364,	Adjusted R-squared:  0.1278 
F-statistic: 15.72 on 2 and 199 DF,  p-value: 4.583e-07

Value of test-statistic is: -2.2562 24.8777 

Critical values for test statistics: 
      1pct  5pct 10pct
tau2 -3.46 -2.88 -2.57
phi1  6.52  4.63  3.81
```
]

---
## Dickey-Fuller test

- Can handle a trend and selection of the number of lags

.tiny[

```r
dft = ur.df(USMacro$LogGDP, type = "trend", selectlags = "AIC")
dft = ur.df(USMacro$LogGDP, type = "trend", lags = 2)
summary(dft)
```

```
############################################### 
# Augmented Dickey-Fuller Test Unit Root Test # 
############################################### 

Test regression trend 

Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.025580 -0.004109  0.000321  0.004869  0.032781 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.2790086  0.1180427   2.364 0.019076 *  
z.lag.1     -0.0333245  0.0144144  -2.312 0.021822 *  
tt           0.0002382  0.0001109   2.148 0.032970 *  
z.diff.lag1  0.2708136  0.0697696   3.882 0.000142 ***
z.diff.lag2  0.1876338  0.0705557   2.659 0.008476 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.007704 on 196 degrees of freedom
Multiple R-squared:  0.1783,	Adjusted R-squared:  0.1616 
F-statistic: 10.63 on 4 and 196 DF,  p-value: 8.076e-08

Value of test-statistic is: -2.3119 11.2558 4.267 

Critical values for test statistics: 
      1pct  5pct 10pct
tau3 -3.99 -3.43 -3.13
phi2  6.22  4.75  4.07
phi3  8.43  6.49  5.47
```
]

---
## Augmented Dickey-Fuller

- `\(Y_t\)` can be nonstationary even if `\(\beta_1 <0\)`, if `\(\beta_2 \neq 0\)`
- Testing for a unit root in an AR(2)

`$$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + u_t$$`

- One can derive (difficult) that this is a test of `\(\delta=0\)`, with `\(\delta = \beta_1 + \beta_2 - 1\)`, in

`$$\Delta Y_t = \beta_0 + \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + v_t$$`

    - The equation is 'augmented' with lagged differences
- If the alternative hypothesis to a stochastic trend is a deterministic trend `\(\alpha t\)`

`$$\Delta Y_t = \beta_0 + \color{red}{\alpha t} + \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + v_t$$`

- Different distributions and thus different critical values for the two tests
- The derivation of these formulas is more complicated than for the Dickey-Fuller test

---
## Avoiding the problems caused by stochastic trends

- If `\(Y_t\)` has a stochastic trend with drift

`$$Y_t = \beta_0 + Y_{t-1} + u_t$$`

- Differencing, we get

`$$\Delta Y_t = Y_t - Y_{t-1}= \beta_0 + u_t$$`

- The variable `\(\Delta Y_t\)` is stationary

---
## Orders of integration

1. If
`$$Y_t = \beta_0 + Y_{t-1} + u_t$$`
then
`$$\Delta Y_t = Y_t - Y_{t-1} = \beta_0 + u_t$$`
and `\(Y_t\)` is said to be _integrated of order one_, I(1)
2. If
`$$\Delta Y_t = \beta_0 + \Delta Y_{t-1} + u_t$$`
then
`$$\Delta^2 Y_t = \Delta Y_t - \Delta Y_{t-1} = \beta_0 + u_t$$`
and `\(Y_t\)` is said to be _integrated of order two_, I(2)

- Example: Prices
    - Inflation
    - Change in inflation

---
## Cointegration

- Let

`$$Y_t = \alpha + \theta X_t + u_t$$`

where `\(X_t\)` is I(1) and `\(u_t\)` is stationary
- Then:
    - `\(Y_t\)` is I(1)
    - `\(Y_t\)` and `\(X_t\)` are said to be _cointegrated_
    - The parameter `\(\theta\)` can be estimated with OLS

---
## Testing for cointegration

- We can check that `\(Y_t\)` and `\(X_t\)` are _cointegrated_
- Engle-Granger ADF test (EG-ADF)
    - Check that the residuals `\(\hat u_t\)` are stationary
    - Critical values are different from the ADF

---
## Next Lecture

- Chapter 16: Estimation of Dynamic Causal Effects