class: center, middle, inverse, title-slide

# Microeconometrics - Lecture 7
## Instrumental Variables, continued
### Jonas Björnerstedt
### 2022-03-09

---

## Today's lecture

- Chapter 12 - Instrumental Variable Regression
- Continuation from previous lecture
- Endogeneity and Instrumental variables (IV)
- Some optional math included in slides, for completeness

---

## Solution 2: Instrumental variables (IV)

- Let `\(X_i\)` be endogenous in: `\(Y_i = \beta_{0} + \beta_{1} X_i + u_i\)`
    - `\(u_i\)` contains omitted variable `\(W_i\)` correlated with `\(X_i\)`
- Assume that we can find a variable `\(Z_i\)`: `\(X_i = \pi_{0} + \pi_{1} Z_i + e_i\)`
    1. that explains `\(X_i\)`
    2. and that `\(Z_i\)` is uncorrelated with `\(e_i\)` and `\(u_i\)`
        - `\(Z_i\)` is thus uncorrelated with `\(W_i\)`
- Then the predicted value `\(\hat{X}_i = \hat\pi_{0} + \hat\pi_{1}Z_i\)`
    - splits `\(X_i\)` into two parts `\(X_i = \hat{X}_i + \hat e_i\)`
    - `\(\hat{X}_i\)` uncorrelated with `\(u_i\)` since `\(Z_i\)` is

---

## Instrumental variable relationship (opt)

- By assumption, in the population:

`$$Cov(Z_i,u_i) = 0 = Cov(Z_i,Y_i - \beta_0 - \beta_1 X_i) = Cov(Z_i,Y_i) - \beta_1 Cov(Z_i, X_i)$$`

- Solving for `\(\beta_1\)`:

`$$\beta_1 = \frac {Cov(Z_i,Y_i) }{ Cov(Z_i, X_i)}$$`

---

## Two Stage Least Squares estimation

- Let `\(X_i\)` be endogenous, i.e.: `\(\mathrm{Cov}\left(X_i, u_i\right) \neq 0\)`

`$$Y_i = \beta_{0} + \beta_{1}X_i + u_i$$`

`$$X_i = \pi_{0} + \pi_{1} Z_i + e_i$$`

- Assume that `\(Z_i\)` is a *valid instrument*:
    1. Not correlated: `\(\mathrm{Cov}\left(Z_i, u_i\right) = 0\)`
    2. Explains `\(X\)`: `\(\mathrm{Cov}\left(Z_i,X_i \right) \neq 0\)`
- We can use many instruments `\(Z_{1i},Z_{2i}, \ldots\)` for `\(X_i\)`
- 2SLS estimation: Use `\(\hat{X}_i\)` instead of `\(X_i\)` to estimate `\(\beta_{1}\)`

---

## Two Stage Least Squares estimation

- Then the predicted value

`$$\hat{X_i}=\hat\pi_{0}+\hat\pi_{1}Z_i$$`

is uncorrelated with `\(u_i\)` and

`$$Y_i = \beta_{0} + \beta_{1}\left(\hat{X_i} + \hat e_i \right) + u_i = \beta_{0} + \beta_{1} \hat{X_i} + \left(\beta_{1} \hat e_i + u_i\right)$$`

- Thus we can use `\(\hat{X_i}\)` instead of `\(X_i\)` to estimate `\(\beta_{1}\)`
- Abbreviated TSLS or 2SLS

---

## Closeness to university as an instrument

- If educ is correlated with unobservable ability, find an instrument
- Distance from home to university

`$$educ_i = \pi_{0} + \pi_{1} dist_i + e_i$$`

1. Affects likelihood of education
2. Not correlated with ability

- Is this assumption valid?
    - The error term `\(e_i\)` includes ability
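---

## Covariance formula: a small simulation

- A minimal self-contained sketch of the formula `\(\beta_1 = Cov(Z_i,Y_i)/Cov(Z_i,X_i)\)` from the *Instrumental variable relationship* slide
- The variable names (`z`, `x`, `y`, `w`) and the data generating process are illustrative, not the textbook example


```r
set.seed(1)
n = 10000
w = rnorm(n)              # omitted variable, part of the error term u
z = rnorm(n)              # instrument, unrelated to w
x = z + w + rnorm(n)      # endogenous regressor, correlated with w
y = x + w + rnorm(n)      # true coefficient on x is 1
cov(z, y) / cov(z, x)     # ratio of covariances, close to 1
coef(lm(y ~ x))["x"]      # OLS slope, biased away from 1
```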
---

## Instrumental variable estimation


```r
obs = 1000
iq = 2 + rnorm(obs)
ability = 0.6*iq + 0.3 * rnorm(obs)
dist = rnorm(obs)
educ = iq + dist + 0.5 * rnorm(obs)
wage = educ + ability + rnorm(obs)
lm( wage ~ educ)
```

```
## 
## Call:
## lm(formula = wage ~ educ)
## 
## Coefficients:
## (Intercept)         educ  
##      0.6249       1.2848  
```

---

## First and second stage regressions


```r
# First stage regression:
stage1 = lm( educ ~ dist)
educhat = predict(stage1)

# Second stage regression
lm(wage ~ educhat)
```

```
## 
## Call:
## lm(formula = wage ~ educhat)
## 
## Coefficients:
## (Intercept)      educhat  
##       1.184        1.013  
```

---

## TSLS regression

- The `estimatr` package has the function `iv_robust()` for TSLS regression
- Note that the standard errors differ
- The manual TSLS regression neglects that `educhat` is estimated


```r
*library(estimatr)

# TSLS Manually:
stage1 = lm( educ ~ dist)
educhat = predict(stage1)
iv2 = lm( wage ~ educhat)

# TSLS using iv_robust:
ivr = iv_robust(wage ~ educ | dist)
```

---


```r
huxreg(ivr, iv2)
```
|             | (1)       | (2)       |
|-------------|-----------|-----------|
| (Intercept) | 1.184 *** | 1.184 *** |
|             | (0.087)   | (0.142)   |
| educ        | 1.013 *** |           |
|             | (0.038)   |           |
| educhat     |           | 1.013 *** |
|             |           | (0.062)   |
| N           | 1000      | 1000      |
| R2          | 0.719     | 0.211     |
| logLik      |           | -2100.309 |
| AIC         |           | 4206.618  |

*** p < 0.001; ** p < 0.01; * p < 0.05.
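---

## Why the standard errors differ (sketch)

- A minimal sketch of the main reason: the manual second stage `lm()` computes its standard errors from residuals based on `educhat`, while 2SLS inference uses residuals based on the actual `educ`


```r
b = coef(iv2)                      # manual second stage estimates
sd(wage - b[1] - b[2] * educhat)   # residual spread behind the lm() standard errors
sd(wage - b[1] - b[2] * educ)      # residual spread that 2SLS inference is based on
```

- `iv_robust()` bases its (robust) standard errors on the second kind of residual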
---

## TSLS estimator (optional)

- We have

`$$\hat\beta_1 = \frac {s_{\hat{X}Y} }{s _{\hat{X}\hat{X}}} \textrm{ and } \hat\pi_1 = \frac {s_{ZX} }{s _{ZZ}}$$`

- The covariance `\(s_{\hat{X}Y} = \frac{1}{n} \sum_i (\hat X_i - \bar{\hat X})(Y_i - \bar{Y})\)`, with:

`$$\hat X_i - \bar{\hat X} = \hat \pi_{0} + \hat \pi_{1} Z_i - \frac{1}{n} \sum_i (\hat \pi_{0} + \hat \pi_{1} Z_i) = \hat \pi_{1}( Z_i - \bar Z )$$`

- Thus

`$$s_{\hat{X}Y} = \hat \pi_{1} s_{ZY} \textrm{ and } s_{\hat{X}\hat{X}} = \hat \pi_{1}^2 s_{ZZ}$$`

- Combining

`$$\hat\beta_1 = \frac {s_{\hat{X}Y} }{s _{\hat{X}\hat{X}}} = \frac {\hat \pi_{1} s_{ZY} }{\hat \pi_{1}^2 s_{ZZ}} = \frac { s_{ZY} }{\hat \pi_{1} s_{ZZ}} = \frac { s_{ZY} }{\frac {s_{ZX} }{s _{ZZ}} s_{ZZ}} = \frac { s_{ZY} }{s_{ZX} }$$`

---

## Limit of TSLS estimator (optional)

- We see that the TSLS estimator `\(\hat\beta_1\)` satisfies

`$$\hat\beta_1 = \frac { s_{ZY} }{s_{ZX} } = \frac{ \frac{1}{n} \sum_i (Z_i - \bar Z) (Y_i - \bar Y) }{s_{ZX} } =\frac{ \frac{1}{n} \sum_i (Z_i - \bar Z) \left(\beta_1(X_i - \bar X) + u_i\right) }{s_{ZX} }$$`

`$$\hat\beta_1 = \beta_1\frac{ s_{ZX}}{s_{ZX} } + \frac{ \frac{1}{n} \sum_i (Z_i - \bar Z) u_i}{s_{ZX} } \overset{p}{\longrightarrow} \beta_1 + \frac{Cov(Z_i,u_i) }{ Cov(Z_i, X_i)}$$`

- TSLS converges in probability to `\(\beta_1\)` if `\(Cov(Z_i,u_i) = 0\)`
- The asymptotic bias can be large if `\(Cov(Z_i,u_i) \ne 0\)` and `\(Cov(Z_i,X_i) \approx 0\)`

---

## The general IV regression model

- Let `\(X_i\)` be endogenous, i.e.: `\(\mathrm{Cov}\left(X_i, u_i\right)\neq 0\)`, `\(W_i\)` exogenous, and `\(Z_{1i},Z_{2i}\)` be instrumental variables

`$$Y_i = \beta_{0} + \beta_{1}X_i + \beta_{2}W_i + u_i$$`

- Textbook has several `\(W_{ri}\)` and instruments `\(Z_{mi}\)`

`$$X_i = \pi_{0} + \pi_{1} Z_{1i} + \pi_{2} Z_{2i} + \pi_{3} W_{i} + e_{i}$$`

- We allow here for `\(X_i\)` being correlated with `\(W_i\)`
    - `\(W_i\)` is included in the first stage regression

---

## Bad and weak instruments

- The TSLS estimate is:
    - **Biased** towards the OLS estimate
    - Consistent - the bias decreases towards zero as the sample size increases
- Instrument `\(Z_i\)` is *bad* if `\(\mathrm{Cov}\left(Z_i,u_i\right)\neq0\)`
    - TSLS estimate is inconsistent
    - Can be _much_ more biased than the OLS estimate!
    - No formal test to see whether the condition holds!
- Instrument `\(Z_i\)` is *weak* if `\(\mathrm{Cov}\left(Z_i ,X_i\right)\approx0\)`
    - Very large sample required to reduce the bias
    - A weak instrument creates an `\(\hat{X_i}\)` that is not completely uncorrelated with `\(u_i\)`

---

## Endogenous instruments

- Bad instruments can make estimates worse


```r
obs = 1000
iq = 2 + rnorm(obs)
ability = iq + 0.3 * rnorm(obs)
dist = rnorm(obs) + 0.2 * ability
*educ = 0.05*dist + iq + rnorm(obs)
wage = educ + ability + rnorm(obs)
reg = iv_robust(wage ~ educ | dist)
reg
```

```
##              Estimate Std. Error  t value     Pr(>|t|)    CI Lower CI Upper  DF
## (Intercept) 0.8067198  0.4069051 1.982575 4.768879e-02 0.008232176 1.605208 998
## educ        1.6041738  0.2014062 7.964869 4.486807e-15 1.208945636 1.999402 998
```
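---

## Endogenous instruments: comparison with OLS

- For comparison, a minimal sketch of OLS on the same simulated data; the true coefficient on `educ` is 1 in this simulation


```r
lm(wage ~ educ)   # compare with the bad-instrument IV estimate above
```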
---

## Testing for endogeneity

- Can test whether the TSLS estimate is different from OLS
- Cannot test whether endogeneity exists
    - The instruments could be endogenous
    - Example: the instrument is essentially the same as the endogenous variable


```r
iq = 2 + rnorm(obs)
ability = iq + 0.3 * rnorm(obs)
educ = iq + rnorm(obs)
dist = educ + 0.01*rnorm(obs)
wage = educ + ability + rnorm(obs)
reg = iv_robust(wage ~ educ | dist, 
*                diagnostics = TRUE)
```

---


```r
summary(reg)
```

```
## 
## Call:
## iv_robust(formula = wage ~ educ | dist, diagnostics = TRUE)
## 
## Standard error type:  HC2 
## 
## Coefficients:
##             Estimate Std. Error t value   Pr(>|t|) CI Lower CI Upper  DF
## (Intercept)    1.080    0.06582    16.4  1.006e-53   0.9505    1.209 998
## educ           1.467    0.02657    55.2 1.492e-305   1.4145    1.519 998
## 
## Multiple R-squared:  0.7269 ,	Adjusted R-squared:  0.7266 
## F-statistic:  3047 on 1 and 998 DF,  p-value: < 2.2e-16
## 
## Diagnostics:
##                  numdf dendf     value p.value    
## Weak instruments     1   998 2.076e+07  <2e-16 ***
## Wu-Hausman           1   997 5.103e+00  0.0241 *  
## Overidentifying      0    NA        NA      NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

---

## TSLS estimation bias

- Given the model below, with `\(\mathrm{Cov}\left(e_i,u_i\right)=\sigma_{e u}\neq0\)`

`$$Y_i = \beta_{0} + \beta_{1} X_i + u_i$$`

`$$X_i = \pi_{0} + \pi_{1} Z_i + e_i$$`

- Assume that `\(\mathrm{Cov}\left(X_i,u_i \right)>0\)`. Then

`$$0 < \mathrm{Cov}\left(X_i,u_i \right) = \pi_{1}\mathrm{Cov}\left(Z_i, u_i \right) + \mathrm{Cov}\left(e_i, u_i \right) = \mathrm{Cov}\left(e_i ,u_i\right) = \sigma_{e u}$$`

- It can be shown that

`$$\mathrm{E}\left(\widehat\beta_{1}^{TSLS}\right)-\beta_1 \approx (\beta_{1}^{OLS} - \beta_1) \frac{1}{E(F) - 1}$$`

where `\(\beta_{1}^{OLS}\)` is the OLS estimator for a very large (infinite) sample and `\(F\)` is the F-statistic of the first stage regression

- The bias is
    - Towards the OLS estimate
    - Closer to OLS the weaker the instruments are

---

## Example: Demand estimation

1. Lagged variables
    1. Prices in the previous period are correlated with costs
    2. Demand and cost shocks in the current period do not depend on the previous period
2. Cost shifters
    1. With supply equation
    `$$p=\pi_{0}+\pi_{1}c+\pi_{2}q+e$$`
    2. Use costs of factors of production as instruments:
    `$$p=\pi_{0}+\pi_{1}c+\upsilon$$`
    assuming that costs `\(c\)` do not depend on demand
        - Example: oil price, set in another, bigger market
3. BLP instruments
    1. How close are products in terms of characteristics?
    2. Example: count the number of products with similar characteristics

---

## Testing instruments

- We cannot test whether endogeneity is solved
- We can test whether the TSLS estimate is significantly different from OLS
    - Do our instruments change anything?
- But our choice of instruments might be bad
    - Consider `\(X_i\)` as an instrument for itself!
    - Highly significant, but correlated with `\(u_i\)`
- Weak instruments
    - F-test of joint significance of the instruments in the first stage of TSLS
    - Rule of thumb: `\(F > 10\)` (Staiger & Stock)
- Overidentification test
    - With many instruments, we can test whether they are all uncorrelated with `\(u_i\)`
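---

## First stage F-statistic (sketch)

- A minimal sketch of the rule-of-thumb check, using the simulated `educ` and `dist` from the last example


```r
# F-statistic of the first stage regression of educ on the instrument dist
stage1 = lm(educ ~ dist)
summary(stage1)$fstatistic   # compare the value with the rule of thumb F > 10
```

- Here the instrument is very strong (it is almost identical to `educ`), but strength does not make it valid
- `iv_robust(..., diagnostics = TRUE)` reports a corresponding first stage test in the `Weak instruments` row of `summary()`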
---

## Overidentification test

- Test whether the instruments `\(Z_{mi}\)` are uncorrelated with `\(u_i\)`
- Regress the TSLS residuals on `\(Z_{mi}\)` and `\(W_i\)`:

`$$\hat u_i = \delta_0 + \delta_1 Z_{1i} + \delta_2 Z_{2i}+ \delta_3 W_i + e_i$$`

- Test whether `\(\delta_1 = \delta_2 = 0\)`
- The test statistic has `\(m - k\)` degrees of freedom ( `\(m\)` instruments and `\(k\)` endogenous variables)
    - Because `\(\hat u_i\)` is estimated
- More instruments than endogenous variables are necessary
- If the test fails we cannot say which instrument is endogenous
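---

## Overidentification test (sketch)

- A self-contained sketch with two instruments and one endogenous regressor; the simulated data and names (`z1`, `z2`, `w`, `x`, `y`) are illustrative, not the textbook example
- One common form of the statistic is `\(n R^2\)` from the auxiliary regression (the Sargan statistic, assuming homoskedasticity)


```r
library(estimatr)
set.seed(1)
n  = 1000
w  = rnorm(n)                    # included exogenous regressor
z1 = rnorm(n)                    # instrument 1
z2 = rnorm(n)                    # instrument 2
v  = rnorm(n)                    # omitted part of the error u
x  = z1 + z2 + w + v + rnorm(n)  # endogenous regressor
y  = x + w + v + rnorm(n)        # true coefficient on x is 1

ivr2 = iv_robust(y ~ x + w | z1 + z2 + w)
b    = coef(ivr2)
uhat = y - b[1] - b[2]*x - b[3]*w   # TSLS residuals
aux  = lm(uhat ~ z1 + z2 + w)       # regress residuals on instruments and w
J    = n * summary(aux)$r.squared   # test statistic, m - k = 2 - 1 = 1 df
1 - pchisq(J, df = 1)               # p-value; both instruments are valid by construction here
```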