Econometrics - Lecture 5

class: center, middle, inverse, title-slide

# Econometrics - Lecture 5
## Linear Regression with Multiple Regressors
### Jonas Björnerstedt
### 2022-03-03

---

## Lecture Content

Chapter 6. Linear Regression with Multiple Regressors

---
## Multivariate regression

- Linear Regression with Multiple Regressors

- Allow `$k$` regressors `$X_1, X_2, \ldots,X_k$`

- Estimate `$k+1$` parameters `$\beta_0, \beta_{X}, \beta_{W}, \ldots,\beta_k$`
    
    - Two subscripts are now needed for sample: `$X_{1i}$`, `$X_{2i}$`, `$X_{ki}$`
    
- In lecture we focus on 2 regressors `$X$` and `$W$`

- See textbook for `$k$` regressors

---
## Why control variables?

If we just want to see how `$Y_i$` depends on `$X_i$`, why add variable `$W_i$` to the regression?!

1. Correlation between `$X_i$` and omitted variables

- reduce _omitted variable bias_
    
2. Reduce uncertainty

- Reduce unexplained variation `$u_i$`
    
    - Tighter confidence intervals on parameters of interest

Variables `$X_2,...,X_k$` are _control variables_

---
## Specification
    
- Linear model

`$$E(Y_i|X_i, W_i) = \beta_{0}+\beta_{X}X_i +
\beta_{W}W_i$$`

- Sample data
`$$Y_i=\beta_{0}+\beta_{X}X_{i}+
\beta_{W}W_{i}+u_i$$`

- Estimate will give `$\widehat\beta_{0},\widehat\beta_{X},\widehat\beta_{W}$` and `$\widehat u_i$`
`$$Y_i=\widehat\beta_{0}+\widehat\beta_{X}X_{i}+
\widehat\beta_{W}W_{i}+\widehat u_i$$`

---
## Linear relationship with 2 vars

.pull-left[
![](figures/linrel2.png)
]

.pull-right[

- With two independent vars, the following relationship is a plane
`$$Y_i = -0.1 X_i + 0.5 W_i$$`
    
    - For every `$X_i$` and `$W_i$` there is a unique `$Y_i$`

- `$\beta_0$` is where plane crosses `$Y$` axis

- `$\beta_{X},\beta_{W}$` is the slope in `$X$` and `$W$` directions
]

---
## The OLS estimator in multiple regression

- The OLS estimator:
`$$Y_i = \widehat\beta_{0} + \widehat\beta_{X} X_{i} + \widehat\beta_{W} W_{i} + \widehat u_i$$`

- Find `$\widehat\beta_0,\widehat\beta_{X},\widehat\beta_{W}$` that minimize
`$$SSR = \sum_{i=1}^n \widehat u_i^2$$`

---
## Regression residual

- Residual has variance
`$$\widehat\sigma_u^{2}=\frac{1}{n-3}\sum_{i=1}^{N} \widehat u_{i}^{2}=\frac{SSR}{n-3}$$`

- Degrees of freedom: `$n-3$`

- We have estimated 3 parameters `$\hat\beta_0, \hat\beta_{X}, \hat\beta_{W}$`

---
## Degrees of freedom

- Estimating the average `$\bar Y$` with one observation `$(n=1)$` gives zero variance
 
    - The average `$\bar Y = Y_1$`

- Estimating `$E(Y|X) = \beta_0 + \beta_{X} X$` with two observations also gives perfect fit

- With `$X,W$` and 3 parameters `$\beta_0$`, `$\beta_{X}$` and `$\beta_{W}$` three observations `$(n = 3)$` are fit perfectly

- Degrees of freedom adjustment compensates for this

---
## Adjusted `$R^{2}$`

- Adding regressors always increases `$R^{2}$`

- Better measure Adjusted `$\bar{R}^{2}$`
`$$\bar{R}^{2}=1-\frac{SSR/\left(n-k\right)}{TSS/\left(n-1\right)}$$`

- Adjusted by degrees of freedom `$n-k$`, where `$k$` is the number of parameters `$\beta$`

- Adding parameters can decrease `$\bar{R}^{2}$` if `$SSR$` only decreases a little

---
class: inverse, center, middle
# Multicollinearity

---
## Correlation between random variables

- Positive, zero and negative correlation

---
## [Linear relationship with 2 vars <sup> 🔗 </sup>](http://rstudio.sh.se/content/statistics03-figs.Rmd#section-ols)

.pull-left[

<img src="figures/linrel.png" alt="Drawing" style="width: 400px;"/>
]

.pull-right[

- With two independent vars, the following relationship is a plane
    `$$Y=-0.1 X + 0.5 W$$`
    - For every `$X$` and `$W$` there is a unique `$Y$`
  
    - `$\beta_0$` is intercept and `$\beta_{X},\beta_{W}$` are the slopes

]

---
## Standard error with 2 vars

- Small variance in `$u_i$` and large in `$X_i$` and `$W_i$`

![](figures/nomulticollin2.png)

---
## [Multicollinearity <sup> 🔗 </sup>](http://rstudio.sh.se/content/statistics05-figs.Rmd#section-multicollinearity)

- Many planes fit data almost as well

![](figures/highmulticollin2.png)

---
## Near perfect Multicollinearity

- Detection
 
    - low individual significance
  
    - despite high joint significance
    
- More data needed!

- Does not cause any problems except for identifying single parameters

- Do not ’solve’ by dropping a parameter if it should be included

- Omitted variable bias - next section

- Conceptual problem in model?
  
    - Are the variables capturing the same effect?
 
    - How do we interpret the coefficients?

- Not a technical problem

---
## Perfect Multicollinearity

- _Dummy variable trap_

- Regress on constant variable

- Impossible to separate the effect of intercept from variable

- Stata automatically drops a variable

- Intercept is calculating by adding variable `$X_0 = 1$`

- Makes algebra for solving simpler

- Also facilitates understanding perfect multicollinearity

- A column cannot be just a linear combination of other columns

---
class: inverse, center, middle
# Omitted variable bias

---
## Omitted variables

- Assume
`$$E(Y_i|X_i,W_i) = \beta_{0}+\beta_{X}X_i+
\beta_{W}W_i$$`

- What happens if only one variable is included in the regression?:
`$$Y_i = \alpha_{0} + \alpha_{X}X_{i} + v_i$$`
  - Estimating the conditional expectation
`$$E(Y_i|X_i) = \alpha_{0}+\alpha_{X}X_i$$`

- `$u_i$` can be thought of as the sum of all variables affecting `$Y_i$`

- The effect of variation in `$W_i$` will be in the error term `$v_i$`

- Note that if `$W$` does not vary, it will be incorporated in `$\alpha_0$`

- Thus both the intercept and the error term contain the effect of _everything else_ on `$Y$`

---
## Conditions for omitted variable bias

If `$W_i$` is not included, we get _omitted variable bias_ if

2. `$W_i$` is a determinant of `$Y_i$`

1. `$X_i$` and `$W_i$` are correlated

- Equation (6.1) on page 231 is not very intuitive

---
## Omitted variable bias

- If `$X_i$` and `$W_i$` are correlated, then `$\omega_{X} \neq 0$` in
`$$W_i = \omega_{0} + \omega_{X} X_i + w_i$$`

- Substitute `$W_i$` in the regression 
`$$Y_i = \beta_{0} + \beta_{X}X_{i} + \beta_{W}\overset{W_i}{\overbrace{\big(\omega_{0} +\omega_{X} X_i + w_i\big)}}+ u_i$$`

- Rearrange 
`$$Y_i = (\beta_{0} + \beta_{W}\omega_{0}) + (\beta_{X} +\beta_{W}\omega_{X}) X_i + (\beta_{W}w_i+ u_i)$$` 
`$$Y_i = \alpha_{0} + \alpha_{X}X_{i} + v_i$$`

---
## Omitted variable bias

- Estimating the relationship between only `$X$` and `$Y$` does not estimate `$\beta_{X}$`!
`$$Y_i = (\beta_{0} + \beta_{W}\omega_{0}) + (\beta_{X} +\beta_{W}\omega_{X}) X_i + (\beta_{W}w_i+ u_i)$$`

We get a bias
`$$\alpha_X = \beta_{X} + \beta_{W}\omega_{X}  \neq \beta_{X}$$`

- The bias of this estimate depends on the sign and magnitudes of `$\omega_{X}$` and `$\beta_{W}$`.

- The estimate is an _inconsistent_ estimator of `$\beta_X$`

- increasing the sample size just improves the estimate of `$\alpha_{X}$`

---
## Application to the test scores data

- Omitted variable

```r
library(estimatr)
library(tidyverse)
caschool = read_rds("caschool.rds")

# Regression with both
rboth = lm( testscr ~ str + el_pct, data = caschool)
# Regression omitting el_pct
rstr = lm( testscr ~ str, data = caschool )
# How does el_pct depend on str?:
re = lm( el_pct ~ str, data = caschool )
```

---
## Test scores - Omitted variable equation

- Regressions with and without `el_pct` and with `str` as dependent var
<table class="huxtable" style="border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  " id="tab:unnamed-chunk-2">
<col><col><col><col><tr>
<th style="vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;"></th><th style="vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;">testscr</th><th style="vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;">testscr</th><th style="vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;">el_pct</th></tr>
<tr>
<th style="vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;">str</th><td style="vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;">-1.101</td><td style="vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;">1.814</td><td style="vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;">-2.280</td></tr>
<tr>
<th style="vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;"></th><td style="vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;">    </td><td style="vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;">    </td><td style="vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;">    </td></tr>
<tr>
<th style="vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;">el_pct</th><td style="vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;">-0.650</td><td style="vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;">    </td><td style="vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;">    </td></tr>
<tr>
<th style="vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;"></th><td style="vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;">    </td><td style="vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;">    </td><td style="vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;">    </td></tr>
</table>

- Omitted variable equation 
`$E(\hat\beta_{X}) = \beta_{X} + \omega_{X}\beta_{W}$`

```r
rboth$coefficients["str"] + re$coefficients["str"]*rboth$coefficients["el_pct"]
```

```
      str 
-2.279808 
```

---
## [Tradeoff bias and precision <sup> 🔗 </sup>](http://rstudio.sh.se/content/statistics05-figs.Rmd#section-omitted)

```

===================================================
X                      0.994***        0.995***    
                        (0.027)         (0.027)    
                                                   
W                        0.020                     
                        (0.028)                    
                                                   
Constant               0.973***        0.973***    
                        (0.025)         (0.025)    
                                                   
---------------------------------------------------
Observations              50              50       
R2                       0.967           0.967     
Adjusted R2              0.966           0.966     
Residual Std. Error 0.177 (df = 47) 0.176 (df = 48)
===================================================
Note:                   *p<0.1; **p<0.05; ***p<0.01
```

---
## Omitted variable - Correlation

Inclusion/omission of `$W$` depends on correlation and on whether it is in the population equation.

Correlation `$X_i$` and `$W_i$`         | `$\beta_W$`        | Included     | Omitted   
------------ | ---| ----------- | --------------
Uncorrelated | `$\beta_W = 0$` |  | 
Correlated   | `$\beta_W = 0$`    |  More uncertain | 
Uncorrelated | `$\beta_W \neq 0$` | | More uncertain  
Correlated   | `$\beta_W \neq 0$` |  | __Biased and Inconsistent__