class: center, middle, inverse, title-slide

# B/C Econometrics - Lecture 3
## Linear Regression with Multiple Regressors
### Jonas Björnerstedt
### 2021-10-18

---

## Lecture Content

Linear Regression with Multiple Regressors

---

## Multivariate regression

- Linear Regression with Multiple Regressors
- Allow `\(k\)` regressors `\(X_1, X_2, \ldots, X_k\)`
- Estimate `\(k+1\)` parameters `\(\beta_0, \beta_1, \beta_2, \ldots, \beta_k\)`
- Two subscripts are now needed for the sample: `\(X_{1i}, X_{2i}, \ldots, X_{ki}\)`
- In this lecture we focus on two regressors, `\(X_i\)` and `\(W_i\)`

---

## Why control variables?

If we just want to see how `\(Y_i\)` depends on `\(X_i\)`, why add the variable `\(W_i\)` to the regression?

1. Correlation between `\(X_i\)` and omitted variables
    - reduces _omitted variable bias_
2. Reduce uncertainty
    - Reduce unexplained variation `\(u_i\)`
    - Tighter confidence intervals on the parameters of interest

Variables `\(X_2, \ldots, X_k\)` are _control variables_

---

## Specification

- Linear model
`$$E(Y_i|X_i, W_i) = \beta_{0}+\beta_{X}X_i + \beta_{W}W_i$$`
- Sample data
`$$Y_i=\beta_{0}+\beta_{X}X_{i}+ \beta_{W}W_{i}+u_i$$`
- Estimation gives `\(\widehat\beta_{0},\widehat\beta_{X},\widehat\beta_{W}\)` and `\(\widehat u_i\)`
`$$Y_i=\widehat\beta_{0}+\widehat\beta_{X}X_{i}+ \widehat\beta_{W}W_{i}+\widehat u_i$$`

---

## Linear relationship with 2 vars

.pull-left[
![](figures/linrel.png)
]
.pull-right[
- With two independent variables, the following relationship is a plane
`$$Y_i = 0.5 X_i - 0.1 W_i$$`
- For every `\(X_i\)` and `\(W_i\)` there is a unique `\(Y_i\)`
- `\(\beta_0\)` is where the plane crosses the `\(Y\)` axis
- `\(\beta_X, \beta_W\)` are the slopes in the `\(X\)` and `\(W\)` directions
]

---

## The OLS estimator in multiple regression

- The OLS estimator:
`$$Y_i = \widehat\beta_{0} + \widehat\beta_{X} X_{i} + \widehat\beta_W W_{i} + \widehat u_i$$`
- Find `\(\widehat\beta_0,\widehat\beta_X,\widehat\beta_W\)` that minimize
`$$SSR = \sum_{i=1}^n \widehat u_i^2$$`

---
class: inverse, center, middle

# Multicollinearity

---

## Correlation between random variables

- Positive, zero and negative correlation

<img src="figures/errcorr.png" alt="Drawing" style="width: 350px;"/>
<img src="figures/errcorrneg.png" alt="Drawing" style="width: 350px;"/>

---

## Linear relationship with 2 vars

.pull-left[
<img src="figures/linrel.png" alt="Drawing" style="width: 400px;"/>
]
.pull-right[
- With two independent variables, the following relationship is a plane
`$$Y=-0.1 X + 0.5 W$$`
- For every `\(X\)` and `\(W\)` there is a unique `\(Y\)`
- `\(\beta_0\)` is the intercept and `\(\beta_X,\beta_W\)` are the slopes
]

---

## Standard error with 2 vars

- Small variance in `\(u_i\)` and large variance in `\(X_i\)` and `\(W_i\)`

![](figures/nomulticollin.png)

---

## Multicollinearity

- Many planes fit the data almost as well

![](figures/highmulticollin.png)

---

## Near-perfect multicollinearity

- Detection (see the sketch below)
    - low individual significance
    - despite high joint significance
- More data needed!
- Does not cause any problems except for identifying individual parameters
    - Do not 'solve' it by dropping a variable that should be included
        - Omitted variable bias - next section
- Conceptual problem in the model?
    - Are the variables capturing the same effect?
    - How do we interpret the coefficients?
- Not a technical problem
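To make the detection point concrete, here is a minimal simulation sketch in Stata (not from the lecture; all variable names and parameter values are illustrative): two nearly identical regressors give insignificant individual t-statistics even though the joint F-test is highly significant.

```stata
* Near-perfect multicollinearity: w is almost a copy of x
clear
set seed 123
set obs 100
generate x = rnormal()
generate w = x + 0.05 * rnormal()
generate y = 1 + 0.5 * x - 0.1 * w + rnormal()
regress y x w   // large standard errors: x and w insignificant individually
test x w        // joint F-test of both coefficients is highly significant
```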
---

## [Perfect Multicollinearity](http://rstudio.sh.se/content/statistics04-figs/#section-regression)

- _Dummy variable trap_
- Regressing on a constant variable
    - Impossible to separate the effect of the intercept from the variable
    - R automatically drops a variable
- The intercept is calculated by adding a variable with value 1 for all observations
    - Makes the algebra of the solution simpler
    - Also helps in understanding perfect multicollinearity
- A column cannot be a linear combination of the other columns

---
class: inverse, center, middle

# Omitted variable bias

---

## Omitted variables

- Assume
`$$E(Y_i|X_i,W_i) = \beta_{0}+\beta_{X}X_i+ \beta_{W}W_i$$`
- What happens if only one variable is included in the regression?
`$$Y_i = \alpha_{0} + \alpha_{X}X_{i} + w_i$$`
- `\(u_i\)` can be thought of as the sum of all variables affecting `\(Y_i\)`
- The effect of variation in `\(W_i\)` will end up in the error term `\(w_i\)`
- Note that if `\(W_i\)` does not vary, its effect is incorporated in `\(\alpha_0\)`
- Thus both the intercept and the error term contain the effect of _everything else_ on `\(Y_i\)`

---

## Omitted variable bias

If `\(W_i\)` is not included, we get _omitted variable bias_ if

1. `\(W_i\)` is a determinant of `\(Y_i\)`
2. `\(X_i\)` and `\(W_i\)` are correlated

---

## Omitted variable bias

- If `\(X_i\)` and `\(W_i\)` are correlated, then `\(\delta_X \neq 0\)` in
`$$W_i = \delta_0 + \delta_X X_i + v_i$$`
- Substitute for `\(W_i\)` in the regression
`$$Y_i = \beta_{0} + \beta_{X}X_{i} + \beta_{W}\overset{W_i}{\overbrace{\big(\delta_0 +\delta_X X_i + v_i\big)}}+ u_i$$`
- Rearrange
`$$Y_i = (\beta_{0} + \beta_{W}\delta_0) + (\beta_{X} +\beta_{W}\delta_X) X_i + (\beta_{W}v_i+ u_i)$$`
`$$Y_i = \alpha_{0} + \alpha_{X}X_{i} + w_i$$`

---

## Omitted variable bias

- Estimating the relationship between only `\(X\)` and `\(Y\)` does not estimate `\(\beta_X\)`!
`$$Y_i = (\beta_{0} + \beta_{W}\delta_0) + (\beta_{X} +\beta_{W}\delta_X) X_i + (\beta_{W}v_i+ u_i)$$`
We get a bias
`$$\alpha_X = \beta_X + \beta_W \delta_X \neq \beta_X$$`
- The sign and magnitude of the bias depend on `\(\delta_X\)` and `\(\beta_W\)`
- The estimate is _inconsistent_ - increasing the sample size only gives a more precise estimate of `\(\beta_X + \beta_W\delta_X\)`

---

## Application to the test scores data

- Omitted variable - English learners `el_pct`
- Correlation between `el_pct` and `str` (see the sketch below)
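As a quick check of this correlation, a sketch assuming the California test score data is already loaded in Stata, with the variables `str` and `el_pct` used on the next slide:

```stata
* Sketch: how strongly are the two regressors related?
* (assumes the California test score data is in memory)
correlate str el_pct   // sample correlation between str and el_pct
regress el_pct str     // the slope is the delta_X in the bias formula
```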
---

## Test scores - Omitted variable equation

- Regressions of `testscr` on `str` with and without `el_pct`, and of `el_pct` on `str` (reproduced in the sketch below)

Regressor | (1) testscr | (2) el_pct | (3) testscr
--------- | ----------- | ---------- | -----------
str       | -1.101      | 1.814      | -2.280
el_pct    | -0.650      |            |
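A sketch of the three regressions behind the table, again assuming the test score data is loaded:

```stata
* The three regressions behind the table
regress testscr str el_pct   // (1): estimates beta_X and beta_W
regress el_pct str           // (2): estimates delta_X
regress testscr str          // (3): alpha_X = beta_X + beta_W * delta_X
```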
- Omitted variable equation `\(\alpha_X = \beta_X + \beta_W \delta_X\)`

```stata
. display -1.101 - 0.650 * 1.814
-2.2801
```

---

## [Tradeoff bias and precision](http://rstudio.sh.se/content/statistics05-figs#section-omitted)

```
===================================================
X                      0.994***          0.995***
                       (0.027)           (0.027)
W                      0.020
                       (0.028)
Constant               0.973***          0.973***
                       (0.025)           (0.025)
---------------------------------------------------
Observations           50                50
R2                     0.967             0.967
Adjusted R2            0.966             0.966
Residual Std. Error    0.177 (df = 47)   0.176 (df = 48)
===================================================
Note:                  *p<0.1; **p<0.05; ***p<0.01
```

---

## Omitted variable - Correlation

Inclusion/omission of `\(W\)` depends on the correlation with `\(X\)` and on whether `\(W\)` is in the population equation (a simulation sketch follows the table).

Correlation `\(X_i\)` and `\(W_i\)` | `\(\beta_W\)` | Included | Omitted
------------ | --- | ----------- | --------------
Uncorrelated | `\(\beta_W = 0\)` | |
Correlated | `\(\beta_W = 0\)` | More uncertain |
Uncorrelated | `\(\beta_W \neq 0\)` | | More uncertain
Correlated | `\(\beta_W \neq 0\)` | | __Biased and Inconsistent__
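The __Biased and Inconsistent__ cell can be checked with a small simulation; a minimal sketch in Stata, with all parameter values illustrative:

```stata
* Omitted variable bias: W correlated with X and beta_W != 0
clear
set seed 42
set obs 10000
generate x = rnormal()
generate w = 0.8 * x + rnormal()              // delta_X = 0.8
generate y = 1 + 0.5 * x + 2 * w + rnormal()  // beta_X = 0.5, beta_W = 2
regress y x w   // long regression: coefficient on x close to 0.5
regress y x     // short regression: close to 0.5 + 2 * 0.8 = 2.1
```

Increasing `set obs` does not remove the gap: the short-regression estimate converges to `\(\beta_X + \beta_W\delta_X\)`, which is the inconsistency derived on the earlier slides.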