class: center, middle, inverse, title-slide

# Microeconometrics - Lecture 9
## Nonlinear models
### Jonas Björnerstedt
### 2022-03-15

---

## Lecture Content

- Chapter 8: Nonlinear models
  - Marginal effects
  - Censored data
- Chapter 9: Assessing studies based on multiple regressions

---

## Quadratic model

.pull-left[
![](me09_files/figure-html/unnamed-chunk-1-1.png)<!-- -->
]

.pull-right[
- Quadratic models can be estimated linearly: `\(Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + u_i\)`
- Care must be taken in interpreting `\(\beta_1\)` and `\(\beta_2\)`, however
- `\(\beta_1\)` is no longer the marginal effect of an increase in `\(X_i\)`
]

---

## Marginal effect

Given a true model

$$E(Y_i|X_i) = \beta_0 + \beta_1 X_i + \beta_2 X_i^2$$

the effect of a marginal increase in `\(X_i\)` is

$$\frac{\partial E(Y_i|X_i)}{\partial X_i} = \beta_1 + 2 \beta_2 X_i$$

The effect depends on the value of `\(X_i\)`

---

## Logarithmic model

- Logarithms are often used
  - Often motivated by theory (constant elasticity of substitution)
- `\(\beta_1\)` has a different interpretation
  - a percentage change in the logged variable

1. Logarithmic in `\(X_i\)` (sometimes called Linear-Log)
`$$Y_i = \beta_0 + \beta_1 \log(X_i) + u_i$$`
2. Log-Linear model
`$$\log(Y_i) = \beta_0 + \beta_1 X_i + u_i$$`
3. Log-Log model
`$$\log(Y_i) = \beta_0 + \beta_1 \log(X_i) + u_i$$`

---

## Exponential model

.pull-left[
![](me09_files/figure-html/unnamed-chunk-2-1.png)<!-- -->
]

.pull-right[
- Some nonlinear models can be estimated linearly
`$$Y_i = \exp ( \beta_0 + \beta_1 X_i + u_i)$$`
- It can be transformed to the linear model
`$$\log (Y_i) = \beta_0 + \beta_1 X_i + u_i$$`
- Sometimes used for models with a nonnegative dependent variable
]

---

## Log-log model

.pull-left[
![](me09_files/figure-html/unnamed-chunk-3-1.png)<!-- -->
]

.pull-right[
- Some nonlinear models can be estimated linearly
`$$Y_i = \beta_0 X_i^{\beta_1} u_i$$`
- It can be transformed to the linear model
`$$\log(Y_i) = \log(\beta_0) + \beta_1 \log(X_i) + \log(u_i)$$`
- Example: Cobb-Douglas
]

---

## Interaction terms

- With several regressors, products of regressors can be used
`$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 W_i + \beta_3 X_i W_i + u_i$$`
- Can be motivated as an approximation _(linearization)_ of a nonlinear model
- Include both linear and product terms
- The marginal effect of an increase in `\(X_i\)` depends on the level of `\(W_i\)`:

$$\frac{\partial E(Y_i|X_i, W_i)}{\partial X_i} = \beta_1 + \beta_3 W_i$$

---

## Interactions with dummy variables

Consider the regression of individual earnings on work experience and having a college degree
`$$earn_i = \beta_0 + \beta_1 exper_i + \beta_2 degree_i + u_i$$`

- We can allow different marginal effects of experience for those having a degree:
`$$earn_i = \beta_0 + \beta_1 exper_i + \beta_2 degree_i + \beta_3 degree_i \cdot exper_i + u_i$$`
- `\(\beta_3\)` captures the change in slope due to a college degree
- The marginal effect of experience on earnings is:
  - Without a degree: `\(\beta_1\)`
  - With a degree: `\(\beta_1 + \beta_3\)`

---

## Same or different coefficients

`$$earn_i = \beta_0 + \beta_1 exper_i + \beta_2 degree_i + \beta_3 degree_i \cdot exper_i + u_i$$`

- The model above is the same as estimating separately for those with a degree and those without
- Both intercept
and the effect of experience can differ between the two categories
- More generally, interaction terms allow specified coefficients to differ across groups

---

## Linear estimation of nonlinear model

.pull-left[
![](me09_files/figure-html/unnamed-chunk-4-1.png)<!-- -->
]

.pull-right[
- The linear model is sometimes
  - reasonably close to the true model
  - similar in average marginal effect
  - simpler
- It can *also* be reported alongside the nonlinear results
]

---

## Testing nonlinearity

- In general one cannot compare two model specifications
- But models can be compared if one model is _nested_ in another
  - a special case of the other model
- Nest the linear model in a nonlinear model
  - test whether the nonlinear parameters are nonzero
- Test linearity against the quadratic model with `\(H_0: \beta_2 = 0\)` in: `\(Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + u_i\)`

---

## Plotting residuals

- Given the regression assumptions, residuals should look like white noise
- There should be no patterns in `\(\hat u_i\)`
  - plot them against the regressors
- `\(\hat u_i\)` does not have to be normally distributed, of course!

---

## Which marginal effects?

- One purpose of empirical analysis is to study what happens if we were to change an explanatory variable
- *Marginal effect at the mean*
  - effect at the mean values `\(\bar X\)` of the regressors `\(X_i\)`
- *Average marginal effect*
  - average effect on the dependent variable of a small increase in the independent variable
  - also called *average partial effect*
  - the average effect depends on the distribution of the regressor (e.g. wages)

---

## Quadratic model again

.pull-left[
- Quadratic model: `\(Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + u_i\)`
- Conditional expectation: `\(E(Y_i|X_i) = \beta_0 + \beta_1 X_i + \beta_2 X_i^2\)`
- The effect of a marginal increase in `\(X_i\)` is: `\(\frac{\partial E(Y_i|X_i)}{\partial X_i} = \beta_1 + 2 \beta_2 X_i\)`
- The marginal effect at the mean `\(\bar X\)` is `\(\beta_1 + 2 \beta_2 \bar X\)`
]

.pull-right[
![](me09_files/figure-html/unnamed-chunk-5-1.png)<!-- -->
]

---

## Average Marginal Effect

.pull-left[
- The marginal effect depends on `\(X_i\)`
- Calculate the average marginal effect by:
  1. Calculating the marginal effect for each observation: `\(\beta_1 + 2 \beta_2 X_i\)`
  2. Taking the average of the individual marginal effects
]

.pull-right[
![](me09_files/figure-html/unnamed-chunk-6-1.png)<!-- -->
]

---

## Nonlinear estimation

- Some nonlinear models can be estimated with linear methods
- Linear estimation
  - analysis of results is more complex than with a linear model
- Nonlinear estimation
  - use maximum likelihood (ML)
  - or the generalized method of moments (GMM)

---

class: inverse, center, middle

# Censored data

---

## Truncation and censoring

- Censoring: the dependent variable is limited at a threshold
- Truncation: observations beyond the threshold are dropped

Can be due to either

- A limit on actions: e.g. demand
  - negative demand is not possible
  - *Corner solution* models
- A limit in the data: e.g. a top income category
  - data above the limit exist but are not reported
  - *Sample selection* models

---

## Selection

- Sample selection
  - creating sample data entails selection
  - e.g. a telephone survey
- Self selection
  - the individual decides whether to participate
  - e.g.
who answers/participates in the telephone survey

When and how is this important?

- Do we want to make inferences about the sample or about the whole population?

---

## What happens with censoring

- Data and the true relationship
  - `\(Y_i = 3 + X_i + u_i\)`

![](me09_files/figure-html/unnamed-chunk-7-1.png)<!-- -->

- Consider what happens if data for `\(Y_i > 10\)` are limited

---

## Bias of censoring

- The slope is decreased: censored dependent values are lower
  - `\(Y_i = \min (3 + X_i + u_i, 10)\)`

![](me09_files/figure-html/unnamed-chunk-8-1.png)<!-- -->

---

## Bias of truncation

- Truncation is even more problematic:
  - less data

![](me09_files/figure-html/unnamed-chunk-9-1.png)<!-- -->

---

## Selection on exogenous variable

- Estimation with exogenous selection is unproblematic:
  - consistent (but with less data)

![](me09_files/figure-html/unnamed-chunk-10-1.png)<!-- -->

---

## Latent variables

- With censoring we observe
`$$\tilde Y_i = \min (3 + X_i + u_i, 10)$$`
rather than the variable `\(Y_i = 3 + X_i + u_i\)`
- Estimation is on the wrong variable: `\(\tilde Y_i\)` rather than `\(Y_i\)`
- The unobserved variable `\(Y_i\)` is called a *latent variable*
- Censoring can be due to
  - limits on observations
  - limits in actual data (nonnegative wages)

---

## Nonlinear estimation

- Labor supply
  - zero supply for low wages
  - negative supply is not possible
- The conditional expectation is a nonlinear function of the wage
  - almost horizontal for low wages

![](me09_files/figure-html/unnamed-chunk-11-1.png)<!-- -->

---

## Estimation technique

- Take the censoring/truncation limit into account
- Assume that the errors have a normal distribution
  - the censored/truncated part is unobserved, so the distributional assumption cannot be checked there
- Homoscedastic errors are also assumed

---

## Selection correlated with observables

Example:

- Truman election: telephone survey of political preferences
- The wealthy were more likely to have a telephone
- The wealthy were also more likely to vote for Dewey rather than Truman
- Wealth is unobserved but correlated with
  - selection
  - the dependent variable
- **Bias arises only when wealth is not included as a regressor!**

---

## Tobit assumes normal errors

- Model with nonnormal errors: `\(Y_i = 3 + X_i + u_i\)`
  - `\(u_i\)`: 3 with probability 0.8, 0 otherwise
- Tobit regression shows bias: 0.837 < 1

---

class: inverse, center, middle

# Causation

---

## Correlation is not causation

Correlation between `\(X_i\)` and `\(Y_i\)` can be due to

1. `\(X_i\)` causes `\(Y_i\)`
2. `\(Y_i\)` causes `\(X_i\)`
3. `\(X_i\)` causes `\(Y_i\)` and `\(Y_i\)` causes `\(X_i\)`
  - *self-reinforcing system* / *simultaneity*
4. `\(W_i\)` causes both `\(X_i\)` and `\(Y_i\)`
  - *spurious relationship*
  - `\(W_i\)` is a *confounding factor* / *lurking variable*
  - `\(W_i\)` is often time
5. `\(X_i\)` and `\(Y_i\)` are independent
  - *coincidence in data*
  - if you look long enough you will find patterns

---

## Spurious correlation (Pearson)

- Normalizing `\(X_i\)` and `\(Y_i\)` by `\(Z_i\)` causes `\(X_i\)` and `\(Y_i\)` to become correlated
  - example: dividing by population to get per capita data
- *Spurious correlation* is sometimes used more generally
  - “Spurious correlation”: <http://www.tylervigen.com/>

---

## [Coincidence in data](http://rstudio.sh.se/content/me/me09-figs/)

![](me09_files/figure-html/unnamed-chunk-12-1.png)<!-- -->

---

## Next Lecture

- Chapter 11: Binary dependent variables
- Chapter 13: Experiments and quasi-experiments
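
---

## Appendix: simulating censoring bias

The downward slope bias from censoring can be reproduced with a short simulation of the model used in the censoring slides, `\(Y_i = 3 + X_i + u_i\)` censored at 10. This is a minimal sketch in Python with numpy (the R code behind the figures is not shown in the slides, so the regressor range and seed here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(0, 10, n)          # regressor (assumed range)
u = rng.normal(0, 1, n)            # error term
y = 3 + x + u                      # latent variable Y_i
y_cens = np.minimum(y, 10)         # censored dependent variable

def ols_slope(x, y):
    """OLS slope from a regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

slope_full = ols_slope(x, y)       # close to the true slope 1
slope_cens = ols_slope(x, y_cens)  # biased towards zero
```

Regressing the censored variable `y_cens` on `x` gives a slope well below the true value of 1, matching the flattened fitted line in the censoring figure.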