class: center, middle, inverse, title-slide

# Microeconometrics - Lecture 9
## Nonlinear models
### Jonas Björnerstedt
### 2022-03-15

---

## Lecture Content

- Chapter 8: Nonlinear models
  - Marginal effects
  - Censored data
- Chapter 9: Assessing studies based on multiple regressions

---

## Quadratic model

.pull-left[
![](me09_files/figure-html/unnamed-chunk-1-1.png)<!-- -->
]

.pull-right[
- Quadratic models can be estimated linearly: `\(Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + u_i\)`
- Care must be taken in interpreting `\(\beta_1\)` and `\(\beta_2\)`, however
- `\(\beta_1\)` is no longer the marginal effect of an increase in `\(X_i\)`
]

---

## Marginal effect

Given a true model

$$E(Y_i|X_i) = \beta_0 + \beta_1 X_i + \beta_2 X_i^2$$

the effect of a marginal increase in `\(X_i\)` is

$$\frac{\partial E(Y_i|X_i)}{\partial X_i} = \beta_1 + 2 \beta_2 X_i$$

The effect depends on the value of `\(X_i\)`

---

## Logarithmic model

- Logarithms are often used
  - Often motivated by theory (constant elasticity of substitution)
- `\(\beta_1\)` has a different interpretation
  - a percentage change in the logged variable

1. Logarithmic in `\(X_i\)` (sometimes called Linear-Log)
`$$Y_i = \beta_0 + \beta_1 \log(X_i) + u_i$$`
2. Log-Linear model
`$$\log(Y_i) = \beta_0 + \beta_1 X_i + u_i$$`
3. Log-Log model
`$$\log(Y_i) = \beta_0 + \beta_1 \log(X_i) + u_i$$`

---

## Exponential model

.pull-left[
![](me09_files/figure-html/unnamed-chunk-2-1.png)<!-- -->
]

.pull-right[
- Some nonlinear models can be estimated linearly
`$$Y_i = \exp ( \beta_0 + \beta_1 X_i + u_i)$$`
- It can be transformed to the linear model
`$$\log (Y_i) = \beta_0 + \beta_1 X_i + u_i$$`
- Sometimes used for models with a nonnegative dependent variable
]

---

## Log-log model

.pull-left[
![](me09_files/figure-html/unnamed-chunk-3-1.png)<!-- -->
]

.pull-right[
- Some nonlinear models can be estimated linearly
`$$Y_i = \beta_0 X_i^{\beta_1} u_i$$`
- It can be transformed to the linear model
`$$\log(Y_i) = \log(\beta_0) + \beta_1 \log(X_i) + \log(u_i)$$`
- Example: Cobb-Douglas
]

---

## Interaction terms

- With several regressors, products of regressors can be used
`$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 W_i + \beta_3 X_i W_i + u_i$$`
- Can be motivated as an approximation _(linearization)_ of a nonlinear model
- Include both linear and product terms
- The marginal effect of an increase in `\(X_i\)` depends on the level of `\(W_i\)`:

$$\frac{\partial E(Y_i|X_i, W_i)}{\partial X_i} = \beta_1 + \beta_3 W_i$$

---

## Interactions with dummy variables

Consider the regression of individual earnings on work experience and having a college degree
`$$earn_i = \beta_0 + \beta_1 exper_i + \beta_2 degree_i + u_i$$`

- We can allow different marginal effects of experience for those having a degree:
`$$earn_i = \beta_0 + \beta_1 exper_i + \beta_2 degree_i + \beta_3 degree_i \cdot exper_i + u_i$$`
- `\(\beta_3\)` captures the change in slope due to a college degree
- The marginal effect of experience on earnings is:
  - Without a degree: `\(\beta_1\)`
  - With a degree: `\(\beta_1 + \beta_3\)`

---

## Same or different coefficients

`$$earn_i = \beta_0 + \beta_1 exper_i + \beta_2 degree_i + \beta_3 degree_i \cdot exper_i + u_i$$`

- The model above is the same as estimating separately for those with a degree and those without
- Both intercept
and the effect of experience can differ between the two categories
- More generally, interaction terms allow specified coefficients to differ across groups

---

## Linear estimation of nonlinear model

.pull-left[
![](me09_files/figure-html/unnamed-chunk-4-1.png)<!-- -->
]

.pull-right[
- The linear model is sometimes
  - reasonably close to the true model
  - similar in average marginal effect
  - simpler
- It can *also* be reported alongside the nonlinear results
]

---

## Testing nonlinearity

- In general one cannot compare two model specifications
- But models can be compared if one model is _nested_ in another
  - a special case of the other model
- Nest the linear model in a nonlinear model
  - test whether the nonlinear parameters are nonzero
- Test linearity against the quadratic model with `\(H_0: \beta_2 = 0\)` in: `\(Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + u_i\)`

---

## Plotting residuals

- Given the regression assumptions, residuals should look like white noise
- There should be no patterns in `\(\hat u_i\)`
  - plot them against the regressors
- `\(\hat u_i\)` does not have to be normally distributed, of course!

---

## Which marginal effects?

- One purpose of empirical analysis is to study what happens if we were to change an explanatory variable
- *Marginal effect at the mean*
  - effect at the mean values `\(\bar X\)` of the regressors `\(X_i\)`
- *Average marginal effect*
  - average effect on the dependent variable of a small increase in the independent variable
  - also called *average partial effect*
  - the average effect depends on the distribution of the regressor (e.g. wages)

---

## Quadratic model again

.pull-left[
- Quadratic model: `\(Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + u_i\)`
- Conditional expectation: `\(E(Y_i|X_i) = \beta_0 + \beta_1 X_i + \beta_2 X_i^2\)`
- The effect of a marginal increase in `\(X_i\)` is: `\(\frac{\partial E(Y_i|X_i)}{\partial X_i} = \beta_1 + 2 \beta_2 X_i\)`
- The marginal effect at the mean `\(\bar X\)` is `\(\beta_1 + 2 \beta_2 \bar X\)`
]

.pull-right[
![](me09_files/figure-html/unnamed-chunk-5-1.png)<!-- -->
]

---

## Average Marginal Effect

.pull-left[
- The marginal effect depends on `\(X_i\)`
- Calculate the average marginal effect by:
  1. Calculating the marginal effect for each observation: `\(\beta_1 + 2 \beta_2 X_i\)`
  2. Taking the average of the individual marginal effects
]

.pull-right[
![](me09_files/figure-html/unnamed-chunk-6-1.png)<!-- -->
]

---

## Nonlinear estimation

- Some nonlinear models can be estimated with linear methods
- Linear estimation
  - analysis of results is more complex than with a linear model
- Nonlinear estimation
  - use maximum likelihood (ML)
  - or the generalized method of moments (GMM)

---

class: inverse, center, middle

# Censored data

---

## Truncation and censoring

- Censoring: the dependent variable is limited at a threshold
- Truncation: observations beyond the threshold are dropped

Can be due to either

- A limit on actions: e.g. demand
  - negative demand is not possible
  - *Corner solution* models
- A limit in the data: e.g. a top income category
  - data above the limit exist but are not reported
  - *Sample selection* models

---

## Selection

- Sample selection
  - creating sample data entails selection
  - e.g. a telephone survey
- Self selection
  - the individual decides whether to participate
  - e.g.
who answers/participates in the telephone survey

When and how is this important?

- Do we want to make inferences about the sample or about the whole population?

---

## What happens with censoring

- Data and the true relationship
  - `\(Y_i = 3 + X_i + u_i\)`

![](me09_files/figure-html/unnamed-chunk-7-1.png)<!-- -->

- Consider what happens if data for `\(Y_i > 10\)` are limited

---

## Bias of censoring

- The slope is decreased: censored dependent values are lower
  - `\(Y_i = \min (3 + X_i + u_i, 10)\)`

![](me09_files/figure-html/unnamed-chunk-8-1.png)<!-- -->

---

## Bias of truncation

- Truncation is even more problematic:
  - less data

![](me09_files/figure-html/unnamed-chunk-9-1.png)<!-- -->

---

## Selection on exogenous variable

- Estimation with exogenous selection is unproblematic:
  - consistent (but with less data)

![](me09_files/figure-html/unnamed-chunk-10-1.png)<!-- -->

---

## Latent variables

- With censoring we observe
`$$\tilde Y_i = \min (3 + X_i + u_i, 10)$$`
rather than the variable `\(Y_i = 3 + X_i + u_i\)`
- Estimation is on the wrong variable: `\(\tilde Y_i\)` rather than `\(Y_i\)`
- The unobserved variable `\(Y_i\)` is called a *latent variable*
- Censoring can be due to
  - limits on observations
  - limits in actual data (nonnegative wages)

---

## Nonlinear estimation

- Labor supply
  - zero supply for low wages
  - negative supply is not possible
- The conditional expectation is a nonlinear function of the wage
  - almost horizontal for low wages

![](me09_files/figure-html/unnamed-chunk-11-1.png)<!-- -->

---

## Estimation technique

- Take the censoring/truncation limit into account
- Assume that the errors have a normal distribution
  - the censored/truncated part is unobserved, so the distributional assumption cannot be checked there
- Homoscedastic errors are also assumed

---

## Selection correlated with observables

Example:

- Truman election: telephone survey of political preferences
- The wealthy were more likely to have a telephone
- The wealthy were also more likely to vote for Dewey rather than Truman
- Wealth is unobserved but correlated with
  - selection
  - the dependent variable
- **Bias arises only when wealth is not included as a regressor!**

---

## Tobit assumes normal errors

- Model with nonnormal errors: `\(Y_i = 3 + X_i + u_i\)`
  - `\(u_i\)`: 3 with probability 0.8, 0 otherwise
- Tobit regression shows bias: 0.837 < 1

---

class: inverse, center, middle

# Causation

---

## Correlation is not causation

Correlation between `\(X_i\)` and `\(Y_i\)` can be due to

1. `\(X_i\)` causes `\(Y_i\)`
2. `\(Y_i\)` causes `\(X_i\)`
3. `\(X_i\)` causes `\(Y_i\)` and `\(Y_i\)` causes `\(X_i\)`
  - *self-reinforcing system* / *simultaneity*
4. `\(W_i\)` causes both `\(X_i\)` and `\(Y_i\)`
  - *spurious relationship*
  - `\(W_i\)` is a *confounding factor* / *lurking variable*
  - `\(W_i\)` is often time
5. `\(X_i\)` and `\(Y_i\)` are independent
  - *coincidence in data*
  - if you look long enough you will find patterns

---

## Spurious correlation (Pearson)

- Normalizing `\(X_i\)` and `\(Y_i\)` by `\(Z_i\)` causes `\(X_i\)` and `\(Y_i\)` to become correlated
  - example: dividing by population to get per capita data
- *Spurious correlation* is sometimes used more generally
  - “Spurious correlation”: <http://www.tylervigen.com/>

---

## [Coincidence in data](http://rstudio.sh.se/content/me/me09-figs/)

![](me09_files/figure-html/unnamed-chunk-12-1.png)<!-- -->

---

## Next Lecture

- Chapter 11: Binary dependent variables
- Chapter 13: Experiments and quasi-experiments
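
---

## Appendix: simulating censoring bias

The downward slope bias from censoring can be reproduced with a short simulation of the model used in the censoring slides, `\(Y_i = 3 + X_i + u_i\)` censored at 10. This is a minimal sketch in Python with numpy (the R code behind the figures is not shown in the slides, so the regressor range and seed here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(0, 10, n)          # regressor (assumed range)
u = rng.normal(0, 1, n)            # error term
y = 3 + x + u                      # latent variable Y_i
y_cens = np.minimum(y, 10)         # censored dependent variable

def ols_slope(x, y):
    """OLS slope from a regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

slope_full = ols_slope(x, y)       # close to the true slope 1
slope_cens = ols_slope(x, y_cens)  # biased towards zero
```

Regressing the censored variable `y_cens` on `x` gives a slope well below the true value of 1, matching the flattened fitted line in the censoring figure.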