Microeconometrics - Lecture 10

class: center, middle, inverse, title-slide

.title[
# Microeconometrics - Lecture 10
]
.subtitle[
## Discrete choice models
]
.author[
### Jonas Björnerstedt
]
.date[
### 2024-03-12
]

---

## Lecture Content

- Binary choice models

- Multinomial logit

---
class: inverse, center, middle

# Binary choice models

---
## Discrete choice models

- Buy/not buy

- Participate

- Used in various sciences (psychology, zoology, ...)

Observations can have two values

- Coded as 0 or 1

---
## Binary choice models

- Focus of lecture mostly on demand choice

- For an individual it is normally a discrete choice

- Not buying common action

- Most people do *not* buy a particular product

- Choices are based on utility

- Utility not observed

- Choose if utility of choice greater than not choosing

- Normalize utility of outside option to zero

---
## Individual Demand function

.pull-left[

<table style="text-align:center"><tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td>price</td><td>purchase</td></tr>
<tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">1</td><td>2.3</td><td>1</td></tr>
<tr><td style="text-align:left">2</td><td>3.4</td><td>1</td></tr>
<tr><td style="text-align:left">3</td><td>3.8</td><td>1</td></tr>
<tr><td style="text-align:left">4</td><td>4.1</td><td>1</td></tr>
<tr><td style="text-align:left">5</td><td>4.9</td><td>0</td></tr>
<tr><td style="text-align:left">6</td><td>5.2</td><td>0</td></tr>
<tr><td style="text-align:left">7</td><td>5.7</td><td>0</td></tr>
<tr><td style="text-align:left">8</td><td>6.2</td><td>0</td></tr>
<tr><td style="text-align:left">9</td><td>7.6</td><td>0</td></tr>
<tr><td style="text-align:left">10</td><td>7.7</td><td>0</td></tr>
<tr><td style="text-align:left">11</td><td>7.9</td><td>0</td></tr>
<tr><td style="text-align:left">12</td><td>8.1</td><td>0</td></tr>
<tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr></table>
]

.pull-right[

- Individual choices observed

- Dependent variable is discrete

- buy/not buy

- Dependent variable not continuous

- Takes two values

- Error not normally distributed

]

---
## Binary choice relationship

![](me10_files/figure-html/unnamed-chunk-2-1.png)

- Observed choices often show randomness

- Reflect some kind of uncertainty

- Can be due to unobserved variation in

    a. individual preferences
    
    b. differences in preferences between persons
    
    c. real randomness

---
## Linear model with binary choice

![](me10_files/figure-html/unnamed-chunk-3-1.png)

- Estimate the *expected* relationship

- Predicted value (line) is the probability of choosing

- Can give unreasonable outcomes

- Predicted probability `$Pr< 0$` or `$Pr> 1$`
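
As a rough illustration of the problem, the sketch below fits a least-squares line to the twelve observations from the individual demand table (price and purchase) and flags fitted values outside the unit interval. It is written in Python with the standard library only; the helper name `ols_line` is hypothetical.

```python
# Data from the "Individual Demand function" table.
prices = [2.3, 3.4, 3.8, 4.1, 4.9, 5.2, 5.7, 6.2, 7.6, 7.7, 7.9, 8.1]
purchase = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


intercept, slope = ols_line(prices, purchase)

# Fitted values of the linear probability model at the observed prices.
fitted = [intercept + slope * p for p in prices]

for p, f in zip(prices, fitted):
    flag = "  <- outside [0, 1]" if f < 0 or f > 1 else ""
    print(f"price {p:4.1f}: fitted probability {f:6.3f}{flag}")
```

With these data the fitted value at the lowest price lies slightly above 1 and the fitted values at the highest prices are negative, so they cannot be read as probabilities.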

---
## Nonlinear model of binary choice

![](me10_files/figure-html/unnamed-chunk-4-1.png)

- Choose nonlinear model with probabilities `$0\le Pr\le 1$`

- Predicted value (line) is the probability of choosing

---
## Discrete demand

- Discrete choice model of demand

- Consumer decides whether to buy or not

- Choice based on product characteristics

- Some factors increase utility: package size

- Some factors decrease utility: price

    - Less money to buy other goods

- Consumer maximizes utility

---
## Linear utility

Assume that for individual `$j$`

- utility is a linear function of characteristics `$X_j$`.

- Some characteristics (like price) affect utility negatively
    
    - Price is common to all individuals - no subscript
    
    - Here we omit the subscript `$i$` for observations for simpler notation

- some individual characteristics `$u_j$` not observed by the econometrician

- these characteristics are not correlated with the observed `$X_j$`

- the utility of not consuming is set to zero

Let utility depend on some characteristic `$X_j$` and price `$P$`
`$$U_j + u_j = \beta_X X_j - \beta_P P + u_j$$`
Then the consumer buys if `$U_j + u_j> 0$`

---
## Probit specification

- Probit: `$u_j$` has standard normal distribution

- Buy if: `$U_j  + u_j = \beta_X X_j - \beta_P P  + u_j > 0$`

- `$j$` buys if utility from observables is greater than the individual's unobservable disutility: `$U_j > -u_j = v_j$`

- Disutility `$v_j$` also has a standard normal distribution, since the distribution is symmetric around zero

- Buy for all `$v_j < U_j$`

- The probability that `$v_j$` is smaller than `$U_j$` is given by the CDF `$\Phi(U_j)$` of the standard normal distribution
`$$Pr(buy) = \Phi(U_j) = \Phi(\beta_X X_j - \beta_P P )$$`
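
The probit formula can be evaluated directly. A minimal Python sketch, assuming hypothetical coefficient values (`BETA_X`, `BETA_P`) that are not estimates from any data; `NormalDist` from the standard library supplies `$\Phi$`.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def probit_buy_probability(x, price, beta_x, beta_p):
    """Pr(buy) = Phi(beta_x * x - beta_p * price)."""
    observed_utility = beta_x * x - beta_p * price
    return std_normal.cdf(observed_utility)


# Hypothetical coefficients, chosen only for illustration.
BETA_X, BETA_P = 1.0, 0.5

for price in (1.0, 2.0, 4.0):
    pr = probit_buy_probability(x=1.0, price=price, beta_x=BETA_X, beta_p=BETA_P)
    print(f"price {price:.1f}: Pr(buy) = {pr:.3f}")
```

At the price where observed utility is exactly zero (here `price = 2.0`), the probability of buying is one half, and it falls as the price rises.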

---
## [Consumers with positive utility buy](http://rstudio.sh.se/content/me10-figs/#section-discrete-demand)

![](me10_files/figure-html/unnamed-chunk-5-1.png)

---
## Consumers have different valuation

![](me10_files/figure-html/unnamed-chunk-6-1.png)

- Share of consumers buying

- Alternatively: the probability of an individual consumer buying

- Consumers with a valuation higher than the price purchase

- Shape of curve depends on distribution of unobserved utility

---
## Distribution of unobservable utility

.pull-left[

- Logistic distribution `$\Rightarrow$` Logit demand

- Normal distribution `$\Rightarrow$` Probit demand
]
.pull-right[

Plot of logistic and normal pdf
![](me10_files/figure-html/unnamed-chunk-7-1.png)
]

???

Bunch of people with different valuations, centered around 0

---
## Similar demand with logit and probit

![](me10_files/figure-html/unnamed-chunk-8-1.png)

---
## Marginal effect with logit and probit

* The probability of buying with probit demand is given by

`$$Pr_{buy} = \Phi(\beta_X X_j - \beta_P P )$$`
* The marginal effect of a price change can be derived with the chain rule:
`$$- \beta_P \phi(\beta_X X_j - \beta_P P)$$`
where `$\phi()$` is the probability density function of the standard normal distribution

* To calculate the marginal effects of probit or logit estimates in practice, we use functions such as the `margins()` function in the `margins` package.
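
To see the chain-rule result at work, the following Python sketch compares the analytic marginal effect `$-\beta_P \phi(\cdot)$` with a finite-difference approximation. The coefficient and data values are hypothetical, and the function names are made up for this example; it does not reproduce what `margins()` computes on estimated models.

```python
from statistics import NormalDist

std_normal = NormalDist()


def probit_probability(x, price, beta_x, beta_p):
    """Pr(buy) = Phi(beta_x * x - beta_p * price)."""
    return std_normal.cdf(beta_x * x - beta_p * price)


def price_marginal_effect(x, price, beta_x, beta_p):
    """Analytic derivative of Pr(buy) with respect to price."""
    return -beta_p * std_normal.pdf(beta_x * x - beta_p * price)


# Hypothetical values, for illustration only.
x, price, beta_x, beta_p = 1.0, 3.0, 1.0, 0.5

analytic = price_marginal_effect(x, price, beta_x, beta_p)

# Central finite difference as a numerical check of the derivative.
h = 1e-5
numeric = (
    probit_probability(x, price + h, beta_x, beta_p)
    - probit_probability(x, price - h, beta_x, beta_p)
) / (2 * h)

print(f"analytic marginal effect: {analytic:.6f}")
print(f"numeric  marginal effect: {numeric:.6f}")
```

The marginal effect depends on where it is evaluated, because `$\phi$` changes with `$X_j$` and `$P$`; this is why it is reported at particular values or averaged over the sample.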

---
class: inverse, center, middle

# Multinomial choice models

---
## Multiple choices

- Individual chooses *one* alternative

- Not buying can be a choice

- Sometimes called the *outside good*

- Usually alternative 0

---
## Logit and Probit

- Logit and probit similar models

- Logit is simpler:

- Logit has analytic solution for probabilities

- Probit requires numerical integration

- With many choices probit is not used in practice

- Logit allows various extensions

---
## Logit

- **Multinomial logit** - `$X_j$` contains characteristics of the *individual*, with separate `$\beta$` parameters for each *choice*

- Each choice has the same `$X_j$` values for an individual `$j$`

- Example: Choice of education regressed on previous school, age, parents' income, ...

- **Conditional logit** - `$X$` has characteristics of *choice*, common `$\beta_X$` parameters for characteristics

- Different `$X$` values for different choices for individual `$j$`

- Example: Car choice regressed on price, horsepower, weight of car,...

- **Mixed logit** - Combination of the two

- Note that *Mixed logit* is also the name of a different model (also called *Random coefficients logit*) 
    
---
## Conditional logit demand

- Most common form of demand estimation

- Choices depend on price and product characteristics

- Utility of product `$k$` for individual `$j$`:
`$$\beta_X X_k - \beta_P P_k + u_{jk} = U_k + u_{jk}$$`

- Unobserved variation in individual preferences

- Creates reasonable substitution patterns

---
## Conditional logit choice

- Utility has common and individual components

- utility of good 1 given by: `$U_{1} + u_{j1}$`

- Individual `$j$` chooses alternative with highest utility:
`$$U_{1} + u_{j1} = \beta X_1 + u_{j1}$$`
`$$U_{2} + u_{j2} = \beta X_2 + u_{j2}$$`

- Choose product 1 if `$U_{1}+u_{j1} > U_{k}+u_{jk}$` for all `$k\neq 1$`

- In other words if: `$U_{1} - U_{k} > u_{j1} - u_{jk}$`

- Product 0 has utility 0

---
## Logit error term

.pull-left[
- Error assumed to have *double exponential distribution*

- Cumulative distribution function:
`$$e^{-e^{-x}}$$`
]

.pull-right[
![](me10_files/figure-html/unnamed-chunk-9-1.png)

]

---
## Double exponential and logistic distribution

- Binary choice logit has logistic distribution

- The difference between two independent double exponential variables has a logistic distribution

- Same model:

- logistic has zero utility for nonpurchase

- same as if it had double exponential with zero mean

- The difference will have logistic distribution
    
    - Similar to normal distribution (see earlier slide for binary choice)
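
The link between the two distributions can be checked by simulation. The Python sketch below draws pairs of independent double exponential variables by inverting the CDF `$e^{-e^{-x}}$` and compares the empirical distribution of their difference with the logistic CDF. The seed, sample size and function names are arbitrary choices for this example.

```python
import math
import random


def draw_double_exponential(rng):
    """Inverse-CDF draw from F(x) = exp(-exp(-x))."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def logistic_cdf(x):
    return 1.0 / (1.0 + math.exp(-x))


rng = random.Random(0)
n_draws = 100_000

# Difference between two independent double exponential draws.
differences = [
    draw_double_exponential(rng) - draw_double_exponential(rng)
    for _ in range(n_draws)
]


def empirical_cdf(x):
    """Share of simulated differences that are at most x."""
    return sum(d <= x for d in differences) / n_draws


for x in (-1.0, 0.0, 1.0, 2.0):
    print(f"x = {x:4.1f}: simulated {empirical_cdf(x):.3f}, logistic {logistic_cdf(x):.3f}")
```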

---
## Choice probabilities

- The probabilities of buying each good are given by:
`$$Pr_{1}=\frac{e^{U_{1}}}{e^{U_{0}}+e^{U_{1}}+e^{U_{2}}} = \frac{e^{\beta X_{1}}}{1+e^{\beta X_{1}}+e^{\beta X_{2}}}$$`
`$$Pr_{2}=\frac{e^{U_{2}}}{e^{U_{0}}+e^{U_{1}}+e^{U_{2}}}=\frac{ e^{\beta X_{2}}}{1+e^{\beta X_{1}}+e^{\beta X_{2}}}$$`
`$$Pr_{0}=\frac{e^{U_{0}}}{e^{U_{0}}+e^{U_{1}}+e^{U_{2}}}=\frac{1}{1+e^{\beta X_{1}}+e^{\beta X_{2}}}$$`

* As `$U_0$` has been normalized to zero, we have `$e^{U_0} = e^0 = 1$`

- Given that unexplained utility has double exponential distribution

- Derivation of equations does not help in intuition
    
* Note that the denominator for all three expressions is the same.
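
The three expressions share one denominator, so they can be computed in a single pass. A minimal Python sketch, with hypothetical utility values; the function name `logit_probabilities` is made up for this example.

```python
import math


def logit_probabilities(utilities):
    """Choice probabilities for goods 1..K plus the outside good 0.

    `utilities` holds U_1, ..., U_K; the outside good has U_0 = 0.
    Returns [Pr_0, Pr_1, ..., Pr_K].
    """
    all_utilities = [0.0] + list(utilities)
    exp_u = [math.exp(u) for u in all_utilities]
    denominator = sum(exp_u)  # 1 + e^{U_1} + ... + e^{U_K}
    return [e / denominator for e in exp_u]


# Hypothetical utilities for goods 1 and 2.
probabilities = logit_probabilities([1.0, -0.5])

for k, pr in enumerate(probabilities):
    print(f"Pr_{k} = {pr:.3f}")
print(f"sum = {sum(probabilities):.3f}")
```

The probabilities sum to one by construction, and a good with utility equal to the outside good's (zero) gets the same probability as not buying.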
    
---
## Independence of irrelevant alternatives

- Relative probability of two alternatives depends only on their characteristics
`$$\frac{Pr_{1}}{Pr_{2}}=\frac{e^{\beta X_{1}}}{e^{\beta X_{2}}}$$`

- Relative probabilities do not depend on properties of other goods!
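
A small numerical check of this property, as a Python sketch with hypothetical utilities: the utility of a third good is varied while goods 1 and 2 are held fixed, and the ratio `$Pr_1/Pr_2$` is recomputed each time.

```python
import math


def logit_probabilities(utilities):
    """Logit probabilities for a list of utilities."""
    exp_u = [math.exp(u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Hypothetical utilities: outside good, good 1, good 2.
U0, U1, U2 = 0.0, 1.2, 0.4

expected_ratio = math.exp(U1 - U2)

for u3 in (-2.0, 0.0, 3.0):
    pr = logit_probabilities([U0, U1, U2, u3])
    print(
        f"U_3 = {u3:4.1f}: Pr_1 = {pr[1]:.3f}, Pr_2 = {pr[2]:.3f}, "
        f"Pr_1/Pr_2 = {pr[1] / pr[2]:.4f}"
    )
```

Both probabilities fall as the third good becomes more attractive, but their ratio stays at `$e^{U_1 - U_2}$`.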

---
## Blue bus red bus problem

- Assume that there are two alternatives, travelling by bus or car

- Assume that not travelling is not an option

- Assume that consumers value them equally: `$Pr_{1} = Pr_{2}$`

- Each has probability 1/2

- Split the bus alternative in two, red bus and blue bus

- Assume that all relevant properties are the same

- Then car and each bus alternative will get 1/3 of the customers!
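
The arithmetic behind this example, as a short Python sketch. All alternatives are given the same utility (set to zero here as an arbitrary normalization), and the function name `logit_shares` is made up for this example.

```python
import math


def logit_shares(utilities):
    """Map {alternative: utility} to logit choice probabilities."""
    exp_u = {name: math.exp(u) for name, u in utilities.items()}
    total = sum(exp_u.values())
    return {name: e / total for name, e in exp_u.items()}


# Two equally valued alternatives, no outside option.
before = logit_shares({"car": 0.0, "bus": 0.0})

# The bus alternative is split into two identical alternatives.
after = logit_shares({"car": 0.0, "red bus": 0.0, "blue bus": 0.0})

print("before:", {k: round(v, 3) for k, v in before.items()})
print("after: ", {k: round(v, 3) for k, v in after.items()})
```

The car share drops from 1/2 to 1/3 merely because the bus was repainted, whereas one would expect the two buses to split the original bus share. This is the motivation for allowing correlated unobserved utility, as in the nested logit.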

---
## Nested logit

- Unobserved utility more or less correlated between products

- Assume structure on correlation

- Create groups

- Example cars: sports cars, SUVs, station wagons, ...

- Nests within nests possible

- Equations of probabilities `$Pr_k$` are more complicated

---
## Market data

- Probabilities correspond to shares (many consumers)

- Non-purchases not observed

- Share of outside good not observed

- Make assumption on market size including non-purchase

- Calculate shares relative to this total market

- Demand estimates not that sensitive to market size assumption

- Transform to linear model
`$$\log (s_{k}/s_{0}) = \beta X_k  + \varepsilon_k$$`

- The error term `$\varepsilon$` corresponds to unobservables common to all consumers in the market.
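
The left-hand side of the linear model can be built from sales data once a market size is assumed. A Python sketch under stated assumptions: the quantities and market sizes are hypothetical, and `log_share_ratios` is a made-up helper name.

```python
import math


def log_share_ratios(quantities, market_size):
    """Return log(s_k / s_0) for each product k.

    s_k = q_k / market_size, and s_0 is the share of the assumed
    market that buys none of the products.
    """
    total_sold = sum(quantities)
    if total_sold >= market_size:
        raise ValueError("market size must exceed total sales")
    outside_share = 1.0 - total_sold / market_size
    return [math.log((q / market_size) / outside_share) for q in quantities]


# Hypothetical sales of two products.
quantities = [200.0, 100.0]

for market_size in (1000.0, 2000.0):
    deltas = log_share_ratios(quantities, market_size)
    print(
        f"market size {market_size:6.0f}: "
        f"log(s_1/s_0) = {deltas[0]:.3f}, log(s_2/s_0) = {deltas[1]:.3f}, "
        f"difference = {deltas[0] - deltas[1]:.3f}"
    )
```

In this single-market sketch, changing the assumed market size shifts both dependent variables by the same amount, while their difference stays at `$\log(q_1/q_2)$`.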

---
class: inverse, center, middle

# Experiments and Quasi experiments

---
## Solution 3: Experiment

1.  Programs and treatments

- Random selection in treatment and control groups

- Independent of individual characteristics

2.  Economic experiments

- Choice experiments

- Vary parameters exogenously

- Note that individual market data does not solve endogeneity

- Market effect is sum of individual effects, all correlated

- Individuals have to be given random prices

---
## Endogeneity and causality

- What is the effect of a treatment?

- Example: How effective is hospital treatment?

- Problem: individual chooses whether to get hospital treatment
    
    - Hospital choice more likely if individual really is ill
    
    - Not all individual health parameters are observed
    
    - Especially for those who do not go to the hospital

- Problems with identifying causality

    1.  Only the chosen alternative is observed
    
        - The alternative not chosen (the *counterfactual*) has to be inferred
    
    2.  Data is often on aggregate choices
    
        - Individuals have different preferences

---
## Experiment

- Assign "treatment" randomly to subjects

- Study effect of treatment

- Include additional regressors

- Can estimate _average treatment effect_

- Average causal effect in the population

---
## Threats to validity

1. Failure to follow the treatment protocol

- Do the subjects follow the treatment assigned?
    
2. Attrition

- Do subjects drop out?

- The decision to drop out can depend on the expected outcome for the individual

- Unobservable to us

- Solution: Instrumental variables!

- With data on both assignment and treatment, use assignment as an instrument for treatment
    
    - Assignment is exogenous
    
    - Assignment explains participation in part
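
A simulated illustration of using assignment as an instrument, as a Python sketch. Everything in it is an assumption made for the example: the data-generating process, the constant treatment effect of 2, the rule that only healthier subjects follow the protocol, and the variable names. With one binary instrument and one binary treatment, the IV estimate reduces to the Wald ratio computed below.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


rng = random.Random(1)
TRUE_EFFECT = 2.0  # assumed constant treatment effect

rows = []
for _ in range(20_000):
    assigned = rng.random() < 0.5   # random assignment
    health = rng.gauss(0.0, 1.0)    # unobserved by the researcher
    # Only sufficiently healthy assigned subjects follow the protocol.
    treated = assigned and health > -0.5
    outcome = 1.0 + TRUE_EFFECT * treated + health + rng.gauss(0.0, 1.0)
    rows.append((assigned, treated, outcome))

# Naive comparison of treated and untreated: biased by self-selection.
naive = mean(y for _, d, y in rows if d) - mean(y for _, d, y in rows if not d)

# Effect of assignment on the outcome and on actual treatment.
effect_on_outcome = mean(y for z, _, y in rows if z) - mean(y for z, _, y in rows if not z)
effect_on_treatment = mean(d for z, d, _ in rows if z) - mean(d for z, d, _ in rows if not z)

# Wald (instrumental variables) estimate.
wald = effect_on_outcome / effect_on_treatment

print(f"true effect:    {TRUE_EFFECT:.2f}")
print(f"naive estimate: {naive:.2f}")
print(f"IV estimate:    {wald:.2f}")
```

In this setup the naive comparison overstates the effect because treated subjects are healthier on average, while the ratio based on random assignment is close to the assumed effect.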

---
## Empirical exercise

The exercise should be written individually, even if done in a group.

Do the following empirical exercises in Stock and Watson, Updated 3rd ed. (4th edition page numbers in parentheses):

* Exercise E8.2, page 354 (323)
* Exercise E10.2, page 424 (386)
* Exercise E11.2, page 462 (419)
* Exercise E12.1, page 510 (464)
* Exercise E13.1, page 563 (509)

The exercise has to be done in Rmarkdown and turned in no later than Sunday March 28, 2021.