class: center, middle, inverse, title-slide

.title[
# Microeconometrics - Lecture 10
]
.subtitle[
## Discrete choice models
]
.author[
### Jonas Björnerstedt
]
.date[
### 2024-03-12
]

---

## Lecture Content

- Binary choice models
- Multinomial logit

---
class: inverse, center, middle

# Binary choice models

---

## Discrete choice models

- Buy/not buy
- Participate
- Used in various sciences (psychology, zoology, ...)

Observations can have two values

- Coded as 0 or 1

---

## Binary choice models

- Focus of lecture mostly on demand choice
    - For an individual it is normally a discrete choice
- Not buying common action
    - Most people do *not* buy a particular product
- Choices are based on utility
    - Utility not observed
    - Choose if utility of choice greater than not choosing
    - Normalize utility of outside option to zero

---

## Individual Demand function

.pull-left[
<table style="text-align:center"><tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td>price</td><td>purchase</td></tr>
<tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">1</td><td>2.3</td><td>1</td></tr>
<tr><td style="text-align:left">2</td><td>3.4</td><td>1</td></tr>
<tr><td style="text-align:left">3</td><td>3.8</td><td>1</td></tr>
<tr><td style="text-align:left">4</td><td>4.1</td><td>1</td></tr>
<tr><td style="text-align:left">5</td><td>4.9</td><td>0</td></tr>
<tr><td style="text-align:left">6</td><td>5.2</td><td>0</td></tr>
<tr><td style="text-align:left">7</td><td>5.7</td><td>0</td></tr>
<tr><td style="text-align:left">8</td><td>6.2</td><td>0</td></tr>
<tr><td style="text-align:left">9</td><td>7.6</td><td>0</td></tr>
<tr><td style="text-align:left">10</td><td>7.7</td><td>0</td></tr>
<tr><td style="text-align:left">11</td><td>7.9</td><td>0</td></tr>
<tr><td style="text-align:left">12</td><td>8.1</td><td>0</td></tr>
<tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr></table>
]
.pull-right[
- Individual choices observed
- Dependent variable is discrete: buy/not buy
- Dependent variable not continuous
    - Takes two values
    - Error not normally distributed
]

---

## Binary choice relationship

![](me10_files/figure-html/unnamed-chunk-2-1.png)<!-- -->

- Observed choices often show randomness
- Reflect some kind of uncertainty
- Can be due to unobserved variation in
    a. individual preferences
    b. differences in preferences between persons
    c. real randomness

---

## Linear model with binary choice

![](me10_files/figure-html/unnamed-chunk-3-1.png)<!-- -->

- Estimation of the *expected* relationship
- Predicted value (line) is the probability of choosing
- Can give unreasonable outcomes
    - probability `\(Pr< 0\)` or `\(Pr> 1\)`

---

## Nonlinear model of binary choice

![](me10_files/figure-html/unnamed-chunk-4-1.png)<!-- -->

- Choose nonlinear model with probabilities `\(0\le Pr\le 1\)`
- Predicted value (line) is the probability of choosing

---

## Discrete demand

- Discrete choice model of demand
- Consumer decides whether to buy or not
- Choice based on product characteristics
    - Some factors increase utility: package size
    - Some factors decrease utility: price
        - Less money to buy other goods
- Consumer maximizes utility

---

## Linear utility

Assume that for individual `\(j\)`

- utility is a linear function of characteristics `\(X_j\)`.
- Some characteristics (like price) affect utility negatively
    - Price is common to all individuals, so it has no subscript
    - Here we omit the subscript `\(i\)` for observations for simpler notation
- some individual characteristics `\(u_j\)` are not observed by the econometrician
    - these characteristics are not correlated with the observed `\(X_j\)`
- the utility of not consuming is set to zero

Let utility depend on some characteristic `\(X_j\)` and price `\(P\)`

`$$U_j + u_j = \beta_X X_j - \beta_P P + u_j$$`

Then the consumer buys if `\(U_j + u_j> 0\)`

---

## Probit specification

- Probit: `\(u_j\)` has standard normal distribution
- Buy if: `\(U_j + u_j = \beta_X X_j - \beta_P P + u_j > 0\)`
- `\(j\)` buys if utility from observables is greater than the individual's unobservable disutility: `\(U_j > -u_j = v_j\)`
- Disutility `\(v_j\)` also has standard normal distribution
- Buy for all `\(v_j < U_j\)`
- The probability that `\(v_j\)` is smaller than `\(U_j\)` is given by the CDF `\(\Phi(U_j)\)` of the normal distribution

`$$Pr(buy) = \Phi(U_j) = \Phi(\beta_X X_j - \beta_P P )$$`

---

## [Consumers with positive utility buy](http://rstudio.sh.se/content/me10-figs/#section-discrete-demand)

![](me10_files/figure-html/unnamed-chunk-5-1.png)<!-- -->

---

## Consumers have different valuation

![](me10_files/figure-html/unnamed-chunk-6-1.png)<!-- -->

- Share of consumers buying
    - Alternatively: probability of an individual consumer buying
- Consumers with a valuation higher than the price purchase
- Shape of curve depends on distribution of unobserved utility

---

## Distribution of unobservable utility

.pull-left[
- Logistic distribution `\(\Rightarrow\)` Logit demand
- Normal distribution `\(\Rightarrow\)` Probit demand
]

.pull-right[
Plot of logistic and normal pdf

![](me10_files/figure-html/unnamed-chunk-7-1.png)<!-- -->
]

???
Bunch of people with different valuations, centered around 0

---

## Similar demand with logit and probit

![](me10_files/figure-html/unnamed-chunk-8-1.png)<!-- -->

---

## Marginal effect with logit and probit

* The probability of buying with probit demand is given by

`$$Pr_{buy} = \Phi(\beta_X X_j - \beta_P P )$$`

* The marginal effect of a price change can be derived with the chain rule:

`$$\frac{\partial Pr_{buy}}{\partial P} = - \beta_P \phi(\beta_X X_j - \beta_P P)$$`

where `\(\phi()\)` is the probability density function of the normal distribution

* To calculate the marginal effects of probit or logit estimates in practice, we use functions such as the `margins()` function in the `margins` package.

---
class: inverse, center, middle

# Multinomial choice models

---

## Multiple choices

- Individual chooses *one* alternative
- Not buying can be a choice
    - Sometimes called the *outside good*
    - Usually alternative 0

---

## Logit and Probit

- Logit and probit are similar models
- Logit is simpler:
    - Logit has analytic solution for probabilities
    - Probit requires numerical integration
    - With many choices probit is not used in practice
- Logit allows various extensions

---

## Logit

- **Multinomial logit**
    - `\(X_j\)` contains characteristics of *individual*, common `\(\beta\)` parameters for *choice*
    - Each choice has the same `\(X_j\)` values for an individual `\(j\)`
    - Example: Choice of education regressed on previous school, age, parents' income, ...
- **Conditional logit**
    - `\(X\)` has characteristics of *choice*, common `\(\beta_X\)` parameters for characteristics
    - Different `\(X\)` values for different choices for individual `\(j\)`
    - Example: Car choice regressed on price, horsepower, weight of car, ...
- **Mixed logit**
    - Combination of the two
    - Note that *Mixed logit* is also the name of a different model (also called *Random coefficients logit*)

---

## Conditional logit demand

- Most common form of demand estimation
- Choices depend on price and product characteristics
- Utility of product `\(k\)` for individual `\(j\)`:

`$$\beta_X X_k - \beta_P P_k + u_{jk} = U_k + u_{jk}$$`

- Unobserved variation in individual preferences
    - Creates reasonable substitution patterns

---

## Conditional logit choice

- Utility has common and individual components
    - utility of good 1 given by: `\(U_{1} + u_{j1}\)`
- Individual `\(j\)` chooses alternative with highest utility:

`$$U_{1} + u_{j1} = \beta X_1 + u_{j1}$$`
`$$U_{2} + u_{j2} = \beta X_2 + u_{j2}$$`

- Choose product 1 if `\(U_{1}+u_{j1} > U_{k}+u_{jk}\)` for all `\(k\neq 1\)`
- In other words if: `\(U_{1} - U_{k} > u_{jk} - u_{j1}\)`
- Product 0 (the outside good) has utility `\(U_0 = 0\)`

---

## Logit error term

.pull-left[
- Error assumed to have *double exponential distribution* (also known as the Gumbel or type I extreme value distribution)
- Cumulative distribution function:

`$$e^{-e^{-x}}$$`
]

.pull-right[
![](me10_files/figure-html/unnamed-chunk-9-1.png)<!-- -->
]

---

## Double exponential and logistic distribution

- Binary choice logit has logistic distribution
- Difference between two double exponential variables has logistic distribution
- Same model:
    - logistic has zero utility for nonpurchase
    - same as if it had double exponential with zero mean
    - The difference will have logistic distribution
- Similar to normal distribution (see earlier slide for binary choice)

---

## Choice probabilities

- The probabilities of buying each good are given by:

`$$Pr_{1}=\frac{e^{U_{1}}}{e^{U_{0}}+e^{U_{1}}+e^{U_{2}}} = \frac{e^{\beta X_{1}}}{1+e^{\beta X_{1}}+e^{\beta X_{2}}}$$`
`$$Pr_{2}=\frac{e^{U_{2}}}{e^{U_{0}}+e^{U_{1}}+e^{U_{2}}}=\frac{ e^{\beta X_{2}}}{1+e^{\beta X_{1}}+e^{\beta X_{2}}}$$`
`$$Pr_{0}=\frac{e^{U_{0}}}{e^{U_{0}}+e^{U_{1}}+e^{U_{2}}}=\frac{1}{1+e^{\beta X_{1}}+e^{\beta X_{2}}}$$`

* As `\(U_0\)` has been normalized
to zero, we have `\(e^{U_0} = e^0 = 1\)`
- Given that unexplained utility has double exponential distribution
    - Derivation of equations does not help in intuition
* Note that the denominator for all three expressions is the same.

---

## Independence of irrelevant alternatives

- Relative probability of two alternatives depends only on their characteristics

`$$\frac{Pr_{1}}{Pr_{2}}=\frac{e^{\beta X_{1}}}{e^{\beta X_{2}}}$$`

- Relative probabilities do not depend on properties of other goods!

---

## Blue bus red bus problem

- Assume that there are two alternatives, travelling by bus or car
    - Assume that not travelling is not an option
- Assume that consumers value them equally: `\(Pr_{1} = Pr_{2}\)`
    - Each has probability 1/2
- Split the bus alternative in two, red bus and blue bus
    - Assume that all relevant properties are the same
- Then car and each bus alternative will get 1/3 of the customers!
    - A more reasonable prediction: car keeps 1/2, each bus gets 1/4

---

## Nested logit

- Unobserved utility more or less correlated between products
- Assume structure on correlation
    - Create groups
    - Example cars: sports cars, SUVs, station wagons, ...
- Nests within nests possible
- Equations of probabilities `\(Pr_k\)` are more complicated

---

## Market data

- Probabilities correspond to shares (many consumers)
- Non-purchases not observed
    - Share of outside good not observed
- Make assumption on market size including non-purchase
    - Calculate shares relative to this total market
    - Demand estimates not that sensitive to market size assumption
- Transform to linear model

`$$\log (s_{k}/s_{0}) = \beta X_k + \varepsilon_k$$`

- The error term `\(\varepsilon\)` corresponds to unobservables common to all consumers in the market.

---
class: inverse, center, middle

# Experiments and Quasi experiments

---

## Solution 3: Experiment

1. Programs and treatments
    - Random selection in treatment and control groups
    - Independent of individual characteristics
2. Economic experiments
    - Choice experiments
    - Vary parameters exogenously
    - Note that individual market data does not solve endogeneity
        - Market effect is sum of individual effects, all correlated
        - Individuals have to be given random prices

---

## Endogeneity and causality

- What is the effect of a treatment?
    - Example: How effective is hospital treatment?
- Problem: individual chooses whether to get hospital treatment
    - Hospital choice more likely if individual really is ill
    - Not all individual health parameters are observed
    - Especially for those who do not go to the hospital
- Problems with identifying causality
    1. Only the chosen alternative is observed
        - The alternative not chosen (the *counterfactual*) has to be inferred
    2. Data is often on aggregate choices
        - Individuals have different preferences

---

## Experiment

- Assign "treatment" randomly to subjects
- Study effect of treatment
    - Include additional regressors
- Can estimate _average treatment effect_
    - Average causal effect in the population

---

## Threats to validity

1. Failure to follow the treatment protocol
    - Do the subjects follow the treatment assigned?
2. Attrition
    - Do subjects drop out?
    - The decision to drop out can depend on the expected outcome for the individual
    - Unobservable to us

- Solution: Instrumental variables!
    - With data on both assignment and treatment, use assignment as instrument for treatment
    - Assignment is exogenous
    - Assignment explains participation in part

---

## Empirical exercise

The exercise should be written individually, even if done in a group. Do the following empirical exercises in Stock and Watson Updated 3rd ed (4th edition in parentheses):

* Exercise E8.2, page 354 (323)
* Exercise E10.2, page 424 (386)
* Exercise E11.2, page 462 (419)
* Exercise E12.1, page 510 (464)
* Exercise E13.1, page 563 (509)

The exercise has to be done in Rmarkdown and turned in no later than Sunday March 28, 2021.
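
---

## Numerical sketch: probit probability and marginal effect

The probit formulas from the binary choice slides, `\(Pr_{buy} = \Phi(\beta_X X_j - \beta_P P)\)` and the price effect `\(-\beta_P \phi(\beta_X X_j - \beta_P P)\)`, can be evaluated directly. This is a minimal sketch in Python using only the standard library. The function names and the coefficient values are hypothetical, chosen for illustration, and are not estimates from the lecture data.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def probit_buy_probability(x, price, beta_x, beta_p):
    """Pr(buy) = Phi(beta_x * x - beta_p * price)."""
    return normal_cdf(beta_x * x - beta_p * price)

def probit_price_effect(x, price, beta_x, beta_p):
    """Marginal effect of price: -beta_p * phi(beta_x * x - beta_p * price)."""
    return -beta_p * normal_pdf(beta_x * x - beta_p * price)

# Hypothetical parameter values, for illustration only
beta_x, beta_p, x = 1.0, 0.5, 2.0
for price in (2.0, 4.0, 6.0):
    pr = probit_buy_probability(x, price, beta_x, beta_p)
    me = probit_price_effect(x, price, beta_x, beta_p)
    print(f"price={price:.1f}  Pr(buy)={pr:.3f}  dPr/dP={me:.3f}")
```

With these values the observable utility is zero at a price of 4, so the purchase probability there is exactly 1/2 and the price effect is largest in absolute value.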
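
---

## Numerical sketch: logit probabilities and IIA

The conditional logit choice probabilities, the independence of irrelevant alternatives and the log share transformation `\(\log(s_k/s_0)\)` can be checked with a few lines of code. This is a minimal sketch in Python using only the standard library. The function `logit_probabilities`, its `outside_good` switch and all utility values are hypothetical and chosen for illustration.

```python
import math

def logit_probabilities(utilities, outside_good=True):
    """Logit choice probabilities from observable utilities U_k.

    utilities: dict mapping alternative name -> U_k.
    outside_good: if True, add an outside good with U_0 = 0, so e^{U_0} = 1.
    """
    denom = sum(math.exp(u) for u in utilities.values())
    if outside_good:
        denom += 1.0
    probs = {name: math.exp(u) / denom for name, u in utilities.items()}
    if outside_good:
        probs["outside"] = 1.0 / denom
    return probs

# IIA: the ratio Pr_1 / Pr_2 is unchanged when a third good is added
two_goods = logit_probabilities({"good1": 1.0, "good2": 0.5})
three_goods = logit_probabilities({"good1": 1.0, "good2": 0.5, "good3": 2.0})
ratio_two = two_goods["good1"] / two_goods["good2"]
ratio_three = three_goods["good1"] / three_goods["good2"]
print(ratio_two, ratio_three)  # both equal exp(1.0 - 0.5) up to rounding

# Blue bus red bus: equal utilities, no outside option
before = logit_probabilities({"car": 0.0, "bus": 0.0}, outside_good=False)
after = logit_probabilities(
    {"car": 0.0, "red bus": 0.0, "blue bus": 0.0}, outside_good=False
)
print(before["car"], after["car"])  # the car share falls from 1/2 to 1/3

# Market data: log(s_k / s_0) recovers the observable utility U_k
recovered_u1 = math.log(two_goods["good1"] / two_goods["outside"])
print(recovered_u1)  # equals U_1 = 1.0 up to rounding
```

The last step is the share inversion from the market data slide: with shares generated by the logit model, the log ratio to the outside share returns the observable utility, which is what makes the linear regression possible.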