class: center, middle, inverse, title-slide

# Econometrics B/C
## Lecture 1
### Jonas Björnerstedt
### 2021-09-29

---

## B/C econometrics overview

- 10 Lectures
    1. Statistics
    1. Econometric theory
    1. Use statistical software - R and RStudio
- Empirical exercises
    - Data management - create datasets
    - Data analysis - analysis of data
    - Regression analysis - formal analysis of data

---

## Examination

- Written Exam 50 points
    - Econometrics (exam) 2021-10-29 09 10.00 -14.00 __on campus__
    - Econometrics (reexam) 2021-12-10 15.00 -19.00 __on campus__
- Empirical Exercises 50 points
    - Three exercises, distributed during the course
    - Must be done in the virtual datalab
- 50 points to pass course

---

## Textbook

- Online textbook [Introduction to Econometrics with R](https://scpoecon.github.io/ScPoEconometrics/)
    - Florian Oswald, Jean-Marc Robin and Vincent Viers
    - Included in the course: Chapters 1-7 and 12.
    - The course information in the first section, _Syllabus_, is not relevant for our course.
    - The content of chapters 1 and 2 is better covered in the two texts below
- R for Data Science
    - Hadley Wickham
    - [chapter 5](https://r4ds.had.co.nz/transform.html)
- A ModernDive into R and the Tidyverse
    - Chester Ismay and Albert Y. Kim
    - [Chapter 2](https://moderndive.com/2-viz.html) and [Chapter 3](https://moderndive.com/3-wrangling.html)

---

## Web resources

#### Econometrics pages

- sites.google.com/view/bc-econometrics
    - Contains additional resources
- Lecture slides will be posted on the net
    - Both lecture and exercise slides

#### Other resources

- Google for questions
- The internet!
- Wikipedia is very good in probability and statistics - [Khan academy](https://www.khanacademy.org/math/statistics-probability) - From basic to advanced with [an app with videos and exercises](https://itunes.apple.com/us/app/khan-academy-you-can-learn/id469863705?mt=8) --- class: inverse, center, middle # Today's lecture --- ## Mean and variance * Data in Y <table class="table table-striped table-hover table-condensed" style="width: auto !important; "> <thead> <tr> <th style="text-align:left;"> i </th> <th style="text-align:right;"> Y </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 1 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 6 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 2 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 5 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 3 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 4 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 4 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 3 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 5 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 2 </td> </tr> </tbody> </table> --- ## Mean and variance * Calculate the mean of Y <table class="table table-striped table-hover table-condensed" style="width: auto !important; "> <thead> <tr> <th style="text-align:left;"> i </th> <th style="text-align:right;"> Y </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 1 
</td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 6 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 2 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 5 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 3 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 4 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 4 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 3 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 5 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 2 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;background-color: red !important;"> mean </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;background-color: red !important;"> 4 </td> </tr> </tbody> </table> --- ## Mean and variance * Calculate the deviation from the mean <table class="table table-striped table-hover table-condensed" style="width: auto !important; "> <thead> <tr> <th style="text-align:left;"> i </th> <th style="text-align:right;"> Y </th> <th style="text-align:right;"> mean </th> <th style="text-align:right;"> dev </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 1 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 6 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px 
solid;"> 2 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 5 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 3 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 4 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 4 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 3 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> -1 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 5 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 2 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> -2 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;background-color: red !important;"> mean </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;background-color: red !important;"> 4 </td> <td style="text-align:right;background-color: red !important;"> 4 </td> <td style="text-align:right;background-color: red !important;"> 0 </td> </tr> </tbody> </table> --- ## Mean and variance * Calculate the __square__ deviation from the mean (sq.dev) <table class="table table-striped table-hover table-condensed" style="width: auto !important; "> <thead> <tr> <th style="text-align:left;"> i </th> <th style="text-align:right;"> Y </th> <th style="text-align:right;"> mean </th> <th style="text-align:right;"> dev </th> <th style="text-align:right;"> sq.dev </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 2em; 
font-weight: bold;border-right:1px solid;"> 1 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 6 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 2 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 5 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 3 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 4 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 4 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 3 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> -1 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;"> 5 </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;"> 2 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> -2 </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:left;width: 2em; font-weight: bold;border-right:1px solid;background-color: red !important;"> mean </td> <td style="text-align:right;width: 4em; background-color: yellow !important;border-right:1px solid;background-color: red !important;"> 4 </td> <td style="text-align:right;background-color: red !important;"> 4 </td> <td style="text-align:right;background-color: red 
!important;"> 0 </td> <td style="text-align:right;background-color: red !important;"> 2 </td> </tr> </tbody> </table>

* The __variance__ of Y is given by the _average_ square deviation

---

# Apples

.pull-left[
- Pick an apple from the basket
- How much do you expect to eat?
]

.pull-right[
![](apples.jpeg)
]

---

# Small and large apples

.pull-left[
Assume apples come from two different trees

- Small apples all weigh 100g
- Big apples all weigh 200g
- Same number of apples from each tree
]

.pull-right[
![](small-large-apple.jpg)
]

---

## Expected value

- If you were given an apple, how much would you expect to eat?

--

- You eat either a 100g or a 200g apple
    - Big and small apples are equally likely to be picked from the basket
    - *In expectation* you eat 150g

--

- If `\(Y\)` is the amount you eat
    - What is the expected value of `\(Y\)`?

--

- Let `\(E(Y)\)` denote the expected value of `\(Y\)`

`$$E(Y) = \frac{1}{2} \cdot 100 + \frac{1}{2} \cdot 200 = 150$$`

- Weighted average

---

## Sampling Probability

- Probability
    - Share of population with property
    - Share of random sample - frequency of event
- Population
    - Example: Individuals in Sweden
    - Can be abstract set of states
        - States of the world where a coin toss gives heads
- Sample
    - Draws of individuals from population
    - Example: Class

---

## Discrete random variable

- Finite discrete variable takes `\(k\)` different values
    1. Height of individuals in a class
    2. Outcomes of a coin toss
- Distribution can be characterized by the frequencies:
    - Relative frequency of each age or height

---

class: inverse, center, middle

![](coin.gif)

---

## Coin toss

- Coin toss has two outcomes (heads or tails)
- Assign a numerical value to each: -1, 1
- Equal probability of each outcome with a _fair coin_

![](lecture01_files/figure-html/fig.width==1-1.png)

---

## Discrete random variable - dice

- A roll of a die can have various outcomes
    - Sample space: {1, 2, 3, 4, 5, 6}
- Each outcome occurs with equal probability
    - Frequency with which we expect the outcome
- _Probability Mass Function (PMF)_ - function that assigns a _probability_ to each outcome in the sample space

![](lecture01_files/figure-html/unnamed-chunk-6-1.png)

---

## Probability and statistics

- Random variables
    - Numerical properties of individuals
    - Examples: height, weight and gender
- Characterized by *probability distribution*
    - Probability of each value that the variable can take
    - Example: frequencies of all heights in the population
- Summarize with a *statistic*
    - Real or vector valued _function_ of sample
    - Random variable (it depends on a random sample)
    - Sampling distribution of statistic?

---

## Expected value

- A *statistic* summarizes properties of distributions
    - A real valued function of the probability distribution
- If `\(Y\)` has a discrete distribution:

`$$E(Y)= \sum_{i=1}^k Y_i p_i =\mu_Y$$`

- For a die:

`$$E(Y)=1\cdot\frac{1}{6}+2\cdot\frac{1}{6}+3\cdot\frac{1}{6}+4\cdot\frac{1}{6}+5\cdot\frac{1}{6}+6\cdot\frac{1}{6}=3.5$$`

- Populations often have equal weights
    - Ex: The mean height of the Swedish population is just the average
    - Sum the heights of everybody and divide by the number of people

---

## Variance

- The variance is a measure of the spread around the expected value
- How big is the square deviation on average?
- Let `\(r\)` be the square deviation:

`$$r = (Y - E[Y])^2$$`

- Then the variance is the expected value of the square deviation:

`$$Var(Y) = E[r] = E\left[(Y - E[Y])^2 \right]$$`

---

## Conditional expectation

* Conditional expectation - expected value given something
* Two population variables `\(height\)` and `\(woman\)`
* Expected height of women: `\(E(height | woman = 1 )\)`
    * Population average for women
* `\(woman\)` is a dummy variable, with values 0 and 1

---

## Transform a random variable

- From a random variable `\(Y\)` we can create new random variables:
    * `\(2Y\)` stretches
    * `\(Y + 1\)` moves
* Squared deviation `\(r\)` is a random variable
* [Illustrate with coin toss]

---

## Exercise

- Coin toss has two outcomes (heads or tails)
    - Random variable taking values: -1, 1
    - What are the expected value and variance?
- Consider a random variable that assigns 0, 1 to outcomes
    - What are the expected value and variance?

---

## Mean and median of distribution

- _Mean_ - Average `\(X\)` value
- _Median_ - `\(X\)` with half of the density (area) to the left and right
- Mean and median differ when the distribution is not symmetric (skewed)
    - Example: income

![](lecture01_files/figure-html/unnamed-chunk-7-1.png)

---

## Properties of expectations

`$$E(aY) = aE(Y)$$`

- Similarly, one can show that

`$$E(X + Y) = E(X) + E(Y)$$`

---

## Properties of variance

`$$Var(aX) = a^2 Var(X)$$`

- For independent `\(X\)` and `\(Y\)`:

`$$Var(X + Y) = Var(X) + Var(Y)$$`
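
---

## Checking the properties numerically

The properties of expectations and variance above can be checked by exact enumeration. This is a minimal sketch in Python (the course itself uses R), assuming the fair six-sided die from the earlier slide. The helper names `expectation`, `variance`, `transform` and `sum_independent` are illustrative and not part of the course material.

```python
from fractions import Fraction
from itertools import product

# PMF of a fair die: each face 1..6 has probability 1/6 (assumed example)
die = {face: Fraction(1, 6) for face in range(1, 7)}


def expectation(pmf):
    """E(Y) = sum over outcomes of value * probability."""
    return sum(value * prob for value, prob in pmf.items())


def variance(pmf):
    """Var(Y) = E[(Y - E[Y])^2], the expected square deviation."""
    mu = expectation(pmf)
    return sum((value - mu) ** 2 * prob for value, prob in pmf.items())


def transform(pmf, f):
    """PMF of f(Y); outcomes mapped to the same value have their probabilities added."""
    out = {}
    for value, prob in pmf.items():
        out[f(value)] = out.get(f(value), 0) + prob
    return out


def sum_independent(pmf_x, pmf_y):
    """PMF of X + Y when X and Y are independent (joint probability = product)."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        out[x + y] = out.get(x + y, 0) + px * py
    return out


a = 2
scaled = transform(die, lambda y: a * y)   # the random variable aY
two_dice = sum_independent(die, die)       # sum of two independent rolls

print(expectation(die), variance(die))                    # 7/2 35/12
print(expectation(scaled) == a * expectation(die))        # E(aY) = aE(Y)
print(variance(scaled) == a ** 2 * variance(die))         # Var(aY) = a^2 Var(Y)
print(expectation(two_dice) == 2 * expectation(die))      # E(X + Y) = E(X) + E(Y)
print(variance(two_dice) == 2 * variance(die))            # Var(X + Y) = Var(X) + Var(Y)
```

Exact fractions avoid rounding, so the equalities hold exactly. Independence matters for the last line: `\(X + X\)` is the same variable as `\(2X\)`, whose variance is `\(4 Var(X)\)`, not `\(2 Var(X)\)`.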