Welcome! In this lecture you will learn how to specify an econometric model for binary dependent variables. In the previous lecture we saw that the linear regression model is not well suited for describing binary variables. In this lecture we introduce an econometric model that is especially designed for such variables. We will call the dependent variable y_i, and it can take only two values. We denote these values by zero and one. The value one may, for example, correspond to yes and the value zero to no. Instead of assuming a normal distribution for y_i, as was done in the previous lecture, we now assume that y_i follows a Bernoulli distribution with parameter pi. The Bernoulli distribution implies that y_i has two possible outcomes, denoted by 0 and 1. The probability that the outcome is one equals pi. The probability that y_i is zero is then 1 - pi, as probabilities have to sum to one. Note that instead of modeling the value of y itself, we now model the probability that y is 1. For many applications it is very unlikely that the probability that y_i = 1 is the same for all observations. Therefore we allow the probability pi to differ across individuals, or across time periods in case we have observations over time. Hence we consider pi subscript i. To explain differences in probabilities across individuals, we can relate the probabilities pi_i to one or more explanatory variables. Given that probabilities have to be between 0 and 1, we cannot just take any function for this relationship. Although several choices are possible, in practice the logistic function is most frequently chosen. To simplify the discussion, we first consider a situation with only one explanatory variable x_i. As you can see on the slide, the logistic function implies that the probability pi_i is a ratio: pi_i = exp(beta_1 + beta_2 x_i) / (1 + exp(beta_1 + beta_2 x_i)). The numerator is the exponential of a linear combination of a constant and the explanatory variable x_i, and the denominator is 1 plus the same exponential term.
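As a small illustration (not part of the original slides), the logistic function above can be written directly in code; the function name logit_prob is my own choice:

```python
import math

def logit_prob(x, beta1, beta2):
    """Probability that y = 1 under the logit model:
    pi = exp(beta1 + beta2*x) / (1 + exp(beta1 + beta2*x))."""
    z = beta1 + beta2 * x
    return math.exp(z) / (1.0 + math.exp(z))

# With intercept beta1 = 0 and beta2 = 1, the probability at x = 0
# is exp(0) / (1 + exp(0)) = 1/2, exactly as the lecture states.
print(logit_prob(0.0, 0.0, 1.0))  # 0.5
```

Because the exponential is always positive, the ratio is always strictly between 0 and 1, which is precisely why the logistic function is a valid choice for a probability.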
The beta_1 parameter is called the intercept parameter, and the beta_2 parameter describes the effect of x on y. The resulting model is called a logit model. It is straightforward to derive the probability that y_i = 0. This graph shows the probability that y = 1 as a function of the explanatory variable. I consider the case where the intercept parameter beta_1 is 0 and the beta_2 parameter is 1. You can clearly see that the probability is bounded between 0 and 1. The probability that y = 1 is 1/2 when x is 0. In general, the probability increases when x increases. The increase in probability is, however, not linear as in a linear regression model. For small values of x the probability is very close to zero, and for large x the probability is nearly one. To illustrate the role of the beta_2 parameter, I now change the size of this parameter. The blue line shows the probability that y = 1 in case beta_2 is three times as large. You can clearly see that the logistic function is now steeper. The interval where the probability is not close to the boundaries 0 and 1 is now smaller. Hence, for larger beta_2 values there is less uncertainty about whether y equals 0 or 1 given the value of x. Now I have made the beta_2 parameter negative. The red line shows the logit probability in case beta_2 equals -1. You can see that the logit probability is now a decreasing function of x. In fact, the probability plot is the mirror image of the original plot if you place a mirror vertically at x equal to 0. Let us now change the value of the intercept parameter beta_1. The purple line shows the logit probability where beta_1 equals 2 instead of 0. You can see that the shape of the logit function stays the same, but that the location has changed. The graph has moved two units of x to the left. The logit probability is now 1/2 at x = -2, while in the original case, the black solid line, this happens at x = 0. I now ask you to think about the following.
What happens to the logit probability as a function of x if you change the beta_1 parameter from 0 to -2? If we set beta_1 to -2, the graph moves two units to the right and the shape of the curve stays the same. As you have already seen in the previous graphs, the probability that y = 1 is a non-linear function of the explanatory variable x. This makes parameter interpretation a bit more difficult than in a linear regression. The easiest way to interpret the parameters of a logit model is to consider the odds ratio, that is, the ratio of the probability that y = 1 and the probability that y = 0. In the odds ratio the denominator of the logit probability cancels out, which results in a simple expression. In fact, it is even easier to consider the log odds ratio, as this is linear in x: log(pi_i / (1 - pi_i)) = beta_1 + beta_2 x_i. It is easy to see that a positive beta_2 implies that an increase in x leads to an increase in the log odds ratio. This also implies an increase in the odds ratio itself. This in turn means that an increase in x corresponds to an increase in the probability that y = 1, and hence a decrease in the probability that y = 0. The opposite holds for a negative beta_2: an increase in x leads to a decrease in the log odds ratio and hence a decrease in the probability that y = 1. The odds ratio provides insight into the direction, plus or minus, of the effect of changes in the x variable, but not into the size of the effect. To compute the size of the effect, we consider the marginal effect of a change in x on the probability. It can be shown that for the logit model, the first derivative of the probability that y = 1 with respect to x can be written as the product of the probability that y = 1, the probability that y = 0, and beta_2. Here we will not prove this technical result, but instead focus on several conclusions that can be drawn from it. First of all, as probabilities are always positive, the sign of beta_2 determines the direction of the marginal effect.
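Both the log odds ratio and the marginal effect formula can be checked with a few lines of code. This is an illustrative sketch of my own; the function names are hypothetical:

```python
import math

def logit_prob(x, beta1, beta2):
    # pi = exp(beta1 + beta2*x) / (1 + exp(beta1 + beta2*x))
    z = beta1 + beta2 * x
    return math.exp(z) / (1.0 + math.exp(z))

def log_odds(x, beta1, beta2):
    # log(pi / (1 - pi)) simplifies to the linear index beta1 + beta2*x.
    return beta1 + beta2 * x

def marginal_effect(x, beta1, beta2):
    # d pi / d x = pi * (1 - pi) * beta2
    p = logit_prob(x, beta1, beta2)
    return p * (1.0 - p) * beta2

# The log odds computed from the probability matches the linear index.
p = logit_prob(1.0, 0.0, 1.0)
print(math.log(p / (1.0 - p)))  # ~1.0, equal to beta1 + beta2*x

# The marginal effect is largest where pi = 1/2 ...
print(marginal_effect(0.0, 0.0, 1.0))  # 0.25
# ... and nearly zero in the tails, where the curve is flat.
print(marginal_effect(6.0, 0.0, 1.0))
```

The comparison at the end previews the conclusion drawn next in the lecture: the same beta_2 translates into very different marginal effects depending on where on the curve an observation sits.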
This result was already clear from the odds ratio: a positive beta_2 implies that an increase in x leads to an increase in the probability that y = 1. Secondly, when the probability that y = 1 is almost zero, the effect of a change in x is also almost zero. The same holds when the probability that y = 0 is almost zero. Remember from the graphs of the logit function shown before that this happens for very large and very small values of x. Hence, a change in x then has almost no effect on the outcome of y, and the logit function is flat for very large and very small values of x. Note that this feature of the logit model is not present in a linear regression, as we saw in the previous lecture. The value of the marginal effect also depends on the probability and hence on the value of x. This means that one cannot express the marginal effect in a single number. Computer packages often report the average marginal effect in the sample. You can obtain this quantity by computing the marginal effect for every value of x_i in your sample and taking the sample average. So far we have considered logit models with only one explanatory variable. The logit model can easily be extended with more explanatory variables. On the slide you see a logit model with k-1 explanatory variables, denoted by x_{j,i}. The exponential term now contains a linear combination of the intercept and the k-1 explanatory variables, with k beta parameters in total. The log odds ratio and the marginal effect are similar to before, as can be seen on the slide. In fact, analyzing the effect of a change in one of the x_{j,i} variables can be done in the same way as in the single explanatory variable case, if we keep the values of all other x variables fixed. The corresponding beta_j parameter determines the direction and the relative size of the effect of a change in x_{j,i}. Note that the value of the marginal effect now also depends on the values of the other x variables through the probabilities.
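The average marginal effect described above can be sketched for the multi-variable case as follows. This is my own minimal illustration, not the lecture's code; the sample and parameter values are hypothetical:

```python
import math

def logit_prob(xs, betas):
    # betas[0] is the intercept; xs holds the k-1 explanatory variables.
    z = betas[0] + sum(b * x for b, x in zip(betas[1:], xs))
    return math.exp(z) / (1.0 + math.exp(z))

def average_marginal_effect(sample, betas, j):
    # Marginal effect of variable j for each observation i is
    # pi_i * (1 - pi_i) * beta_j; average it over the sample.
    effects = []
    for xs in sample:
        p = logit_prob(xs, betas)
        effects.append(p * (1.0 - p) * betas[j])
    return sum(effects) / len(effects)

# Hypothetical sample with two explanatory variables per observation.
sample = [(0.5, 1.0), (1.5, 0.0), (-0.5, 2.0)]
betas = (0.2, 0.8, -0.4)  # intercept, beta_2, beta_3
print(average_marginal_effect(sample, betas, j=1))
```

Note how the effect of the first explanatory variable differs across the three observations, because each observation's probability depends on both x variables; averaging is what condenses this into the single number packages report.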
Now I invite you to do a training exercise, to train yourself in the topics of this lecture. You can find this exercise on the website. And this concludes our lecture on model representation and parameter interpretation for binary choice variables.