Welcome. In this lecture you will learn why matrices are helpful in multiple regression. In the previous lecture, we compared the wages of female and male employees. Various factors have an effect on wage, such as the employee's age, education level, and the number of days worked per week. We include these explanatory variables on the right-hand side of a linear equation for log-wage. Because other unobserved factors may also affect wage, such as the personal characteristics and the experience of the employee, we add an error term to represent the combined effect of these other factors. This error term is denoted by epsilon. The precise definitions of the variables are shown on the slide. We distinguish four education levels, and we define part-time jobs as jobs of at most three days per week.

To simplify the notation, we denote the dependent variable by y and the explanatory factors by x. For each employee, the values of the five explanatory variables are collected in a five-times-one vector, and the five-times-one vector beta contains the five unknown parameters. The wage equation can now be written in vector form. If you wish, you can consult the Building Blocks of this course for further background on vector and matrix methods.

We collect the wage equations of all five hundred employees in matrix form. As x i is a five-times-one vector, its transpose is a one-times-five vector, and the matrix X has five hundred rows and five columns. The wage equation for all 500 employees can now be written in matrix form. Here y and X contain the observed data. Epsilon is unknown, and the parameters beta that measure the effect of each factor on log-wage are also unknown. Our challenge is to estimate these parameters from the data.

We now generalize this setup to the case where the dependent variable y is explained in terms of k factors. We assume that the model contains a constant term, which is denoted by beta one.
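The matrix setup described above can be sketched in a few lines of NumPy. All data values and coefficients below are hypothetical, chosen only to illustrate the dimensions of y, X, beta, and epsilon in the wage example:

```python
import numpy as np

# A minimal sketch with hypothetical data: n = 500 employees and
# k = 5 explanatory factors (constant, female, age, education, part-time).
rng = np.random.default_rng(0)
n, k = 500, 5

female = rng.integers(0, 2, n)      # 1 if female, 0 if male
age = rng.integers(20, 65, n)       # age in years
educ = rng.integers(1, 5, n)        # education level 1, 2, 3, or 4
parttime = rng.integers(0, 2, n)    # 1 if working at most three days a week

# Each row of X is the transpose x_i' (a 1-by-5 vector);
# stacking all 500 rows gives the n-by-k matrix X.
X = np.column_stack([np.ones(n), female, age, educ, parttime])

beta_true = np.array([3.0, -0.05, 0.03, 0.2, -0.4])  # illustrative values
eps = 0.1 * rng.standard_normal(n)                   # error term epsilon
y = X @ beta_true + eps                              # log-wage: y = X beta + eps

assert X.shape == (n, k) and y.shape == (n,)
```

Here `beta_true` plays the role of the unknown parameter vector; in practice only y and X are observed, and beta must be estimated from them.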
For the notation, it is convenient to define the first x variable as this constant term, which has the value one for all observations. We follow the same steps as before, but now for a set of n observations and a model with k explanatory factors. In our wage example, we had n equal to 500 observations and k equal to five factors, including the constant term. The resulting multiple regression model has the same form as before, but now the observations are collected in an n-times-one vector y and an n-times-k matrix X. The challenge is to estimate the unknown parameters beta from the observed data y and X.

The purpose of multiple regression is to explain the outcomes for y in terms of the explanatory variables X. Because the real world is much more complex than our simple model, the explanation will be imperfect, and the error terms epsilon represent the imperfections of the model. We wish to find values for the parameter vector beta so that X times beta is close to y, because we then get a good explanation of y. We would obtain a perfect fit if X times beta were exactly equal to y. Because X is an n-times-k matrix, this is a set of n equations in the k unknown parameters beta.

Now I invite you to answer the following test question. The answers are shown on the slide and follow from well-known matrix properties. In practical applications we usually have many more observations than explanatory factors. In our wage example we have five hundred observations and five factors, so in practice n is usually much larger than k. Further, the matrix X usually has full column rank, that is, the columns of X are linearly independent. If this were not the case, then one of the x variables would be a linear function of the other x variables, so that it is not needed in the model and should be removed. So we will always assume that n is larger than k and that the matrix X has full column rank k. The test answers show that the equations y equals X times beta will in general have no exact solution.
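These two assumptions can be checked numerically. The sketch below uses randomly generated data (so the numbers themselves carry no meaning): with n much larger than k and X of full column rank, the n equations y equals X beta have no exact solution in the k unknowns, only a best approximation:

```python
import numpy as np

# Randomly generated illustration: n = 500 equations, k = 5 unknowns.
rng = np.random.default_rng(1)
n, k = 500, 5
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
y = rng.standard_normal(n)

# Assumption: the columns of X are linearly independent (full column rank k).
assert np.linalg.matrix_rank(X) == k

# The best approximate solution leaves a nonzero residual,
# so no beta satisfies X @ beta == y exactly.
beta_ls, residual, rank, _ = np.linalg.lstsq(X, y, rcond=None)
assert rank == k
assert residual[0] > 0
```

If a column of X were a linear combination of the others, `matrix_rank(X)` would fall below k, signalling that the redundant variable should be removed from the model.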
So we need to find an approximate solution, as will be discussed in the next lecture.

The multiple regression model specifies a linear equation for y in terms of the x variables. This means that the partial derivatives of y with respect to the explanatory factors do not depend on the values of the x variables. Stated otherwise, the marginal effect of each factor is fixed. More precisely, the parameter beta j is the partial effect on y if the j-th factor increases by one unit, assuming that all other x factors remain fixed. In practice, the x variables are usually mutually dependent, so that it is not possible to change one factor while keeping all other factors fixed. In our wage example, if we compare female and male employees, we cannot keep the education level fixed, because females and males differ in their mean education levels. As keeping all other factors fixed is not possible in practice, this can only be done as a thought experiment called "ceteris paribus", meaning that everything else is assumed to stay unchanged.

If the value of the j-th factor changes, this has two effects on the dependent variable y. First, it has a direct effect, which is measured by beta j. Second, because the j-th factor changes, the other x variables will usually also change, and this leads to indirect effects on the dependent variable. The single exception is the first x variable, which always has the value one, so that this variable never changes. The combined indirect effects are shown on the slide. We conclude that the total effect is the sum of the partial, or "ceteris paribus", effect and the combined indirect effect caused by associated changes in the other x variables. Let me give you an example. Suppose that the chance of having a part-time job is higher for higher education levels.
If an employee improves his or her education level, this will have a positive direct effect on wage because of the better education, but possibly a negative indirect effect if the employee chooses to work fewer days per week. The total effect is the sum of these positive and negative effects, and whether the total wage effect is positive or negative depends on the sizes of these effects. The estimation of these effect sizes is the topic of our next lecture.

Of course, we include factors in a model because we think that these factors help to explain the dependent variable. We should first check whether or not these factors have a significant effect. Statistical tests can be formulated for the significance of a single factor, for the joint significance of two factors, or, more generally, for any set of linear restrictions on the parameter vector beta of the model.

Now I invite you to answer the following test question. If beta j is zero, does this mean that the factor x j has no effect? The correct answer is yes and no. The answer is yes in the sense that the partial effect is zero, that is, under the "ceteris paribus" assumption that all other factors remain fixed. But the answer is no if there are indirect effects because of changes in the other factors. Again, as an example, suppose that having a part-time job is more common for higher education levels. Even if higher education has no partial effect on wage, it may still have an indirect effect, because higher education leads more often to a part-time job.

Now I invite you to do the training exercise to practice the topics of this lecture. You can find this exercise on the website. And this concludes our second lecture.
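The decomposition of a total effect into a direct and an indirect part can be illustrated with a small calculation. All coefficient values below are invented for illustration, not estimates from the lecture's data:

```python
# Hypothetical effect sizes, for illustration only.
beta_educ = 0.2      # direct ("ceteris paribus") effect of education on log-wage
beta_part = -0.4     # effect of a part-time job on log-wage
dpart_deduc = 0.3    # assumed rise in part-time probability per education level

# Total effect = direct effect + indirect effect via part-time work.
total = beta_educ + beta_part * dpart_deduc
assert abs(total - 0.08) < 1e-9   # positive: the direct effect dominates here

# The test question's point: even a zero partial effect (beta_educ = 0)
# leaves a nonzero total effect through the indirect channel.
total_zero_direct = 0.0 + beta_part * dpart_deduc
assert abs(total_zero_direct + 0.12) < 1e-9
```

With these numbers the direct gain of 0.2 outweighs the indirect loss of 0.12, but a larger part-time response would flip the sign of the total effect.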