Hi there and welcome back. In this video, we're going to continue working towards an understanding of covariance and correlation. In the last video, we looked at mean, variance, and standard deviation again, and we looked at them for a function of a random variable, g(x). But everything we've done so far has involved just a single random variable. What we're going to do in this video is look at what are called jointly distributed random variables, and we're going to do that for two random variables, x and y. Let's start with an example that illustrates what I'm going to talk about. Suppose you have an insurance agency and it has customers who have homeowner's policies and automobile policies. For each type of policy, a deductible amount must be specified. For the automobile policy, the choices are $100 or $250. For the homeowner's policy, the choices are $0, $100, or $200. And I realize these numbers are artificially low, but we'll just go with them because it makes things easier to write. So suppose an individual, let's say Bob, is selected at random from the agency's files. Let x be the deductible amount on the auto policy, and let y be the deductible amount on the homeowner's policy. What we want to do is understand the relationship between x and y, and how that information can be conveyed. Now in this example, x and y are both discrete. So y takes on the values 0, 100, and 200, and that's across the top here. And x takes on 100 or 250, and that's down the column here on the left. The information given inside this table is called the joint probability table. Let's just look at how we can interpret it. The probability that x equals 100 and y equals 0, we can find right here, so x = 100, y = 0. That probability is 0.2. The probability that x = 250 and y = 0, for example, is down here, and that's 0.05. 
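To make the table concrete, here's a small Python sketch of the joint probability table as a dictionary keyed by (x, y) pairs. The six entries are reconstructed from the values quoted in the video (the two probabilities read off directly, plus the row and column sums discussed next), so treat this as an illustration of the table's layout rather than part of the lecture itself.

```python
# Joint probability table for the insurance example.
# Rows: auto deductible x in {100, 250};
# columns: homeowner's deductible y in {0, 100, 200}.
# Entries reconstructed from the probabilities and row/column sums in the video.
joint = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

# Look up P(X = 100 and Y = 0) and P(X = 250 and Y = 0), as in the video.
print(joint[(100, 0)])  # 0.2
print(joint[(250, 0)])  # 0.05
```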
Now, if we sum these, and remember this "and" here is an intersection. So we have two events, the x = 100 event and the y = 0 event. And then we have another probability with the events x = 250 and y = 0. If we sum these two, we end up with just the probability that y = 0, because we are summing over all the possibilities for the x random variable. This is related to the law of total probability that we covered in an earlier module. So this is going to be 0.25. Now, the sum we just did is 0.25. If we sum the next column, we get the probability that y = 100, and that's also 0.25. And if we sum the last column, we get the probability that y = 200, and that's 0.5. If we sum across the rows, so if we sum across this top row, we're summing over all the possible y values, and what we get is the probability that x = 100, which is 0.5. And if we sum across this row, we get the probability that x = 250, and that's also 0.5. Now, I want you to notice a couple of things about this. This is the probability mass function for x all by itself. The probability that x = 100 is 0.5, and the probability that x = 250 is also 0.5. So that's the probability mass function just for x. Across here, this is the probability mass function just for y. Now, if I gave you only the probability mass function for x by itself, just this part, and the probability mass function just for y, I would not be giving you as much information as you get by understanding the relationship between x and y. That's what the joint probability table gives us. The other thing we should notice is that if we added up all the values inside this table, we'd get 1. If we add these values, we get 1; if we add the values across here, we get 1. So these are true probability mass functions. This one is the joint, and these are the regular probability mass functions for our individual random variables. 
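The row and column sums just described can be sketched in Python, using the same table reconstructed from the video's values: summing over x recovers the marginal PMF of y, and summing over y recovers the marginal PMF of x.

```python
# Recover the marginal PMFs by summing the joint table over the other variable.
joint = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p  # sum across a row  -> P(X = x)
    p_y[y] = p_y.get(y, 0.0) + p  # sum down a column -> P(Y = y)

print(p_y)                  # marginal of Y: 0.25, 0.25, 0.5
print(p_x)                  # marginal of X: 0.5, 0.5
print(sum(joint.values()))  # all entries sum to 1
```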
We codify that in a definition. For any two discrete random variables x and y, we say that p(x, y), which is the probability that x equals x and y equals y, is the joint probability mass function for x and y. When we're dealing with two random variables, we get much richer information by knowing the joint probability mass function. Now, there's another definition I want you to remember: two events A and B are independent if the probability of A intersect B is the probability of A times the probability of B. We take that definition and extend it to our joint probability mass function for x and y. We say x and y are independent random variables if the probability of x equaling x, think of this as our A, and y equaling y, think of that as our B, is equal to the product of the probability of x equaling x times the probability of y equaling y. Now, this can't hold for just one pair of x and y; it has to be true for all possible values of x and y. All right, so let's ask ourselves: in the insurance example, are x and y independent? We can answer that question by looking at an example. The probability that x = 100 and y = 100, I just picked that one out, is 0.1. And the probability that x = 100 times the probability that y = 100, which we calculated on the previous slide, is 0.5 × 0.25, and so we get 0.125. Those are not equal, so x and y are not independent. Now, I have to caution you. When you're trying to show x and y are independent, you actually have to do the calculation we just went through for all possible values of x and y. To show they're not independent, you just have to find one pair, little x and little y, where it doesn't work. What about if x and y are continuous random variables? It gets a little more complicated because we have a double integral, but it's the same exact idea. 
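The independence check can be written out as a short Python sketch (same reconstructed table as before): compare p(x, y) against the product of the marginals for every pair, and note that the pair (100, 100) already fails.

```python
# X and Y are independent only if
# p(x, y) == P(X = x) * P(Y = y) for EVERY pair (x, y).
joint = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

independent = all(
    abs(joint[(x, y)] - p_x[x] * p_y[y]) < 1e-12 for (x, y) in joint
)
print(independent)                             # False
print(joint[(100, 100)], p_x[100] * p_y[100])  # 0.1 vs 0.125
```

One failing pair is enough to conclude "not independent"; showing independence would require the equality to hold for all six pairs.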
So here's our definition: if x and y are continuous random variables, then f(x, y) is called the joint probability density function for x and y if the probability that x is between a and b and y is between c and d is the double integral from a to b and from c to d of f(x, y) dx dy. And that has to be true for every possible a, b, c, and d. Now, the same kind of property for independence is also true: if f(x, y) = f(x)f(y) for all possible values of x and y, then x and y are called independent random variables. And independent random variables are going to be very, very important in our study of statistics and when we do sampling. Let's do a little example to illustrate how we can use independence. Suppose a room is lit with two light bulbs. X1 is the lifetime of the first bulb and X2 is the lifetime of the second bulb. Suppose they're both exponentially distributed: X1 has parameter lambda1 = 1/2000 and X2 has parameter lambda2 = 1/3000. We're going to assume the lifetimes of the light bulbs are independent of each other, and I want to find the probability that the room is dark after 4000 hours. Let's just remember: the expected value of X1 is 1 over lambda1, which we calculated before, and that will be 2000 hours. So the first light bulb is expected to last 2000 hours, but it has an exponential distribution. The expected value of the second bulb is 1 over lambda2, and that's 3000 hours. Now, the light bulbs function independently, and each lifetime measures the time when that bulb goes out. That the bulbs function independently is given and assumed in the problem. So, if X1 is less than or equal to 4000 and X2 is less than or equal to 4000, that is, if they both burn out before 4000 hours, then the room is dark. 
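To see the continuous definition in action numerically, here's a rough Python check (a sketch, using the exponential parameters from the light bulb example): because the joint density of independent random variables factors, a double Riemann sum of f1(x)·f2(y) over a rectangle should match the product of the two one-dimensional integrals.

```python
import math

# Exponential densities from the light bulb example.
lam1, lam2 = 1 / 2000, 1 / 3000

def f1(x):
    return lam1 * math.exp(-lam1 * x)

def f2(y):
    return lam2 * math.exp(-lam2 * y)

# Midpoint Riemann sum of the joint density f1(x) * f2(y)
# over the square [0, 4000] x [0, 4000].
n = 400
h = 4000 / n
total = sum(
    f1((i + 0.5) * h) * f2((j + 0.5) * h) * h * h
    for i in range(n)
    for j in range(n)
)

# Because the density factors, the double integral equals the product of
# the two one-dimensional integrals, each of which is 1 - e^(-lambda * 4000).
product = (1 - math.exp(-2)) * (1 - math.exp(-4 / 3))
print(total, product)  # both approximately 0.6367
```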
So I'm looking for the probability that X1 is less than or equal to 4000 and X2 is less than or equal to 4000. Now, we know they're independent random variables, and so we get the probability that X1 is less than or equal to 4000 times the probability that X2 is less than or equal to 4000. This is because we have independence. And now we can use what we know about the individual exponentials. The first factor is the integral from 0 to 4000 of lambda1 e to the minus lambda1 x1 dx1, and I'm just going to leave lambda1 in here for the time being. Then we multiply by the second integral, from 0 to 4000 of lambda2 e to the minus lambda2 x2 dx2. When we do this integration, we get minus e to the minus lambda1 x1, evaluated from 0 to 4000, and for the second one, minus e to the minus lambda2 x2, also evaluated from 0 to 4000. Now, when we put lambda1 and lambda2 in, we end up with 1 - e to the -4000/2000 times 1 - e to the -4000/3000. That's the same as 1 - e to the -2 times 1 - e to the -4/3. And when you put all that into a calculator, you get approximately 0.6367. So, in this video we've looked at jointly distributed random variables. We've defined what it means to have a joint probability mass function and a joint density function. And more importantly, we've also looked at what it means for two random variables to be independent of each other. We'll take all of this information into the next video, where we'll look at covariance and correlation. We'll see you soon.
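The whole calculation fits in a couple of lines of Python, a quick sketch to confirm the final number:

```python
import math

# P(room is dark by 4000 hours) = P(X1 <= 4000) * P(X2 <= 4000), by independence.
# Each factor is the exponential CDF 1 - e^(-lambda * t) evaluated at t = 4000.
lam1, lam2 = 1 / 2000, 1 / 3000
p = (1 - math.exp(-lam1 * 4000)) * (1 - math.exp(-lam2 * 4000))
print(round(p, 4))  # 0.6367
```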