Hi and welcome back. In this video, we're going to review the normal and standard normal distributions. We're going to see how they're related to each other, and then we're going to work through a few problems. Just to remind you, if X has a normal distribution with parameters Mu and Sigma squared, then it has this density function. If Z is normal 0, 1, and remember, Z is our customary notation for a standard normal, then it has this density function. The question is, how are they related to each other? Here's our proposition. If X is normal Mu, Sigma squared, then we can do what's called standardizing the normal random variable. That means we're going to subtract the mean Mu and divide by the standard deviation Sigma. We really want to think of X minus Mu over Sigma as X shifted by Mu: the whole distribution is shifted so it's now centered at zero. Then we're scaling by 1 over Sigma, which scales the standard deviation down to 1, giving the variance of 1 that the normal 0, 1 random variable has. Let's look at that a little more in depth. Before we prove this proposition, I have to tell you something about any continuous random variable. Let's do an aside here, and suppose we have some arbitrary continuous random variable Y with density f_Y of y. We know that when we calculate the probability that Y is less than or equal to a, we integrate the density function from minus infinity up to a. We're looking at all the probability from minus infinity up to a. Now, what happens if we want the probability that 2Y is less than or equal to a, or any constant? We can't really use the density function until we isolate Y all by itself, because this density function only applies to Y by itself, and then we integrate up to whatever the resulting value is.
If we take that idea and isolate the Y by itself on the left-hand side by dividing by 2, now we're asking for the probability that Y is less than or equal to a over 2, and that's the integral from minus infinity up to a over 2 of f_Y dy. This is going to be true no matter what transformation of Y we have. Let's go back to our problem. With X, we have X minus Mu over Sigma less than or equal to a. We can't use our density function for X yet, until we isolate X all by itself, so we multiply up by Sigma and add in the Mu. Now we've got X all by itself. This is going to equal the integral from minus infinity up to a Sigma plus Mu of the density function for X. That's 1 over the square root of 2 Pi times Sigma, e to the minus x minus Mu squared over 2 Sigma squared, dx. Now what we do is a u-substitution. We're going to let u equal x minus Mu over Sigma, so our du becomes 1 over Sigma dx. What we end up with is the integral from minus infinity, where we have to change that upper limit of integration and it becomes a, of 1 over the square root of 2 Pi, e to the minus u squared over 2, du. This right in here is the density function for a normal 0, 1 random variable. What we have done is shown that this new random variable, this X minus Mu over Sigma, has exactly the same density function as a normal 0, 1 random variable. That shows the proposition is true. Let's do an example to see how this works. Suppose X is normal with mean 1 and variance 4, and I want to do two things. I want to figure out the probability that X is between 0 and 3.2, and I want to find some a so that the probability of X less than or equal to a is 0.7. Let's do the first part. Part A says find the probability that X is between 0 and 3.2. Now we could go and integrate the density function for this normal random variable from 0 to 3.2 (I didn't bother writing it out again). We could do that.
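As a quick aside, the standardization identity we just proved can be checked numerically. Here's a short Python sketch using the standard library's `statistics.NormalDist` (the variable names are my own); it verifies that P((X minus Mu) over Sigma less than or equal to a) equals Phi of a for a few values of a:

```python
from statistics import NormalDist

mu, sigma = 1.0, 2.0          # X ~ Normal(mu, sigma^2), with sigma = 2
X = NormalDist(mu=mu, sigma=sigma)
Z = NormalDist()              # standard normal: mean 0, standard deviation 1

# P((X - mu)/sigma <= a) = P(X <= a*sigma + mu), which should equal Phi(a)
for a in (-1.0, 0.0, 0.5, 2.3):
    assert abs(X.cdf(a * sigma + mu) - Z.cdf(a)) < 1e-12
```

The assertions pass because the two cumulative distribution functions really are the same after the shift and rescale, which is exactly what the u-substitution showed.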
But what I want to do instead is convert it to a normal 0, 1. Rather than doing the integral, I need to convert the X to an X minus 1 over 2. We'll do that same conversion: we subtract 1 and divide by 2 all the way across. Now this is our normal 0, 1, and I'm going to just change it to a Z. When we simplify, we get minus one-half less than or equal to Z, less than or equal to 1.1. We can use the cumulative distribution function. Let me just do this over here and say: recall, whenever I have F_Z of a, that's the probability that Z is less than or equal to a. If you remember, we gave that its own special name, Phi of a. This is going to be the probability that Z is less than or equal to 1.1, minus the probability that Z is less than minus one-half. We're taking all the probability up to 1.1 and subtracting all the probability up to minus one-half. What we're left with is the probability that Z is between minus one-half and 1.1. This is then Phi of 1.1 minus Phi of minus one-half. Now you can do integration if you want, but there are actually tables of standard normal values you can look up, and this is approximately 0.5558. I don't necessarily care about the actual number; what I'm really trying to illustrate is that we're taking a normal random variable with mean 1 and variance 4, standardizing it into a Z random variable by subtracting 1 across all three parts and dividing by 2, and then we end up with the cumulative distribution function: Phi of 1.1 minus Phi of minus one-half. Let's look at the second part. In the second part, we want a so that the probability of X less than or equal to a is 0.7. How can we use a standard normal table to figure that out? Here is just a portion of the table, so we can see what's going on.
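If you'd rather not use a table, Part A can be reproduced in a couple of lines of Python with the standard library's `statistics.NormalDist` (variable names here are my own); it computes Phi of 1.1 minus Phi of minus one-half, and checks it against using the N(1, 4) distribution directly:

```python
from statistics import NormalDist

Z = NormalDist()              # standard normal, so Phi(a) = Z.cdf(a)

# P(0 <= X <= 3.2) after standardizing: Phi(1.1) - Phi(-0.5)
p = Z.cdf(1.1) - Z.cdf(-0.5)  # approximately 0.5558

# Same answer without standardizing, working with N(1, 4) directly
X = NormalDist(mu=1, sigma=2)
p_direct = X.cdf(3.2) - X.cdf(0)
```

Both routes give the same number, which is the point of the proposition: standardizing doesn't change the probability, it just moves the calculation onto the standard normal.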
Here's our z value: this column represents the tenths, and this row gives you the hundredths. Suppose I wanted the probability that Z is less than or equal to 0.23. Then I would go into the table at 0.2, find the 0.03 column, and that would be right here, so Phi of 0.23 would be 0.5910. I want to use that idea to find the a that gives me a probability of 0.7. Let's see how to do that. I still have to normalize or standardize this random variable. We'll do the same thing as before, subtract 1 and divide by 2, and we end up with the probability of Z (this is becoming my Z again) less than or equal to a minus 1 over 2. We want this to be equal to 0.7. We look in the table for where we see a 0.7. From the table we see Phi of 0.52 is right here, and that's 0.6985, and Phi of 0.53 is the next one over, and that's 0.7019. We want 0.7, and that's about midway between, so we can interpolate, and we get that Phi of 0.525 is about 0.7. What that tells us is that we need this a minus 1 over 2 to be 0.525. We solve for a, multiplying up by the 2 and adding over the 1, and we end up with a equals 2.05. Therefore the probability that our original X is less than or equal to 2.05 is 0.7. In this example, we did two things. We learned how to standardize a normal random variable, and we looked at how to use the table to actually compute probabilities. Let's look at another example. This one has a lot of words, but it's still the same idea. The time that it takes a driver to react to the brake lights on a vehicle that's decelerating in front of them is very important in helping to avoid rear-end collisions.
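The table interpolation in Part B is really an inverse-CDF calculation, and `statistics.NormalDist` in the Python standard library exposes that directly via `inv_cdf` (the variable names below are my own). A sketch:

```python
from statistics import NormalDist

# Solve P(X <= a) = 0.7 for X ~ Normal(1, 4), i.e. mu = 1, sigma = 2
z = NormalDist().inv_cdf(0.7)   # exact standard-normal quantile, about 0.5244
a = 2 * z + 1                   # undo the standardization: a = sigma * z + mu
```

The exact quantile is about 0.5244 rather than the interpolated 0.525, so a comes out near 2.049, confirming the table-based answer of roughly 2.05.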
Research suggests that reaction time for an in-traffic response to a brake signal can be modeled by a normal distribution with mean 1.25 seconds and standard deviation 0.46 seconds. What is the probability that the reaction time is between 1 and 1.75 seconds? What assumptions are you making? We'll let X equal the reaction time, and we're given that X is normal with mean 1.25 seconds and variance 0.46 squared. We want to find the probability that X is between 1 and 1.75 seconds. We're going to do exactly what we did before: subtract the mean, so we get 1 minus 1.25, X minus 1.25, and 1.75 minus 1.25, and then divide by the standard deviation. When we do the calculations, this becomes minus 0.543 less than or equal to Z, less than or equal to 1.087. This is Phi of 1.087 minus Phi of minus 0.543. You can look that up in a table, or you can do a numerical integration. In any case, you're going to get about 0.568. The probability that the reaction time will be between 1 and 1.75 seconds is 0.568. Now, what assumption are we making? It's important to understand that a lot of probability distributions are models. A true normal random variable can actually take on values from minus infinity to infinity. Now, the probability that you're beyond three standard deviations from the mean is vanishingly small, and most of the probability is concentrated within two or three standard deviations of the mean. But nonetheless, there is a very small probability that you'll be outside of that. The problem is, if you think about this, here's 1.25 seconds, and here's our normal curve (not the greatest drawing), but a reaction time can't go below zero. We're modeling the time it takes a driver to put the brakes on when they see the driver in front of them putting their brakes on.
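The reaction-time calculation follows the same pattern, and a two-line Python check with `statistics.NormalDist` reproduces it (again, the names are my own and the model is the one stated in the problem):

```python
from statistics import NormalDist

X = NormalDist(mu=1.25, sigma=0.46)   # reaction time model, in seconds
p = X.cdf(1.75) - X.cdf(1.0)          # P(1 <= X <= 1.75), about 0.568
```

Note that `cdf` here is doing the standardization for us internally; writing `NormalDist().cdf(1.087) - NormalDist().cdf(-0.543)` by hand gives the same answer up to the rounding of the z-values.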
We can't go below zero, and if we go too far above, we'll crash into the vehicle in front of us. We're modeling it within the region close to the mean, and within that region it's a very, very good model. All right. Let's do one more example, and we're going to see this example again when we get to something called the central limit theorem, a few videos from now. But it is also appropriate to think about it in this context. Remember, X being a binomial random variable with parameters n and p means that X counts the number of successes in n Bernoulli trials, where each Bernoulli trial has probability of success p. Our probability mass function is: the probability that X equals k is n choose k, p to the k, 1 minus p to the n minus k, for k equals zero through n. We also saw that the expected value of X is np and the variance of X is np times 1 minus p. Now, for large n, X can be approximated by a normal random variable with Mu equaling np, so we're matching up the means. We're matching the mean of the normal random variable with the mean of the binomial, and we're doing the same thing with the variance. Think again of Mu as being the location and Sigma squared as measuring the spread of the data. What we're really doing is making sure that the means are in the same place and the spread of the data is roughly the same as well. If X is binomial and np times 1 minus p is bigger than or equal to 10, that makes n fairly large. If I draw a histogram for a binomial random variable, it looks something like that, and then what we're doing is superimposing upon it a density function for the normal, making sure that the means match up: the mean Mu matches up with np. So if X is binomial and np times 1 minus p is bigger than or equal to 10, then X minus np, so minus the mean, divided by the standard deviation, the square root of np times 1 minus p, is approximately a normal 0, 1 random variable.
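The "superimposing" picture can be checked numerically: near the mean, the binomial bar heights should sit close to the matched normal density. Here's a small Python sketch (the example parameters n = 100, p = 0.3 are my own choice, picked so that np times 1 minus p equals 21, which satisfies the rule of thumb):

```python
import math
from statistics import NormalDist

n, p = 100, 0.3                              # np(1-p) = 21 >= 10, rule of thumb holds
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
N = NormalDist(mu=mu, sigma=sigma)           # normal with matched mean and variance

# Binomial pmf bar heights vs. the superimposed normal density, near the mean
for k in (25, 30, 35):
    pmf = math.comb(n, k) * p**k * (1 - p)**(n - k)
    assert abs(pmf - N.pdf(k)) < 0.005       # the curve hugs the histogram
```

This is exactly the exercise described above: the histogram of the binomial and the matched normal density agree closely in the region around Mu equals np.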
You'll have some exercises where you'll play around with graphing the binomial and superimposing a normal distribution on top of that. Let's see where this might be useful. Suppose on a given day, there are approximately 1,000 visitors to your website. Of these, 25 percent register for a service. I'd like to estimate the probability that between 200 and 225 people register for a service tomorrow. We're going to let X equal the number of people who register for a service, so X is binomial with parameters n equals 1,000 and probability of success p equals 0.25; we're counting the number of people who register for a service. If we actually write this down, the probability that X is between 200 and 225 is the sum from k equals 200 to 225 of 1,000 choose k, p to the k, 1 minus p to the 1,000 minus k, and those involve some incredibly large factorials. This is very difficult to compute directly by hand. What we're going to do instead is convert this to a normal random variable. Mu is going to be np, and that will be 250, and Sigma squared will be np times 1 minus p, which is 187.5. Now we're going to turn our probability of X being between 200 and 225 into a statement about a normal random variable. This is a model, it involves some approximation, but it's very good for large n. We're going to have 199.5 minus our mean 250, over our standard deviation, the square root of 187.5; that's going to be less than X minus 250 over the square root of 187.5; and on this side, we'll have 225.5 minus 250 over the square root of 187.5. You might ask yourself, why did we add this 0.5 and subtract off 0.5 on the left? That's to accommodate the fact that the binomial is discrete, taking on discrete values, and we're approximating it by a continuous distribution, a continuous density function. Let me draw a little picture here. Here's 199, here's 200.
What we've got in our histogram is this: I don't want to cut off at exactly 200, because then I'm losing all this area from the continuous normal random variable, so I'm going midway between, and that's called the continuity correction. What we're getting here is minus 3.688 less than or equal to our standard normal, less than or equal to minus 1.789. This is Phi of minus 1.789 minus Phi of minus 3.688. If you do the calculation, Phi of minus 3.688 is approximately 0.0001, which is negligible, so I'm not even going to count it. I'm going to calculate Phi of minus 1.789, and that's 0.0367. What this is saying is that the probability that you have between 200 and 225 people registering for your service is only about 3.7 percent, and almost all the rest of the probability is bigger than 225. That concludes our discussion of the normal random variable and the standard normal random variable, and how they're related to each other. In the next module, we'll talk in more depth about expectation and variance, and we'll get into correlation coefficients and covariances. We'll see you then. Bye.
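As a postscript to the last example: while the binomial sum is impractical by hand, a computer handles it easily, so we can compare the exact answer with the continuity-corrected normal approximation. A sketch in Python (standard library only; the variable names are my own):

```python
import math
from statistics import NormalDist

n, p = 1000, 0.25
mu = n * p                            # 250
sigma = math.sqrt(n * p * (1 - p))    # sqrt(187.5)

# Exact binomial probability P(200 <= X <= 225)
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(200, 226))

# Normal approximation with the continuity correction from the lecture
Z = NormalDist()
approx = Z.cdf((225.5 - mu) / sigma) - Z.cdf((199.5 - mu) / sigma)
```

The approximation lands very close to the exact sum, about 0.037 either way, which is the point of matching the mean and variance and applying the continuity correction.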