Hi, and welcome back. In this video, we're going to begin our study of the normal, or Gaussian, random variable. This random variable is so important that it will actually take us two videos to cover. In this first part, I'm going to introduce the normal random variable and the standard normal random variable. Then in part 2, we'll see exactly how they're connected, and we'll go through examples. The normal, sometimes called Gaussian, distribution is probably the most important and widely used distribution in all of probability and statistics. It has that typical bell-shaped density function. Many populations have distributions that can be fit very closely by an appropriate normal bell curve, which is one of the reasons for its usefulness and widespread applicability. There are many examples; height, weight, other physical characteristics, scores on some tests, some error measurements, and so on, can be modeled very well by a Gaussian distribution. There's a very long history to the normal random variable. In fact, it's been around for almost 300 years, and it has been used widely during that time by many statisticians and scientists. It was first described by Abraham de Moivre in 1733. Later, many others used it, including Carl Friedrich Gauss. Gauss used it so extensively in his astronomical calculations that it came to be called the Gaussian distribution. In 1920, Karl Pearson wrote, "Many years ago, I called the Laplace-Gaussian curve the normal curve, which name, while it avoids the international question of priority, has the disadvantage of leading people to believe that all other distributions of frequency are in one sense or another abnormal." In math and science, usually the first person to discover something has it named after himself or herself, or they come up with a different name. Now, Pearson knew the history, and he knew that Gauss had not come up with this curve. He also knew that Laplace had been working on it quite a bit as well.
To avoid attributing it to one or the other, neither of whom had come up with it in the beginning, he started calling it the normal curve. In fact, it applied so widely to the different data sets that they were working on at the time that it almost did seem that it was normal. Well, let's get to the definition. A continuous random variable X has the normal distribution with parameters Mu and Sigma squared if its density function, f of x, is equal to one over the square root of two Pi times Sigma, times e to the minus, x minus Mu, squared, divided by two Sigma squared. It's defined for all x between minus infinity and infinity. Notationally, we'll write that X has the distribution of a normal with parameters Mu and Sigma squared, and that will be a very quick and easy notation for us to use. There are several properties of the normal density function that I want to talk about. The first is that f of x is symmetric about the line x equals Mu. You can see right here, here's our Mu and there's the line x equals Mu, that vertical line right there, and it's completely symmetric about that line. The second property is that f of x is always positive; it's never equal to zero. You can see that because the exponential factor, e to the minus, x minus Mu, squared, over two Sigma squared, is never zero. Also, the integral from minus infinity to infinity of f of x dx is indeed equal to one. Now, that takes a little bit of calculus to do, and it's not germane to this particular course. The third property is that the expected value of X, which is the integral from minus infinity to infinity of x times f of x dx, turns out to be Mu. That also takes quite a bit of calculus to muddle through. The fourth property is that the variance of X, which is the integral from minus infinity to infinity of, x minus Mu, squared, times the density function, is Sigma squared. So the two parameters that appear in the normal distribution are Mu, the mean, and Sigma squared, the variance.
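The definition and the properties above can be checked numerically. Here is a minimal sketch in Python (the course itself uses R, so this is just an illustrative stand-in): it codes the density f of x directly and verifies, by crude numerical integration, that the density integrates to one, that the mean is Mu, and that the variance is Sigma squared. The values Mu equals 100 and Sigma equals 5 are the ones from the next slide's example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density f(x) = 1/(sqrt(2*pi)*sigma) * exp(-(x - mu)^2 / (2*sigma^2))."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def integrate(f, a, b, n=100_000):
    """Crude composite midpoint rule, purely for illustration."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

mu, sigma = 100.0, 5.0
lo, hi = mu - 10 * sigma, mu + 10 * sigma  # +/- 10 sigma captures essentially all the area

total = integrate(lambda x: normal_pdf(x, mu, sigma), lo, hi)                  # should be ~1
mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), lo, hi)               # should be ~mu
var = integrate(lambda x: (x - mu) ** 2 * normal_pdf(x, mu, sigma), lo, hi)    # should be ~sigma^2
print(round(total, 4), round(mean, 2), round(var, 2))  # approximately 1.0 100.0 25.0
```

As the lecture says, proving these three facts exactly takes real calculus; the numerics here just make them believable.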
Finally, the fifth property I want to mention: Sigma is the square root of Sigma squared, and that's the standard deviation. Mu plus Sigma and Mu minus Sigma are the inflection points of our density function. Those inflection points, there's one right about there and another one right about there. As you can see, a lot of the probability is centered very closely around that mean Mu. We'll look at that in the next slide. Here I wrote the density function again, and what I want to observe is that there are two graphs here: one where Mu is 80 and Sigma is 15, the other where Mu is 100 and Sigma is 5. Notice that for the smaller Sigma, the inflection points are closer to the mean and the density function becomes more peaked. A smaller Sigma corresponds to a more peaked density function, and a larger Sigma corresponds to a more spread-out density function. You can see that just from these two examples. The other thing I want to remind you of is that if I have this normal random variable and I want the probability that X is between two values, a and b, those values can be anywhere; maybe the a is here, maybe the b is here. The probability that X is between a and b is equal to the integral from a to b of f of x dx, and that's the area under the curve from x equals a to x equals b. We'll be using that. We have used it already quite a bit, and we'll continue to use it in future examples. Now, for both theoretical and practical reasons, as we continue to develop our examples, it is usually easier to work with what's called a standard normal. In a standard normal, the mean is 0 and the variance is 1. Customarily, we use a Z for a standard normal; sometimes you hear about z-scores, and we'll talk about those in the next video. If Z is a normal(0, 1) random variable, its pdf, its density function, is given here; it's a little bit simpler than the general normal. We use a special notation to denote the cumulative distribution function.
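The probability that X lands between a and b, as the area under the density from a to b, can be sketched with the same kind of simple numerical integral (Python again as a stand-in for the R tools the course uses; the bounds 65 and 95, with the Mu equals 80 and Sigma equals 15 curve from this slide, are just an illustrative choice, one standard deviation either side of the mean):

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal(mu, sigma^2) random variable."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def prob_between(a, b, mu, sigma, n=200_000):
    """P(a <= X <= b): area under the density from a to b (midpoint rule)."""
    h = (b - a) / n
    return sum(normal_pdf(a + (i + 0.5) * h, mu, sigma) for i in range(n)) * h

# Mu = 80, Sigma = 15; bounds are mu - sigma and mu + sigma
p = prob_between(65, 95, 80, 15)
print(round(p, 4))  # about 0.6827
```

This is exactly the "within one standard deviation" calculation we'll do for the standard normal in a moment, which is why the answer comes out near 0.68.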
Remember, the cumulative distribution function, F of z, is the probability that Z is less than or equal to little z, and that's going to be the integral from minus infinity to z of the density function. But again, it occurs so frequently that we use a special symbol, Phi of z, to denote the cumulative distribution function. We'll use this notation quite a bit in future videos. A few comments about the standard normal. It's symmetric just like the general normal, but it's symmetric about the y-axis. The standard normal almost never occurs naturally. Instead, we use it as a reference distribution and obtain information about other normal distributions via a simple formula. The cumulative distribution function of the standard normal, Phi, can be found in tables, and it can also be computed with a single command in R. We'll see later in this course and in the next one that sums of standard normal random variables play a very large role in statistical analysis. Let's look at a few calculations. Suppose we want to calculate the probability that Z is greater than or equal to 1.25. Maybe our 1.25 is right here, and the probability that we want to calculate is the area under this curve. This is the same as 1 minus the probability that Z is less than 1.25, because that's the complement, and using the cumulative distribution function, that's 1 minus Phi of 1.25. It turns out that's equal to 0.1056. We could also do the calculation by integrating the density function from 1.25 to infinity, but it's actually a lot easier to just look it up in a table or compute it using R. Now, let's put minus 1.25 in here. The question is, why does the probability that Z is less than or equal to minus 1.25 equal the probability that Z is greater than 1.25? These two yellow areas are equal, and that's of course by symmetry. There's nothing special about 1.25; it's true for all real numbers.
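The tail calculation above can be reproduced without a table. A hedged sketch: Python's standard library exposes the error function, and a standard identity (not something from the lecture itself) gives Phi of z as one half times, one plus erf of z over root two.

```python
import math

def Phi(z):
    """Standard normal CDF via the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

upper = 1.0 - Phi(1.25)  # P(Z >= 1.25), the right-tail area, via the complement
lower = Phi(-1.25)       # P(Z <= -1.25), equal to the right tail by symmetry
print(round(upper, 4), round(lower, 4))  # 0.1056 0.1056
```

In R, the single command the lecture alludes to is `pnorm`: for example, `1 - pnorm(1.25)` gives the same 0.1056.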
If we wanted to calculate the probability, for example, that Z is between minus 0.38 and 1.25, we could break this up and look at Z less than or equal to 1.25. That's all the probability from minus infinity up to 1.25. Now, if we subtract out the probability of Z being less than minus 0.38, we go all the way up to 1.25 and then subtract out everything from minus infinity up to minus 0.38. What's left is all the Zs between minus 0.38 and 1.25. I'll also just remind you that because Z is a continuous random variable, it doesn't matter whether I write strictly less than, or less than or equal to, minus 0.38. Using our special cumulative distribution function, we get Phi of 1.25 minus Phi of minus 0.38. Another calculation I'd like to do is the probability that Z is within one standard deviation of the mean. The mean is zero and the standard deviation is one, so this is the probability that Z is between minus one and one. Using the same ideas as on the other slide, this is the probability that Z is less than or equal to one, minus the probability that Z is less than or equal to minus one. That's Phi of one minus Phi of minus one, and the numerical answer you get is 0.6826. What this says is that if we collect a lot of data that has, or was transformed to have, a standard normal distribution, about 68 percent of that data should be within one standard deviation of the mean. Let's do the same thing for two standard deviations. That would be the probability that Z is less than or equal to two, minus the probability that Z is less than or equal to minus two. This turns out to be Phi of two minus Phi of minus two, and that's 0.9544. In other words, about 95 percent of the data is within two standard deviations of the mean. Now, in statistical inference, we will sometimes need Z values that give certain tail areas under the standard normal.
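The three interval probabilities on this slide can all be checked with the same subtraction pattern, Phi of b minus Phi of a. A sketch, again in Python with `math.erf` standing in for the table lookups or R commands the lecture uses:

```python
import math

def Phi(z):
    """Standard normal CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_interval = Phi(1.25) - Phi(-0.38)  # P(-0.38 <= Z <= 1.25)
p_1sd = Phi(1.0) - Phi(-1.0)         # within one standard deviation of the mean
p_2sd = Phi(2.0) - Phi(-2.0)         # within two standard deviations of the mean
print(round(p_interval, 4), round(p_1sd, 4), round(p_2sd, 4))  # 0.5424 0.6827 0.9545
```

The last two come out as 0.6827 and 0.9545 to four decimal places; the lecture's 0.6826 and 0.9544 reflect the usual rounding in printed tables.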
In other words, we might want to find a z sub alpha so that Phi of z sub alpha, which is the probability that Z is less than or equal to z sub alpha, equals some given number. Maybe this is 0.95, so that would be our alpha. To calculate this, from our tables or from R, we would find that z sub alpha is equal to 1.645. In that situation, the probability that Z is less than or equal to 1.645 is 0.95, and similarly to what we did just above, the probability that Z is between minus 1.645 and positive 1.645 is 0.90. We will see in the next course that this is going to be very important for finding confidence intervals and other quantities associated with normally distributed data. Thank you very much. We'll see you next time.
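Finding z sub alpha means inverting Phi. In R that's the single command `qnorm(0.95)`; as a hedged sketch in Python, a simple bisection on the increasing function Phi recovers the same 1.645:

```python
import math

def Phi(z):
    """Standard normal CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_for_area(p, lo=-10.0, hi=10.0, tol=1e-10):
    """Find z with Phi(z) = p by bisection, since Phi is strictly increasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

z = z_for_area(0.95)
print(round(z, 3))  # 1.645
two_sided = Phi(z) - Phi(-z)  # P(-z <= Z <= z) for the same z
print(round(two_sided, 2))    # 0.9
```

By symmetry, the value that leaves 0.05 in the upper tail leaves 0.05 in the lower tail too, which is why the two-sided probability comes out to 0.90.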