In this lecture, we'll learn about a Gaussian distribution that incorporates more than one variable. With more variables available, multivariate Gaussians can use richer features to model a target. Let's revisit the previous ball-color example: in the 1-D case we used a single property called hue, but color itself can be described in many dimensions. For example, color images are typically defined by three color channels: red, green, and blue. What if we use all of the available channels? If we plot the RGB values of all the pixels on a 3-D graph, we get a distribution of colors like this. Now we're interested in modeling the color of the red ball using all of the RGB channels. Let's talk about how a Gaussian distribution works in this case.

Mathematically, the multivariate Gaussian is expressed as an exponential coupled with a scaling factor, just as in the 1-D case:

p(x) = 1 / ( (2π)^(D/2) |Σ|^(1/2) ) · exp( −(1/2) (x − μ)ᵀ Σ⁻¹ (x − μ) )

This might look complicated, but it has a similar structure to the 1-D Gaussian density function, and we can match the corresponding terms between the two. D is the number of dimensions we are going to use. x is the vector of variables whose probability we are attempting to quantify; to signify that x is now a vector, we make it bold. We want to know the probability that x lies within our Gaussian distribution, and we call this term p(x). In contrast to the 1-D case, our mean μ is now a vector, and Σ, our covariance matrix, is a square matrix.

In the covariance matrix there are two kinds of components: the terms on the diagonal and the terms off the diagonal. Here is an example of a 2-D covariance matrix:

Σ = [ σ²_x1    σ_x1x2
      σ_x1x2   σ²_x2  ]

The diagonal terms are the independent variances of each variable, x1 and x2. The off-diagonal terms σ_x1x2 represent the correlation between the two variables; a correlation component describes how much one variable is related to another. Lastly, the vertical bars around Σ in the denominator of our equation indicate the determinant of Σ.

Let us return to our example with the ball color. We are dealing with three variables: R, G, and B. The variable vector contains our sample pixel's red, green, and blue values, the mean is a 3-by-1 vector, and the covariance matrix is a 3-by-3 matrix. p(x) is the probability that this sample pixel is generated from the ball, given that we know the mean and covariance of the ball's RGB model.

As we did for the 1-D case, let's look at how this function behaves in the 2-D case. It is easiest to start from a special case where the distribution has zero mean, unit variances, and zero correlation terms; we shall then see how the distribution changes in 2-D with different parameter values. With these parameters, the probability density function simplifies to p(x) = (1/2π) · exp( −(x1² + x2²)/2 ). Graphically, the 2-D zero-mean spherical Gaussian looks like a mountain with a single peak. If you cut the surface of the peak in half, the cross-section is exactly the 1-D Gaussian shape.

Sometimes it's useful to draw the distribution in 2-D instead of 3-D. The contours drawn connect values of x with the same probability value. In the spherical case, where the covariance is a diagonal matrix with equal values on the diagonal, the contours appear as circles: the innermost circle is where the peak is, and the outer circles represent less probable regions of the graph.
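To make this concrete, here is a minimal sketch (my own illustration, not code from the lecture) that evaluates the density formula above by hand, checks it against scipy's multivariate normal, and draws the circular contours of the 2-D zero-mean spherical case:

```python
# A minimal sketch of the multivariate Gaussian density described above,
# checked against scipy, followed by the contour view of the 2-D
# zero-mean spherical case. Not code from the lecture.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

def gaussian_pdf(x, mu, sigma):
    """Evaluate the D-dimensional Gaussian density at a single point x."""
    d = len(mu)
    diff = x - mu
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff)

mu = np.zeros(2)        # zero mean
sigma = np.eye(2)       # unit variances, zero correlation (spherical case)

x = np.array([0.5, -0.3])
print(gaussian_pdf(x, mu, sigma))             # hand-rolled formula
print(multivariate_normal(mu, sigma).pdf(x))  # scipy gives the same value

# Contours connect points with equal probability density.
g1, g2 = np.meshgrid(np.linspace(-3, 3, 200), np.linspace(-3, 3, 200))
grid = np.dstack((g1, g2))
p = multivariate_normal(mu, sigma).pdf(grid)
plt.contour(g1, g2, p)            # circles, since the covariance is spherical
plt.gca().set_aspect("equal")
plt.show()
```

The same `gaussian_pdf` works unchanged for the 3-D RGB ball model: pass a 3-element pixel vector, a 3-by-1 mean, and a 3-by-3 covariance.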
Now, as we did for the 1-D Gaussian, let's briefly talk about what Gaussians with different means and different covariance matrices look like. As before, when only the mean is changed, the distribution is merely shifted. Again, similar to the 1-D case, as the variance terms increase the distribution spreads out with a smaller peak value of p(x), and as the variance terms decrease the distribution tightens with a larger peak value of p(x).

However, the covariance matrix of a multivariate Gaussian has some properties that we don't see in the 1-D Gaussian. As I mentioned, Σ includes correlation terms in its off-diagonal elements. If Σ has non-zero off-diagonal terms, the shape of the Gaussian appears skewed, and naturally this cannot happen when we deal with a single variable.

There are two other properties of Σ worth mentioning, and they require some linear algebra background. First, the covariance matrix must be symmetric and positive definite: the elements of Σ are symmetric about its diagonal, and the eigenvalues of Σ must all be positive. Second, even when the covariance matrix has non-zero correlation terms, we can always find a coordinate transformation that makes the shape axis-aligned again, with no correlation terms. We can decompose the covariance matrix to reveal the basis of that transformation using eigenvalue decomposition, as in the sketch at the end of this section.

Today, we have learned about the multivariate Gaussian density function, and we graphically examined the parameters μ and Σ in 2-D cases. In the following lecture, we'll talk about how to compute the maximum likelihood estimate of the parameters from data.
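As a closing illustration, here is a minimal sketch (again my own, using an arbitrary example covariance) that checks the two properties above and applies the eigendecomposition to remove the correlation terms:

```python
# A minimal sketch (not the lecture's code) of the two covariance
# properties and the decorrelating coordinate transformation.
import numpy as np

sigma = np.array([[2.0, 0.8],   # non-zero off-diagonal term: the contours
                  [0.8, 1.0]])  # of this Gaussian are tilted ellipses

# Property 1: symmetric and positive definite.
assert np.allclose(sigma, sigma.T)        # symmetric about the diagonal
eigvals, eigvecs = np.linalg.eigh(sigma)  # eigh exploits that symmetry
assert np.all(eigvals > 0)                # all eigenvalues are positive

# Property 2: the eigenvectors define a rotated coordinate frame in which
# the covariance becomes diagonal, i.e. the skewed ellipse becomes
# axis-aligned with zero correlation terms.
sigma_rotated = eigvecs.T @ sigma @ eigvecs
print(np.round(sigma_rotated, 10))        # diagonal matrix of the eigenvalues
```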