In this lecture, we are going to learn about the Gaussian Mixture Model (GMM). We will see the advantages and disadvantages of using a mixture model, and I will also show how a GMM can be expressed mathematically as well as graphically.

So far we have studied the single Gaussian, because it is important and relatively easy to handle. But there are certainly limitations to using single Gaussians, so let's think about what they are. In the real world, we may have a distribution like this as the target to learn, but this is what happens when you try to fit a Gaussian model to the data. As you can see, a single Gaussian cannot properly model a distribution if it has multiple modes or lacks symmetry. On the other hand, as you will see, a GMM is very expressive, expressive enough to model practically any type of distribution. Let's start talking about it.

Simply speaking, a GMM is a weighted sum of Gaussians. I will show graphically what happens when we add many Gaussians. The colorful lines are ten random Gaussian curves, and the black line is the sum of all of them. A mixture curve can take very irregular shapes that we cannot write down as a simple function. So, if we choose the right Gaussian components, we can express any unusual distribution.

Let's look at the ball color example again. This time the 2D plot on the right side shows the color distribution in the red and green channels. We can use a single 2D Gaussian to model the color distribution like this, or we can use a mixture of two Gaussians, which seems to better express how the green and red values of the ball are distributed.

Now we're going to use our favorite tool, mathematics, to express the mixture of Gaussians in a more formal way. If we let g be a single Gaussian density with some mean and covariance, then a GMM can be written as a weighted sum of Gaussians with different means and covariances. The weights w must all be positive, and they must sum to 1. This makes sure that the GMM is a probability density that integrates to 1. Here, the capital K indicates the number of Gaussian components. If you are allowed an arbitrarily large K and arbitrarily small variances, you can in theory express any shape of distribution. That is what makes the GMM so powerful.

We have just tasted the good side of the GMM, but we also need to understand that there are costs to pay. A GMM has more parameters than a single Gaussian of the same dimension. The number of means and covariance matrices to be specified increases with the number of mixture components, and we now also have a new set of parameters, the weights. In addition, the number of Gaussian components itself is a parameter you have to decide somehow.

Having more parameters has some unfavorable side effects. First, it is harder to estimate the parameters; as you will see in the next lecture, we do not have an analytic solution for the GMM parameters. Second, there are more chances for things to go wrong; specifically, we might run into the overfitting problem. Keeping this in mind, we should be careful about the complexity of the model we choose. In this course, we are going to assume a given K and constant uniform weights for simplicity. Then we will continue with the estimation of the means and covariance matrices in the following lecture.
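Before we finish, let me write today's key formula out explicitly. This is the standard GMM notation; $w_k$, $\mu_k$, and $\Sigma_k$ are just labels for the weights, means, and covariances described above:

$$p(x) = \sum_{k=1}^{K} w_k \, g(x \mid \mu_k, \Sigma_k), \qquad w_k > 0, \qquad \sum_{k=1}^{K} w_k = 1.$$

With constant uniform weights, as we will assume in this course, every $w_k$ is simply $1/K$.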
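To see the "sum of Gaussians" demo in code, here is a minimal sketch, assuming NumPy; the means, spreads, and uniform weights below are made-up illustrative values, not the ones behind the lecture's figure:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    # 1D Gaussian density with mean mu and standard deviation sigma
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(seed=0)
K = 10                                  # ten components, as in the demo
means = rng.uniform(-5.0, 5.0, size=K)  # random means (illustrative)
stds = rng.uniform(0.3, 1.5, size=K)    # random spreads (illustrative)
weights = np.full(K, 1.0 / K)           # uniform weights: positive, sum to 1

x = np.linspace(-8.0, 8.0, 2000)
components = np.stack([gaussian_pdf(x, m, s) for m, s in zip(means, stds)])
mixture = weights @ components          # the "black line": weighted sum

# Because the weights sum to 1, the mixture is itself a valid
# probability density; numerical integration gives approximately 1.
print(np.trapz(mixture, x))
```

Plotting each row of `components` against the `mixture` curve reproduces the multi-modal, asymmetric shapes that a single Gaussian cannot capture.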
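Finally, a concrete count behind the "more parameters" warning (standard bookkeeping, not specific to this lecture): for $d$-dimensional data, each component contributes $d$ mean entries and $d(d+1)/2$ free covariance entries, and the weights add $K-1$ free values. With $d = 2$ and $K = 2$, that is $2 \cdot 2 + 2 \cdot 3 + 1 = 11$ parameters, versus 5 for a single 2D Gaussian.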