[MUSIC] In our case study, we use the same data as in week two, but for this week we artificially erase some labels from the data set, in order to simulate the situation where we do not observe part of our target values. In real life, of course, we always have those unobserved target values. But for this case study, we keep them aside so that at the end you can compare how far our approximate benefit curve, built after reject inference, is from the true benefit curve that would actually take place if we applied our model to the historical data. Of course, in real life you can never make this comparison, but we want to show a rather standard fact: when you use these techniques, you are likely to overestimate your expected benefit. Well again, why should we use reject inference at all? I would say that, basically, the reason is that this is the only feasible way to extract information from the sample of rejected clients, and also the only way to plot this expected, or approximated, benefit curve. That is because if you do not use the portion of data that was previously rejected, then when you try to evaluate the benefit of your model, you will not use the whole general population of clients, which will definitely lead you to a biased outcome. We have the target y hat that we estimated. For those clients that were accepted, and for whom we know the target value, it obviously equals y itself. And for those that are unlabeled, where we do not observe the target event, we put the reject inference approximation that we acquired at the previous step. In order to plot a benefit curve, you go through different threshold levels and the corresponding acceptance rates c, and for each value of c you plot a benefit value, calculated as we discussed in the lecture. For clients with a known label, it's kind of binary.
Either a loss or a profit. And for clients whose target value we don't know, this will be a probabilistic estimate, something like a mathematical expectation, because from our reject inference model we have a probability of the target event, and we can combine that probability with the error costs. If we know that a client has a high probability of default, then we will get something closer to the loss, which is the false negative error cost, and otherwise something closer to the profit. So basically, this benefit is binary for some clients and something like an average for others. Now let's look at how our benefit curves look. We plot two lines. The first one, the blue one, is our approximated benefit curve based on the reject inference model's probability of the target event for rejected clients. And the green one is the benefit curve for our decision tree built using the true target labels. Again, we want to emphasize that in real life you cannot observe the green one. But for our case study, we want to show you an important idea: the blue line, the approximated benefit based on reject inference, is likely to overestimate the expected benefit compared to the true one. Why does this happen? I would say that it's really typical for credit scoring. It happens because, even if you use reject inference techniques, your model is mainly trained on the clients that were previously accepted by some decision policy, because it is precisely for them that you can observe the target event. And obviously, that portion of clients is somewhat better than the rejected portion in terms of payment discipline, because every decision policy has some reasoning behind it; it's never random. So what happens? The average default rate, or target event rate, on the rejected sample is likely higher than the one we observe on the sample of accepted clients. It means that our model never had a chance to see the worst clients.
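The mixed binary/expected benefit described above can be sketched as follows. This is a minimal illustration, not the course notebook's code: the cost values PROFIT and FN_COST, and all function and variable names, are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical error costs -- illustrative assumptions, not from the notebook.
PROFIT = 1.0      # gain from accepting a good (non-defaulting) client
FN_COST = 5.0     # loss from accepting a client who defaults (false negative cost)

def client_benefit(y_hat, is_labeled):
    """Per-client benefit: binary for labeled clients, expected for unlabeled.

    y_hat      -- 0/1 true label for labeled clients,
                  reject-inference default probability for unlabeled ones
    is_labeled -- boolean mask of clients with an observed target
    """
    y_hat = np.asarray(y_hat, dtype=float)
    # Labeled clients: either full profit or full loss (binary outcome).
    binary = np.where(y_hat == 1, -FN_COST, PROFIT)
    # Unlabeled clients: mathematical expectation under the model probability.
    expected = (1 - y_hat) * PROFIT - y_hat * FN_COST
    return np.where(is_labeled, binary, expected)

def benefit_curve(p_default, y_hat, is_labeled, acceptance_rates):
    """Total benefit at each acceptance rate c: accept the fraction c of
    clients with the lowest predicted default probability."""
    order = np.argsort(p_default)                      # safest clients first
    per_client = client_benefit(y_hat, is_labeled)[order]
    n = len(per_client)
    return [per_client[: int(round(c * n))].sum() for c in acceptance_rates]
```

Sweeping `acceptance_rates` over a grid between 0 and 1 then gives the points of the approximated benefit curve.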
Therefore, the benefit itself is overestimated, because in real life those clients would likely be riskier. Still, the good news is that the amplitudes of these curves are similar, so we don't see any strange discrepancies. Now we will build another model, a competitor to our first model, in order to compare the benefit between the two. This time it will be a more sophisticated algorithm, a gradient boosting classifier. But we won't dive deep into feature engineering and tuning of this algorithm; we will use the default one. Again, our key focus here is to compare the financial effect between the models, not to build the best model we can. Of course, when you invest more time and effort, you will get a better model. So we apply pretty much the same scheme for the boosting classifier. We build an initial model on the labeled data. Then we apply this model to the rejected sample, and for the rejected sample we kind of simulate the target event. What does this mean? There are two lines of code which assign weights to clients based on the probability of the default event under this initial model. You duplicate your rejected clients, so for each observation you get two observations, one with target value equal to zero and one equal to one, and you assign them weights equal to the corresponding probabilities of default based on that initial model. After that, you concatenate the data sets with pandas and rerun your XGBoost classifier with these weights, and so you get a new version of the model. You reiterate this until some convergence criterion is met; in our case it's again feature importance. When the importance of each variable no longer changes dramatically, for all variables, we say, okay, stop. We fix this model as the final one, and then we plot the benefit curves for this model.
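The iterative scheme above can be sketched roughly as follows. As a hedge for self-containedness, scikit-learn's GradientBoostingClassifier stands in for XGBoost, and numpy.vstack for pandas.concat; the same pattern applies to xgboost.XGBClassifier with pandas data frames. All names, the iteration cap, and the tolerance value are illustrative assumptions.

```python
import numpy as np
# Stand-in for XGBoost so the sketch runs anywhere; swap in
# xgboost.XGBClassifier for the setup described in the lecture.
from sklearn.ensemble import GradientBoostingClassifier

def reject_inference_boosting(X_acc, y_acc, X_rej, n_iter=10, tol=1e-3):
    """Iterative reject inference with a default boosting classifier.

    X_acc, y_acc -- accepted clients with observed labels
    X_rej        -- rejected clients, labels unobserved
    """
    model = GradientBoostingClassifier()   # default hyperparameters on purpose
    model.fit(X_acc, y_acc)
    prev = model.feature_importances_.copy()

    for _ in range(n_iter):
        # Probability of default for rejected clients under the current model.
        p_rej = model.predict_proba(X_rej)[:, 1]

        # Duplicate every rejected client: one copy with y=0, one with y=1,
        # weighted by the model's probability (the "two lines of code" idea).
        X_aug = np.vstack([X_acc, X_rej, X_rej])
        y_aug = np.concatenate([y_acc, np.zeros(len(X_rej)), np.ones(len(X_rej))])
        w_aug = np.concatenate([np.ones(len(X_acc)), 1.0 - p_rej, p_rej])

        model = GradientBoostingClassifier()
        model.fit(X_aug, y_aug, sample_weight=w_aug)

        # Convergence check: stop once no variable's importance
        # changes dramatically between iterations.
        imp = model.feature_importances_
        if np.abs(imp - prev).max() < tol:
            break
        prev = imp.copy()

    return model
```

The design point is that a rejected client never contributes a hard label: it contributes two weighted pseudo-observations whose weights sum to one, so each rejected client counts once in total.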
The XGBoost prediction on rejected clients allows us to build a proxy for the target variable, called y hat. For the portion of data that is already labeled, we of course do not do anything; we just keep y as it is in the data set. After that, we can plot the benefit curves for XGBoost. Here we see something similar to what we discussed for the first model, the decision tree: the same effect of overestimation of our benefit. The story is the same. Our model mainly trains on the portion of clients with labeled data, and they are the ones that were accepted by a previously settled decision policy, which was not random, which had some reasoning behind it. This policy likely accepted the better clients in terms of payment discipline and rejected the worst ones. So basically, our model has never seen the worst clients; therefore, it somewhat overestimates the benefit. Again, the blue line is above the green one. And again, just to mention, the green line is built on the true labels of our rejected sample, which is not possible in real life; we can only use it in our case study. So in real life you will have only the blue curves: a blue curve for the first model and a blue curve for the second one. How to compare them, we will discuss in the next section.
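Just to recap the proxy target from this section as one line of code: keep the observed label where we have it, and fall back to the model's probability where we don't. This is a minimal sketch; the mask and array names are hypothetical, not from the notebook.

```python
import numpy as np

def build_proxy_target(y, is_labeled, p_boost):
    """y hat: the true label where observed, the boosting model's
    default probability for rejected (unlabeled) clients.

    y          -- observed labels (values at unlabeled positions are ignored)
    is_labeled -- boolean mask of accepted clients with an observed target
    p_boost    -- model probability of the target event for each client
    """
    return np.where(is_labeled, y, p_boost)
```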