Hello everyone. Today we're going to finish our course project with the final report review. As we wrap up the project with the final report, I want to make sure we do a good job of, one, summarizing and presenting the key findings of the data mining project. But we also want to look at the overall process and see how we can improve it further. By now you should have submitted your final report. That includes both the final presentation slides and your final project report. Those are the updated, expanded versions of your checkpoint slides and report. As I've said, when you're finishing and wrapping up the project, the key aspects you'll highlight are the key accomplishments: what has been accomplished in the project, and also what has been learned in the process. There are two parts: the core content of the project itself, and also the procedure, the experience and the learning. As a reviewer, you'll again be looking at these two pieces of the submission. One is the final presentation slides and the other is the final project report. We start with the slides. The slides should work as a very good summary of the project. As a reviewer, take the slides as the starting point, and pay attention to how many slides you are seeing. As I said, by now, when you're wrapping up a data mining project, we're talking about roughly 15-20 slides. When you're looking at a particular submission, just see whether the number of slides is reasonable. It's not a hard limit, but as we said, we don't want too many, because the goal is for the slides to provide a summary rather than all the details. But you don't want too few either, because with too few slides you may not convey all the important information.
You may be missing some key pieces, or some aspect of the project may be unclear to the reviewer. Just pay attention to the overall organization of the slides of the [inaudible]. The next piece is the executive summary. This is very important because, as I said, this is the final stage. The project has been accomplished. The executive summary is usually one to two slides, and it should be a very compact but to-the-point representation of the core pieces of the project: what has been done, and what are the key findings? That's what we really want. Make sure to check the executive summary of the slides and see whether the author has done a good job conveying the core pieces. Then, of course, you get to some of the more standard components of the slides, because the slides should at least give you a concise but good summary across those aspects. What is the problem statement? What is some of the related work, in a way that highlights the contribution of this work? Then specifically, what has been done in the work itself? What core tasks have been accomplished, how were the evaluations done, and what are the key results? These are the most important parts, along with the timeline, which gives you a better sense of how the project was carried out. Again, style. We have talked about this many times, and we keep reminding people. As a reviewer, when you're looking at those slides, know that these are the final presentation slides. They should be self-contained, so they should have all the necessary information in them, and they should really highlight the most significant contributions of this work. As you look at the style of the slides, see whether there are things that work particularly well, and comment on them.
For example: I really like how this is organized, it's clean; or I like how this picture gives a good overview; or this slide is a bit busy, there is too much text, or the color is a little bit too much. Just try to pay attention to the overall style of the slides. Then, of course, we're going to look at the full report. The slides hopefully give you a good understanding of the overall project. Now, when you're reading the final report, this one of course is longer and takes a little more time, but it hopefully has all the valuable information that you're trying to capture as a reviewer. Where do we start? Generally, by now the reports should all be okay in terms of the ACM proceedings template. We do want to pay a little attention to the page count. Five to ten pages is the guideline we provided. It's not a hard limit, but you want to see whether the length of the report is appropriate. Typically we're not looking for something that's too long. If it's too long, it has too much information and people won't actually spend that amount of time reading it. It should be reasonably concise. It should provide all the necessary pieces without being too long. But of course, on the other end, if it's too short, then it may not include some of the core information. As you're reviewing the report, pay attention to the page count and get a general understanding of whether the length of the report is appropriate for the project itself. Then, of course, we get to the specific aspects. The abstract should be a very good summary. It should be a few sentences capturing the problem setting, the core contributions, and the key results. It shouldn't be long, one to two paragraphs is what we need, and it should give you a good summary of the project.
If it's vague, not clear, or too long, you can note all of those signs down, because that helps others improve the work further. Introduction. This is the final report. Of course, as a reviewer, you may be seeing the project for the first time, but as you're reading through, make sure the introduction has a convincing story. For a successful project, you want to set the stage by stating clearly what the problem is, why this problem is important to address, the limitations of existing work, and what the contribution of this particular project is. You want to make sure all of that is clear as you're progressing through the report. This is the introduction part. It's very important, and it should give you a good sense of the scope of the project. Then you read the related work section. This is where the author should do a good job highlighting related work. We have said that it's usually good to group the work into different paragraphs with different focuses, but it is also important to say how this project builds upon prior work. It cannot be just a listing of a few prior works that says they are all related without talking about how specifically they are related or how this project builds upon them. You want to see that aspect: not just a list of things that are relevant, but also how they contribute to or help the project itself. Then the core section, proposed work. Of course our projects can be very different, so there's no, how to say, checklist where I say you have to do 1, 2, 3, 4, 5. Instead, you want to think about the whole pipeline. We did this throughout our discussion of the proposal, the checkpoint, and the final report. You want to see whether the project is carried out across the whole data mining pipeline.
See whether there's a good discussion of the data, the tools being used, and the specific steps in terms of getting the data, understanding the data, preprocessing, and the data management or data warehousing part. Then when it gets to the modeling, see whether it's clear what problem they are looking at, what methods they are using, and why these particular methods, and whether the design is clear. All of those need to be clear to you as a reviewer. Of course, if there are pieces that are missing or not clear, then note those down, because that can be very valuable feedback for the author. Now you get to the evaluation section. As a reviewer, you want to see whether the evaluation is carried out effectively. That includes a good evaluation setup, the evaluation metrics, and whether there's any good comparison, but the core part is the results. The evaluation section is where the key results are presented. See whether the results are convincing. That is, see whether the way the evaluation is set up and presented is actually clear to you, and whether there are some interesting findings there. Another important piece for the reviewer to look at in the evaluation section is whether there's good reasoning. You want to see that the authors also do a good job interpreting the results, instead of just presenting the results as is. You want to see that there's good reasoning behind the results being presented. Then the discussion section. I personally really enjoy reading the discussion section, because I think that's where you actually see how the project has progressed, and also the behind-the-scenes thinking that goes into the whole project.
You want to read about the authors' view of how the project was carried out: the timeline, how things worked out, what particular challenges they had and how they addressed those challenges, or how things had to be adjusted or changed because of certain circumstances. This is very useful for the author in terms of self-reflection, but I think it's also important for the reviewer to be able to see how the project progressed, and to learn for themselves what some good lessons are. What worked, what didn't work, what could have been better. The discussion section is always very interesting to read. Then the conclusion. This section usually is not long, probably one or two paragraphs. But it should be really effective in terms of, one, providing a good summary: what is this project, and what has been done. But it should also highlight the key findings. You may say, "Oh, this is repetitive," but from the reviewer's point of view it is appreciated. The abstract is a good starting point that talks about what this is, and then the summary or the conclusion gives another review of what has been accomplished, so that's very important. There's also one more piece to talk about in this section. That's the future work discussion. As a reviewer, you want to pay attention to that part and see whether the conclusion has some discussion of future work, and whether the author has done a good job envisioning what can be done in terms of future improvements or just more analysis. See whether some good thinking goes into that paragraph about further improvements of the project. That is always a fun paragraph to read as well, because it's really about the vision of the author after having accomplished the project and having gained a much better understanding of it.
One last thing. That's the references section. As you are wrapping up the report while you're reading it, make sure you check the references and quickly see whether there are missing references, or things that are not correct in terms of formatting, or missing information in particular references. Those are about style, but they're also about conveying the message. If you're providing a reference, that means people should be able to look it up and get that particular article, paper, or tool, whatever it is. You want to make sure that it is actually correct. From the reviewer's point of view, of course, you don't need to hand check each one, but as you're glancing through the references, make sure they have all the proper information. With that, we are really finishing our data mining project. Looking back, we started with brainstorming what a data mining project is. I was really pushing everybody to be the architect of your project. The idea is that, instead of being provided with a readily defined project where you just go and finish the tasks one by one, here I really want everybody to propose their own project. I feel that's a very important skill to have. It's very useful in many real-world settings, because you want to be able to identify projects: identify the problem setting, think about what you would like to do specifically in that project, and explore the different questions you could answer with your data mining project. We went through the steps of generally brainstorming and then concretely defining your project through the proposal. Then we had our checkpoint discussion, basically seeing how things are, whether there have been any changes, and whether things are on track, and of course finishing the project with the final report. Those are the core steps involved in the data mining project. But I also really want to highlight this review aspect.
By now all of you, of course, have submitted your proposal, checkpoint, and final report, and you should have done the reviews of other projects. I hope that you really see the value of being the reviewer. Many times when we're working on our own project, we become too focused on just finishing it, and in a way we may lose the bigger picture or lose sight of the whole process. Reviewing other people's reports hopefully provides a bit more perspective. For example: oh yeah, I wasn't thinking about this in my project, but when I look at other people's projects, I see those aspects, which I can do better in my own project. Or I can provide good feedback or useful suggestions to my classmates on their projects. That is really how we learn from each other and how we help each other. Of course the review adds a bit more effort, but I hope you enjoy it and really see how you can learn from that process and how it augments your own project by providing this reviewer's perspective. To summarize, if you check our description of the course, we highlighted four core pieces as the objectives of our data mining project. The first one is about being able to identify the key components of a real-world data mining project and propose your own project. That's why, as I've said, we are really pushing everybody to define their own project. It makes it a little harder, because of course it takes more effort to come up with your own project. But hopefully you are able to pick a project that you are interested in and are able to carry out, and you learn something that you feel is useful. That's the proposal stage. The next piece, of course, is the actual hands-on experience of designing and developing the actual solution across the full data mining pipeline. That's also important, because in a real-world data mining project you need to work through all the different pieces.
There's no predefined checklist or guideline where you just follow the instructions and you'll be good. Here there's a lot of reasoning. You really need to think at every step. For every particular design or every particular result, think about where things are, what they mean, and how you can go further from there. This is really the whole process. There's no predefined set of rules or procedures to follow; it's about knowing the whole process and being able to reason about your progress throughout the whole thing. Then we get to the wrapping-up piece. You may have a very interesting project idea, and you may have developed some of the solution. But ultimately, it's about how you summarize your results and how you present them effectively. Your report should not be just something like: I have done all this, I have a full list of results or pages of figures, so go check it out. A big part of the value of a data scientist is being able to synthesize, to extract what you have done, and to highlight the key findings, so that you can convey that knowledge to your audience rather than just providing a full report. That's very important, and you should always think about it as you're presenting your results. As I said, a result shouldn't be just a table or figure where you say, go figure. Rather, you say what it means. When I show you this table, when I show you this figure, I want to also provide the interpretation of what it is and why it is important. The last piece is also very important. It actually goes through the whole process, because we went through the concrete aspects of writing a proposal, doing the work, and writing a checkpoint and the final report. But a main piece that's outside of the core components of your specific project is this bigger picture: the architect's view of how a project is carried out. Some of you may be doing this kind of project for the first time.
Some of you may have already had experience in other settings. This is all good. This is really how we learn through a real project, but also go beyond a single project: being able to see, more generally, what is a good way to carry out a project, from the design to the execution to the summarizing and all that, and in this process thinking about how you can do it better. With that, we have come to the end of the data mining specialization. In this specialization, we have three courses. We started with the data mining pipeline, where we went through the individual steps of the whole pipeline, why the steps are important, and how they connect together to provide a good data mining setup. We then dove into the data mining methods. That's the second course. There we talked specifically about the different methods you can use. We talked about frequent pattern analysis. We talked about classification, clustering, anomaly detection, and also a little bit about the more advanced methods. That's really the frontier of the data mining field, like what's happening in the research field, but also some of the more complex data mining scenarios. Hopefully, that gives you a good foundation. Then, in this course, we worked through a concrete data mining project, which is defined by yourself, carried out by yourself, and of course reviewed by your peers. Hopefully, by going through the whole process, you got a much better understanding from this hands-on experience. Just to summarize: data mining is really applied. It's happening in the real world, and it's happening in many very diverse settings. What I hope is that you have enjoyed learning all these materials, but also that you see beyond what we're learning in the course and see how it connects to the real world. As a takeaway message, stay curious. Data mining is very powerful and it has great potential in many real-world settings. So just keep asking questions.
What could you do if you had those data? Or, if you get those data, what particular things can you learn from them? Throughout the whole process, keep in mind that we want to use analytical thinking. It's not about the mechanics of just applying the data mining methods. It is the reasoning behind the whole process that makes data science particularly interesting and valuable. With that, I'd like to thank all of you, and I wish you the best in your future adventures. Thank you.