Okay. Here we've got our next lab on driving insights for advanced SQL functions. And largely this is going to get you very familiar with the concept that we really hit hard on in the last lecture, which is making efficient use of that WITH clause for those complicated queries. And we're going to try it without copying the code block there. But if you haven't attempted this already, feel free to give it a go. And special bonus points if you can do it without using the pre-built SQL query there for you. Okay. So here's what we want to do. We want to write a SELECT statement that calculates average income and sorts results by average income, highest to lowest. But largely, it's the count of filings and the revenue and expenses since 2013. We're going to get some stats on the IRS kind of as a whole when it comes to nonprofits. So to do that, let's start with blank query space. And here, we have our question that we want to answer starting from scratch. So, what we got to do is we got to get the count of filings. And this is generally how I'll start writing these complex SQL statements. It's hard to start writing out in English what exactly I want. I want the revenue. And I want the, we'll see what expenses. This is all since 2013, right? Okay. So how do I get the count of filings? The filings themselves are in the 2015 table. I'm going to copy all this to my clipboard, because I want to do my normal trick of clicking on the table, queering the table, and pasting over what I had before. Okay. So let's see what fields are available on this table. Again, if you don't have that scheme pulled up, you could do the trick of holding down the command or control, and then clicking on that from within your query. Very very useful. And let's see what we actually want to pull up. So, we need the count of filings. So in order to do that, we can just do. Let's see. We want the count of, you can do count of all records. Doesn't really matter. There's no performance difference between, say just counting just one column or counting all the columns. We're just going to do the count of the Employer Identification Number as the count of nonprofits. And let's get revenue, right? So revenue, again, let's do a quick search to pull up revenue. There is total revenue. Total revenue and then let's get those functional expenses. Again, searching for those. Trying to get the scheme memorized now. So total functional expenses, boom, add that in there. I really don't like the way these things are named. So I'm just going to call that, well eventually we're going to do an average function on these rates will call this as average revenue, as average expenses, and that invoke our average function. Why not, right? So we've got three aggregate functions. We're basically going to get the average revenue and average expenses in the count of all of them. And this is just for 2015 so far, let's make sure that query works and then we'll worry about since 2013. Trailing comma boom, I got caught. Delete that last comma there and run this as selected. Okay. Cool. Now we don't have the table name where it came from. So, we can do a little bit of magic there. So now we want to include 2013, 2014, and 2015. Remember back from the merging and joining datasets lecture, we could do something like this, which is called a table wildcard. Now that we've invoked the table wildcard that unleashes the beast that is the table suffix, if I remember correctly. The table suffix is exactly what that wildcard is replacing. And the great thing about table suffix, is you can now invoke that as a new field and potentially filter on it in your work laws as well. And this is going to be like filing year, something like that. I'm going to purposefully just execute this just so you can see the value of table suffix. What do you guys think? Do you think the value is going to be something like 2015, or potentially just 15, or potentially just broken? So I committed I was going to say I was doing this on purpose, but I wasn't. I committed the other cardinal rule of SQL which is, anytime you invoke something that's not an aggregate function, you immediately need to have a group by. So you can have a group by the index of the column, or you can actually group by the alias in this particular case. Give ourselves a little bit more space here. So and I'm going to delete these, since I can start remembering them and they'll allow me to get benefits of the query validator that's here, because it's looking at all that static text and going on, I really don't understand what you're doing. Okay. So, to answer that question filing year 15, 13, 12, 14. Again that's because we've hard-coded this prefix. This prefix is pretty long so what we need to do here. If you set concatenation, and you're exactly correct. So you can concatenate back that prefix in there, separate that with a column as the input, you already have the alias. And we're slowly moving our way up there. So now we have the filing year, and we have the kind of nonprofits, the average expenses, and the average revenue. And now we do not want to include 2012 for some reason, just 2013, 2014, 2015. So immediately after the from is WHERE we can add a row filter. Row filter is the WHERE clause in SQL again, and we want to filter on the table suffix. Here again, it's not going to be, it's going to be greater than, you've got to do greater than 2012, or you could be greater than or equal to 2013, keeping in mind that this is going to be a string value. But what do you think is going to happen when we execute this? Do you think we're going to get a lot of records, or no records? If you said no records, you're correct. And that's because again, it's not going to 2013, it's actually going to be13. Keeping in mind what that wildcard is set as. All right. Slowly moving our way up towards the analytical function that we actually want to practice here. Not the analytical function, but the common table expression in that WITH clause. Okay. So what we want to do now is, do a subtraction of revenue and expenses, but this is going to be too messy. It's a do on this one table. This table is already done a ton of work for us which is great. So we want to actually store it as a temporary table, maybe a permanent table or all of it gets promoted. And we're just going to call that one, we can call it, key metrics by year. Don't forget you need to give it an alias, right? And then wrap everything in. Make sure to delete that terminating semicolon, because our query is just going to keep going. So with key metrics by year as this entire query, we're going to remove that limit now, because we're just going to go on bound in. Now we can do some fun things, that now that the table exists, like select filing year, select, let's do the average revenue. We can access all these aliases, right? Since that table has already been run, the temporary table. Average expenses, just making sure not to make typos here. Count of nonprofits. And you can very easily do calculated fields. And so we can do something like, average revenue minus average expenses, and keeping mind that that is going to be a calculated field. So, we're going to store that as average income, right? Revenue minus expenses, I think so. Add a comma there. We need to specify where this is pulling it from. From the temporary table it was created immediately prior. And then we want to order it by, I believe the filing year. We'll let that go from oldest and newest. Okay. Add a semicolon on that. Hopefully when I format this it's not going to make it too ugly. And we have our results. Plus the thing I'm going to throw in there for good measure is, I'm actually going to do that in here. This is already where a lot of our calculations are happening. Going to round this to two, because maybe I just want to, you know our final reporting table have that result right there. Okay. Run it, and see if we did well. So we only have 2013, 2014, 2015, great average revenue, average expenses, average income, I did not round. I could do that here if I wanted to, and the count of nonprofits. So now again, one of the things that you could do here is, it's kind of hard if you remember the visualization lecture, right? On counting the fives that we did there. This is a little bit harder to read as all of these numbers are in. So what you could then do after this is just pump it into a visualization tool, or Google data studio, and then look more visually at those trends over time. Perfect. All right. Why is that query taking so long? I'm just going to cancel that, re-run it. There we go. Generally I find if anything it takes more than 20 seconds just cancelling, so a fluke. And there you have it. Let's compare that to the solutions that we have. Again I do like practicing these things from scratch whenever you can. And it actually had its order by a different value. This is average income descending but again, you can do whatever you'd like. And for the year with the highest average income was 2015. We can just add in the value here as well. Average income, descending highest to lowest, re-run that. And the filing you should have thought boom, actually is 2015. So what can you say is an insight about these charities over time? Is their income is growing. Again since their charities, their main goal is not to make, it means literally in the name nonprofit, but it's an interesting metric to have. Whereas the counted them actually decreased from 2014 to 2015. They get it much easier to see that in something like a bar graph. Okay. That's the end of this lab. Let's continue on to deeper and more advanced topics.