So, now getting into the basic SQL syntax, let's do a quick number of basic queries and then we will see what we can find as far as insights in the dataset. So you want total number of unique visitors for the dataset. So, if we've pull open our schema, let's query on all sessions now. So, let's grab the VisitorID from back to our dataset ecommerce just sessions now. If I executed this, do you think it'll give me the distinct number of VisitorIDs? It's not going to give me a single row doing this. I don't know, if you can imagine. Returns every single row and just has the column full of VisitorID. BigQuery is churning the background saying, "Why on earth are you making me return 21 million full VisitorIDs when you can't even do anything with that information." So as you can see here, 21 million records returned with a full VisitorID. Now, while you could do this, you could do a massive GROUP BY. This would give us the same answer. Hence, during a GROUP BY, so you're going to get the total number of distinct values here but it's only going to be visible through the pagination right here, right? So, this is about 390,000. How do we get that actually to return? How do we aggregate this? Well, introduce an aggregate function. So, we're going to do this. As unique_visitors, we no longer need the GROUP BY since we're doing the aggregation on the field itself. Now immediately, some of you might be asking, "Is the result going to be 390,000?" It should match right, 390,000? Well, if you're doing a count, what on earth, we have 21 million. Again, that's because the full VisitorID is not the primary key of the dataset. So, it's counting duplicate VisitorIDs. In order to distinctify that, I just made that word up, distinctify we need to add in the qualifier distinct into the count. So again, we're looking for that about $390,000 against 389. So we got to 389,000. Great. Your data analyst team has told you that each individual row you can think of at least for simplicity sake, that it is a view of a particular product. So, let's see how many different product views. Again, that's just going to be 21 million. So I'm not sure I'm going to like rerun that. So, now we need to find a little bit more information about our visitors. Where did our visitors come from? We're referring location or their channel. In BigQuery, if you wanted to comment out multiple lines at once, you can hold down the Command or Windows key and the forward slash, and that'll help you comment or un-comment. Very useful in case you want to run just a portion of your query or if you want that multi-line comments as well. If you want that multi-line comment and you want to type it out manually, you can also do this and have anything inside there be commented out or not executed. So, a total number of unique visitors by channel grouping, I'm just going to copy and paste this query. The only thing that we're going to add in here is if you forgot the name of the column, going to "all_sessions", "preview" you can see that direct traffic, referring traffic, maybe a few others. We're going to bring copy and paste. I always prefer copy and pasting than writing out myself because as you can see, I can make a lot of mistakes typing these things out. Now, if I executed this, we got two aggregations and something that's not aggregated. That's going to land us a fancy arrow that says, "Hey, channel grouping is feeling left out. You're aggregating these with aggregate functions. How do you aggregate or group the channel groupings?" All direct with direct, all referring with referring, and you do that with a GROUP BY. So, we're going to include all the non-aggregated fields in our SELECT clause in that GROUP BY, and let's see what we return. How many different ways can a person visit our site? Different channels. So, we've got the number of views and the number of visitors for each channel. We have eight distinct channels, things like visiting it from social media, paid search advertising, organic search, affiliates programs, referrals, other, display advertising or directly typing in our website URL, the Google merchandise store directly in the browser. So this is useful. You can start getting your top pro, say if you wanted the top three referring channel sources. You could do something like ORDER BY. It depends whether or not you want unique visitors or product views. We're going to ORDER BY the total visitors, and we're going to do descending, highest to lowest, limiting just those top three, ending it with a semicolon just to terminate the query. Runs selected on that. Let's see our top three. Who's giving us the most visitors? Are folks visiting our website directly? So through search, it could be through a search engine and then direct, and then through the referral. Great. Now, if you're curious about how do you get that percentage? How do I actually get the total number of unique visitors for this group divided by the sum of all of these? We're going to cover that using a partition or a Window function inside of our advanced SQL course a little bit later on. Of course, you could do the ratio within the same row, like visitors divided by product_views just by dividing those two columns. All right, moving on. Now, we want to look at the product catalog. What are the total number of unique product names listed alphabetically? So again, back to our raw data table, making sure that we're not going to making any copy and paste errors. Let's take a look at some of those product names that we have. Product names. Boom. We got foam, can, and bottle coolers. Boom. That's the field that we want, product name. Give me all the product names. We know we can't just select that. I'm going to copy and paste the table that we're pulling it from. All the product names grouped together again, by the column index or the name, and ordered alphabetically. You can order it alphabetically with a to z which is the default. That's why you never see this inside of SQL or you usually don't. Ascending because it defaults as ascending. Let's see what types of products we have and how many just by looking at the amount of query results that are returned. So, we have 633 products in our catalog. Now, the caveat there which is a very interesting one is, does that mean that we have 633 products total in our catalog as I just said or does that mean there's 633 products that have been interacted with as a whole? So giving an example of a product or when would you find a product that we have that does not appear in this list here? Again, the key is that we're looking only at session data from visitors. The answer to that is, if you had a product that's brand new but nobody has interacted with, let's say you just launched it on your site, nobody's clicked on it and nobody's added to the cart, you can have it in your catalog, but if it's not part of the web analytics, it's not going to show up in this catalog lists. So again, that's a very interesting caveat that you should keep in mind. So, you've got those products that are ordered alphabetically as you see numerical first. All right. Moving right along, we have another question. Which five products has the most views from unique visitors? So, it's a little bit of a combination of the queries that we've written before. So, I'm going to pull that same query that we had before for products. Here, I'm just going to add in the product views. Again, a little bit of a safe assumption assuming that each of the different rows here has a different product view. Making sure to do the GROUP BY, and in this particular case, we were only filtering where there are page views and not necessarily events. So you want to filter for the different types of page and we want to ORDER BY the- instead of ordering it by the product name, we can ORDER BY which products have the most views, highest to lowest is descending. If we wanted to do just the top five, we can introduce a LIMIT clause there. Executing selected on that will give us the top five product views. Cool. So a little insight, we have product views with high in the list, 300,000, and we have a men's T-Shirt and then a YouTube bottle infuser. Curious to see what that is. A 200,000 view count on the short sleeve YouTube shirt, and then a ton of views for the custom decals. So, let's go a little bit further. Since this is Ecommerce, we have views but how many of those are actually ordered? So let's expand our previous query. Block comment that out. To include the total number of distinct products that are ordered as well as the total number of units that were ordered. All right. So we got views, we're going to keep that. We're going to keep building on this query here. Now, going back into the sessions table, taking a look at preview, there's got to be something in here that has to do with the quantity of products or order. So, you've got product quantity in here looks. It looks like that's going to be our winner. We've got product quantity, but this is for every single product view in there. We need to do a little of an aggregation function on there. So, here's where it gets interesting. If I wanted to count up the total amount of YouTube decals that were ordered, I can just sum the total quantity from all carts that were added. Now, if I wanted to, say if I had a cart with 10 YouTube decals on it but I only want to count that as that one order as having YouTube products, then I can actually just do a count. So, to explain that a little bit further, if this product quantity is a one, say that was at 10, I ordered 10 different YouTube decals, I only want to count that as one order. Then that's when count comes into play. It only counts those once where the sum will actually take the mathematical sum of all those together, not forgetting a comma, and not forgetting our aliases here. So, this sum is going to be the quantity of products that's ordered in terms of the number of units, the count ordered is going to be our order. Now, an additional filter that you can add on here, this just means that these products have been added to cart, it doesn't necessarily mean that they've been checked out. So, all the way on the right, I want to just give you a little sneak preview where we're looking at a page, filter, and the eCommerceAction_type is equal to, this is a string, as you can see, just by hovering here and string takes a single quotes. EcommerceAction_type is equal to six minutes actually qualified. But if you did not want to filter, if you just want to take a look at the number that were incurred as part of those orders even if they did not check out, then we won't exclude checkout from here. So, and you can also do additional filters if you want to look at how many were in carts. We're going to look at abandoned carts later where it's been added to cart but that session has not checked out a little bit later, that's a funny one. That'll be in your challenge exercises later. So, we've got products, we've got quantity ordered, and we've got orders, let's see how much data we have for the products that are actually ordered. Here's the orders versus the products were ordered but notice that we now lose a ton of product views because we've filtered on only those views that actually turned into orders, now product views is synonymous with orders here. So, I'm going to keep that filtered out, this is going to be, we'll call that potential orders, potential or completed orders because that could be in the cart, and the quantity of products ordered again, that's potential as well. So, let's take a look at these results because we're going to expand this query yet again. This gives an interesting insight of, there are 11,336 decals in the cart and a fraction of those have been checked out across 1,700 orders. So, you can see that although there's only 1,700 orders there's 11,000 in the cart, that indicates that for anytime someone buys a YouTube decal, they typically buy more than one. So, let's see exactly what that ratio is. Or before we do that, what is the most product of the customers views did they get the most potential orders? Most views since we're ordering it by that is the Google's men T-shirt. Most ordered from the top five viewed is actually those custom decals. Actually most ordered, if you're talking about a strict order sense, units ordered is the YouTube decals, but number of orders through checkout distinct orders is still that 3,000 for the T-shirt there. Now, a caveat here is we're listing by the top five viewed. You could have a product that hasn't been viewed that much say number six, but has a ton of very high conversion rate, a ton of products that are ordered as well. So, keep all that in mind when you're using limits and you're using your order bias for filters. All right. Last but not least before we move on into a separate walk-through on the challenge questions for those who already very experienced in SQL, I'm going to throw some harder ones your way. We want to expand the query to include the ratio of products ordered. So, here we're just going to bring this back. By the way, if you wanted to know what eCommerce Action_type is equal to six, I just went to the schema, eCommerceAction_type equals to six is completed purchase. I'm going to show you how you can actually get the verbiage of completed purchase in there a little bit later using a case statement, and in future courses ingesting that actually into a lookup table. So, you're not constantly wondering like, "Wait a minute, is six checkout or is six refund?" So, I'll show you how to do that. So, ratio of products as you saw before, you can just do ratios of aggregation function is not a problem, so we're going to take the sum of all the units that were ordered. Now, you cannot use the aliases here as your numerator and denominator because those don't exist when this query is running unless you're going to be using something like a sub-query or a common table expression with a width clause that we'll cover a little bit later on in this specialization. So, we're going to name that as the average per order keeping in mind that these are also potential orders somewhere within the checkout process. Let's see if that runs. If you're like I said that a lot when I'm working with SQL. Let's see how it runs. Hopefully it runs. Great. So, we have the results here, and you can see we knew heuristically that we had a ton of YouTube decals and then we had the parcel completed orders. So, we had the actual ratio of this when folks add a YouTube decal to the cart, they typically add on average six. But look at this one. When folks add a YouTube bottle infuser, they typically add nine quantity whereas the YouTube men's short sleeve shirt is more of a one for one. A person adds it to the cart that typically only buying one as opposed to the Google men's cotton short sleeve hero tee white, they will actually order on average two of those. Now, keep in mind averages. One person could have ordered a thousand of these and everyone else orders one, and that can drive up the average as well. So, you could think about using another statistical measure like median or standard deviation or something like that which we'll cover when we talk about advanced functions. So, don't be fooled by your data. That's the end of this lab which is exploring a lot of your SQL, just a quick recap, some of the things that I like to view from a look at tips and tricks standpoint. Never forget I constantly move between the schema of the data if it's available, and if you're creating a new datasets, good on you to create a good scheme with good descriptions, for the ecommerce data set, I have the schema always open in a separate tab. Inside of your SQL make sure you save your lab. If you have your own pre-written queries before you click end on this lab, copy and paste those into a notepad or somewhere else because those queries can be deleted when your project spins down at the end of your lab, and each different lab is going to be a brand new environment. So, make sure you don't lose any good notes or any good queries you have. I'll provide you with this query that we've written here together in case you want to copy and paste that and use it in your future work. The last two tips that I'll leave you with, don't forget to use the command key or the Windows key then click on the table assuming that you've already have access to the dataset here on the left side panel. That gets you to the schema. You can click on items in the schema, you can view items in the details to get total row count and total signs, and we spent a lot of time in the preview panel taking a look at a lot of the data that you have access to and then query it. So, that's the two major ways you can look at your data, clean it, execute a lot of basic commands to take a look at the insights. If you are sticking around for the challenge questions, be sure to give those a good attempt and then we will see you back together for the challenge video walk-through.