In the demo at the end of the previous section, I copied the ingested and transformed files out of the compute instance into Cloud Storage. This way, we could stop our compute instance and retain access to the data. While we've discussed the demand for significant computing power for today's big data and ML jobs, we still need a place to store all the data that's generated. As with the demo, this needs to be separate from the compute instances, so that we can analyze it, transform it, and feed it into our models. This is one major way that cloud computing differs from desktop computing: compute and storage are independent. You don't want to think of the disks attached to a compute instance as the limit of how much data you can process and store. Getting your data into your solution and transforming it for your purposes should be your first priority. In the discussion of roles and team structure later in this module, I'll talk about the need for data engineers to build data pipelines before you can start building machine learning models from that data. When the pipeline is built, the job isn't done, because once the data is in your system, data engineers are still needed to replicate the data, back it up, scale it, and remove it as needed, all at scale.

Instead of managing storage infrastructure themselves, data engineers can use Google Cloud Storage, a durable, global object storage service. Creating an elastic storage bucket is as simple as using the web UI at console.cloud.google.com or the gsutil command-line utility. As an additional level of flexibility, a data engineer can choose what type of storage class they want for their data. There are four storage classes to choose from based on your data needs: multi-regional, regional, nearline, and coldline. For big data analytics workloads, the most common choice is a regional Cloud Storage bucket for staging your data.

Since we've mentioned a GCP resource, a Cloud Storage bucket, it's important to cover some of the account management logistics before we get too far in. Starting from the most granular level, resources like a Cloud Storage bucket or a Compute Engine instance belong to specific projects. Bucket names have to be globally unique, and GCP assigns your project an ID that's globally unique too, so you can use that project ID as a unique name for your bucket. But what's a project? A project is the base-level organizing entity for creating and using resources and services, and for managing billing, APIs, and permissions. Zones and regions physically organize the GCP resources you use, whereas projects logically organize them. Projects can be created, managed, deleted, and even recovered from accidental deletion. Folders are another logical grouping you can use for collections of projects. You need an organization to use folders. But what's an organization? The organization is the root node of the entire GCP hierarchy. While it's not required, an organization is quite useful, because it allows you to set policies that apply throughout your enterprise, to all the projects and folders created under it.

Cloud Identity and Access Management, or IAM, lets you fine-tune access control to all the GCP resources you use. You define IAM policies that control user access to resources. Remember, if you want to use folders, you must have an organization.
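To make the bucket creation concrete, here's a minimal sketch of what that looks like with gsutil, reusing the globally unique project ID as the bucket name the way I just described; the region is a placeholder you'd swap for your own:

    # Look up the current project ID, which is globally unique,
    # and reuse it as a globally unique bucket name.
    PROJECT_ID=$(gcloud config get-value project)

    # Create a regional bucket for staging data; -c sets the
    # storage class, -l the location (us-central1 is a placeholder).
    gsutil mb -c regional -l us-central1 gs://${PROJECT_ID}

Swapping -c regional for nearline or coldline is all it takes to pick a different storage class at creation time.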
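And as one small illustration of an IAM policy in action, here's a sketch of granting a user read-only access to Cloud Storage objects across a project; the project ID and email address below are hypothetical:

    # Grant read-only access to Cloud Storage objects project-wide.
    # my-project-id and the email are placeholders, not real values.
    gcloud projects add-iam-policy-binding my-project-id \
        --member="user:analyst@example.com" \
        --role="roles/storage.objectViewer"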
Now that you have a Google Cloud Storage bucket created, how do you get your data into the cloud, and how do you work with the data once it's in the bucket? In the demo, I used gsutil commands; specifically, we can use cp for copy and specify a target bucket location. If you spin up a Compute Engine instance, the command-line tool gsutil is already available there. On your laptop, you can download the Google Cloud SDK to get gsutil. gsutil uses a familiar Unix command-line syntax, so if you know how to use the cp command on Unix, you know how to use gsutil cp.
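As a quick sketch of what those commands look like in practice (the file and bucket names here are placeholders):

    # Copy a local file up to a bucket, just like Unix cp.
    gsutil cp sales.csv gs://my-staging-bucket/

    # Copy it back down from the bucket to the current directory.
    gsutil cp gs://my-staging-bucket/sales.csv .

    # List the bucket's contents, then copy a whole directory
    # recursively (-r) with parallel transfers (-m).
    gsutil ls gs://my-staging-bucket/
    gsutil -m cp -r ./data gs://my-staging-bucket/data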