Case study two. This case involves a media company that has decided to move their in-house data processing into BigQuery, and this example is focused on security and compliance. As part of the migration, they've been moving data out of their on-premises data centers into BigQuery and the cloud, and they have a lot of concerns about security. Who has access to the data they're migrating into the cloud? How is that access audited and logged? What kinds of controls can be placed on top of it? They're also very concerned about data exfiltration. They're worried about potential bad actors within the company who, as part of their role, have access to certain data. They want to make sure that employees with access to that data cannot take it, load it onto their own computer or into another cloud project, and from there move it somewhere else.

The customer had an interesting set of business requirements: capture data read and update events so they know who, what, when, and where; separate who manages the data from who can read the data; allocate cost appropriately, distinguishing the cost to read and process data from the cost to store it; and prevent exfiltration of data to other GCP projects and to external systems.

We worked together to understand these business requirements and to help turn them into more technical requirements, focusing on the technologies and capabilities already available in BigQuery. We introduced the concept of audit logs on GCP and, specifically, the default logs available from BigQuery. We presented them with the admin logs that record creating and deleting datasets, and then the more detailed access logs that identify when people are reading datasets or even accessing parts of the BigQuery UI. We encouraged them to have everything managed by IAM: develop groups based on role, assign members to those groups, and establish permissions applied to the groups based on role.

We mapped that to technical requirements like these: all access to data should be captured in audit logs; all access to data should be managed via IAM; and service perimeters should be configured with VPC Service Controls.

This is how we implemented those technical requirements. Each group was isolated in a separate project, and we allowed limited access between those projects using VPC Service Controls. BigQuery allows separation of access by role, so we were able to limit some roles to only loading data and others to only running queries. Some groups were able to run queries in their own project against datasets for which they had only read access, while the data itself was stored in a separate project.

We made sure that, at the folder level of the resource hierarchy, aggregated log exports were enabled. That ensured that even the owner of a project, who would normally have the ability to redirect exports, could not do so, because those rights were set at the folder level, where most team members didn't have access. By using aggregated log exports, we were able to scoop up all the logs, store them in Cloud Storage, and create a record of who was running what query at what time against what dataset.

The VPC perimeter enabled us to allow APIs within the perimeter to run and to talk only with other APIs belonging to projects within the same perimeter.
So if someone in a separate project outside the perimeter started a BigQuery job that tried to read from a dataset within the perimeter, they would not be able to run the query, even though they had credentials and access to the dataset, because the APIs would not allow the call across the perimeter.
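To make that perimeter behavior concrete, here is a rough sketch of how a perimeter of this shape could be defined through the Access Context Manager API using the google-api-python-client. The policy ID, project numbers, and perimeter name are placeholders, and this is only one way to express the kind of configuration described above, not the team's exact setup.

```python
# Sketch only: a VPC Service Controls perimeter that restricts the BigQuery
# API to a set of projects. Policy ID, project numbers, and names are
# hypothetical placeholders.
from googleapiclient import discovery

acm = discovery.build("accesscontextmanager", "v1")

perimeter = {
    "name": "accessPolicies/1234567890/servicePerimeters/bq_data_perimeter",
    "title": "bq_data_perimeter",
    "status": {
        # Projects are referenced by project *number*, not project ID.
        "resources": [
            "projects/111111111111",  # project holding the datasets
            "projects/222222222222",  # project where analysts run queries
        ],
        # Calls to these services are only allowed between projects
        # inside the perimeter.
        "restrictedServices": ["bigquery.googleapis.com"],
    },
}

acm.accessPolicies().servicePerimeters().create(
    parent="accessPolicies/1234567890", body=perimeter
).execute()
```

With bigquery.googleapis.com restricted, a job started from a project outside the perimeter cannot read datasets inside it even with valid IAM credentials, which is the exfiltration control the customer was after.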
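The group-and-role model on the dataset side can be sketched with the google-cloud-bigquery client. The project, dataset, and group names here are made up; the point is simply that one group is limited to loading (writing) data while another gets read-only access and runs its queries from its own project.

```python
# Sketch only: group-based access on a BigQuery dataset. Project, dataset,
# and group names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="data-warehouse-project")
dataset = client.get_dataset("media_analytics")

entries = list(dataset.access_entries)
entries.extend([
    # Data-management group: may load and modify tables in the dataset.
    bigquery.AccessEntry(role="WRITER", entity_type="groupByEmail",
                         entity_id="data-loaders@example.com"),
    # Analyst group: read-only; they run query jobs from their own project.
    bigquery.AccessEntry(role="READER", entity_type="groupByEmail",
                         entity_id="analysts@example.com"),
])
dataset.access_entries = entries
dataset = client.update_dataset(dataset, ["access_entries"])
```

Because BigQuery bills query processing to the project that runs the job and storage to the project that owns the dataset, letting analysts query from their own project with read-only access also produces the clean split between processing cost and storage cost that the customer asked for.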
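The folder-level aggregated export can be sketched with the Cloud Logging config client. The folder ID, sink name, bucket, and filter are illustrative; the key pieces are the folder-level parent and include_children, which together sweep up logs from every project under the folder.

```python
# Sketch only: a folder-level aggregated sink that exports BigQuery audit
# logs from all child projects into a Cloud Storage bucket. Folder ID and
# bucket name are hypothetical.
from google.cloud.logging_v2.services.config_service_v2 import ConfigServiceV2Client
from google.cloud.logging_v2.types import LogSink

config_client = ConfigServiceV2Client()

sink = LogSink(
    name="bq-audit-to-gcs",
    destination="storage.googleapis.com/bq-audit-log-archive",
    # One way to select BigQuery audit entries; exact filters vary by need.
    filter='protoPayload.serviceName="bigquery.googleapis.com"',
    include_children=True,  # capture every project under the folder
)

config_client.create_sink(parent="folders/123456789012", sink=sink)
```

One operational detail worth noting: the sink gets its own writer identity, and that service account has to be granted permission to write objects into the destination bucket before entries start flowing. BigQuery's data access audit logs are also written by default, which is part of why this pattern works so cleanly for BigQuery.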
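Finally, answering "who ran what, when, and against what" is mostly a matter of filtering the audit entries. Here is a small sketch with the google-cloud-logging client, again with a placeholder project and time range; the fields pulled out of the payload are standard audit-log fields, though the exact shape of BigQuery's job details differs between the older and newer audit log formats.

```python
# Sketch only: pulling BigQuery audit entries to see who did what and when.
# Project name and time range are placeholders.
from google.cloud import logging

client = logging.Client(project="data-warehouse-project")

audit_filter = (
    'protoPayload.serviceName="bigquery.googleapis.com" '
    'AND timestamp>="2024-01-01T00:00:00Z"'
)

for entry in client.list_entries(filter_=audit_filter):
    payload = entry.payload or {}          # the audit-log protoPayload as a dict
    auth = payload.get("authenticationInfo", {})
    print(
        entry.timestamp,                   # when
        auth.get("principalEmail"),        # who
        payload.get("methodName"),         # what they did
        payload.get("resourceName"),       # which resource they touched
    )
```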