Here are some things you should know about Cloud Dataflow. You can write pipeline code in Java or Python. You use the open source Apache Beam API to define the pipeline and submit it to Cloud Dataflow, which then provides the execution framework. Parallel tasks are automatically scaled by the framework, and the same code handles both real-time streaming and batch processing. One great thing about Cloud Dataflow is that you can get input from many sources and write output to many sinks, but the pipeline code in between remains the same. Cloud Dataflow supports side inputs. That's where you can take data and transform it in one way, and transform it in a different way in parallel, so that the two can be used together in the same pipeline.

Security in Cloud Dataflow is based on assigning roles that limit access to Cloud Dataflow resources. So your exam tip is: for Cloud Dataflow users, use roles to limit access to only Dataflow resources, not just the project.

The Dataflow pipeline not only appears in code, but is also displayed in the GCP Console as a diagram. Pipelines reveal the progression of a data-processing solution and the organization of steps, which makes them much easier to maintain than other code solutions. Each step of the pipeline does a filter, group, transform, compare, join, and so on. Transforms can be done in parallel. Here are some of the most commonly used Cloud Dataflow operations. Do you know which operations are potentially computationally expensive? GroupByKey, for one, could consume significant resources on big data. This is one reason you might want to test your pipeline a few times on sample data, to make sure you know how it scales, before executing it at production scale. Exam tip: a pipeline is a more maintainable way to organize data-processing code than, for example, an application running on an instance.

Do you need to separate Dataflow developers of pipelines from Dataflow consumers, the users of the pipelines? Templates create a single step of indirection that allows the two classes of users to have different access. Dataflow Templates enable a new development and execution workflow. Templates help separate the development activities and the developers from the execution activities and the users. The user environment no longer has dependencies back to the development environment, and the need to recompile code to run a job is limited. The new approach facilitates the scheduling of batch jobs and opens up more ways for users to submit jobs, and more opportunities for automation. Your exam tip here is that Dataflow Templates open up new options for separation of work, which means better security and resource accountability.
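To make the pipeline model concrete, here is a minimal sketch of an Apache Beam pipeline in Python that reads from a source, applies a few transforms, and writes to a sink. The project ID, region, and bucket paths are hypothetical placeholders; the same code can run locally on sample data with the DirectRunner or at scale on Cloud Dataflow with the DataflowRunner.

# Minimal sketch of an Apache Beam pipeline (Python SDK).
# Project, region, and bucket names below are hypothetical placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner='DataflowRunner',              # use 'DirectRunner' to test locally on sample data
    project='my-project',                 # placeholder project ID
    region='us-central1',
    temp_location='gs://my-bucket/tmp',   # placeholder staging bucket
)

with beam.Pipeline(options=options) as p:
    (p
     | 'Read' >> beam.io.ReadFromText('gs://my-bucket/input/*.txt')   # source
     | 'ExtractWords' >> beam.FlatMap(lambda line: line.split())
     | 'PairWithOne' >> beam.Map(lambda word: (word, 1))
     # CombinePerKey is generally cheaper than a raw GroupByKey followed by a sum,
     # because values can be combined before they are shuffled between workers.
     | 'CountPerWord' >> beam.CombinePerKey(sum)
     | 'Format' >> beam.MapTuple(lambda word, count: f'{word},{count}')
     | 'Write' >> beam.io.WriteToText('gs://my-bucket/output/counts'))  # sink

Swapping the Read and Write steps for different connectors (for example, BigQuery or Pub/Sub) changes the source and sink while the transforms in between stay the same.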
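The side-input pattern described above can be sketched like this: one branch of the pipeline computes a global mean, and a second branch consumes that mean as a singleton side input. The values and step names are purely illustrative.

import apache_beam as beam

# Hypothetical side-input sketch.
with beam.Pipeline() as p:
    values = p | 'Create' >> beam.Create([3.0, 9.0, 12.0, 6.0])

    # Branch 1: compute the mean (a one-element PCollection).
    mean = values | 'Mean' >> beam.combiners.Mean.Globally()

    # Branch 2: normalize each element, using the mean as a side input.
    normalized = values | 'Normalize' >> beam.Map(
        lambda x, m: x / m, m=beam.pvalue.AsSingleton(mean))

    normalized | 'Print' >> beam.Map(print)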
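For the template workflow, one way a classic Dataflow template exposes runtime parameters is through ValueProviders, so a user can launch a job with new parameter values without touching the development environment or recompiling the code. The option class, the --input parameter, and the paths below are hypothetical.

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

class TemplateOptions(PipelineOptions):
    @classmethod
    def _add_argparse_args(cls, parser):
        # Runtime parameter: the user supplies --input when launching the
        # template, with no recompilation of the pipeline code.
        parser.add_value_provider_argument(
            '--input', type=str, help='GCS path to read from')

options = TemplateOptions()
with beam.Pipeline(options=options) as p:
    (p
     | 'Read' >> beam.io.ReadFromText(options.input)                   # resolved at launch time
     | 'Write' >> beam.io.WriteToText('gs://my-bucket/output/copy'))   # placeholder sink

Once staged as a template, this pipeline can be launched by users from the Console, the command line, or an API call without access to the source code or the development environment.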